
Gemini MCP

Created by RLabs-Inc, 9 months ago
An MCP server that enables Claude Code to interact with Gemini.

MCP Server Gemini

A Model Context Protocol (MCP) server for integrating Google's Gemini 3 models with Claude Code, enabling powerful collaboration between both AI systems. Now with a beautiful CLI!


MCP Registry Support: Now discoverable in the official MCP ecosystem!

Features

| Feature | Description |
| --- | --- |
| Deep Research Agent | Autonomous multi-step research with web search and citations |
| Token Counting | Count tokens and estimate costs before API calls |
| Text-to-Speech | 30 unique voices, single speaker or two-speaker dialogues |
| URL Analysis | Analyze, compare, and extract data from web pages |
| Context Caching | Cache large documents for efficient repeated queries |
| YouTube Analysis | Analyze videos by URL with timestamp clipping |
| Document Analysis | PDFs, DOCX, spreadsheets with table extraction |
| 4K Image Generation | Generate images up to 4K with 10 aspect ratios |
| Multi-Turn Image Editing | Iteratively refine images through conversation |
| Video Generation | Create videos with Veo 2.0 (async with polling) |
| Code Execution | Gemini writes and runs Python code (pandas, numpy, matplotlib) |
| Google Search | Real-time web information with inline citations |
| Structured Output | JSON responses with schema validation |
| Data Extraction | Extract entities, facts, sentiment from text |
| Thinking Levels | Control reasoning depth (minimal/low/medium/high) |
| Direct Query | Send prompts to Gemini 3 Pro/Flash models |
| Brainstorming | Claude + Gemini collaborative problem-solving |
| Code Analysis | Analyze code for quality, security, performance |
| Summarization | Summarize content at different detail levels |

Quick Installation

MCP Server for Claude Code

# Using npm (Recommended)
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY npx -y @rlabs-inc/gemini-mcp

# Using bun
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY bunx @rlabs-inc/gemini-mcp

CLI (Global Install)

# Install globally
npm install -g @rlabs-inc/gemini-mcp

# Set your API key once (stored securely)
gcli config set api-key YOUR_KEY

# Now use any command!
gcli search "latest news"
gcli image "sunset over mountains" --ratio 16:9

Get your API key: Visit Google AI Studio - it's free and takes seconds!

Installation Options

# With verbose logging
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY VERBOSE=true bunx -y @rlabs-inc/gemini-mcp

# With custom output directory for generated images/videos
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY GEMINI_OUTPUT_DIR=/path/to/output bunx -y @rlabs-inc/gemini-mcp

Available Tools

gemini-query

Direct queries to Gemini with thinking level control:

prompt: "Explain quantum entanglement"
model: "pro" or "flash"
thinkingLevel: "low" | "medium" | "high" (optional)
  • low: Fast responses, minimal reasoning
  • medium: Balanced (Flash only)
  • high: Deep reasoning for complex tasks (default)
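
Claude Code normally issues these calls for you. If you drive the server from your own MCP client instead, the parameters above travel as the `arguments` of a JSON-RPC 2.0 `tools/call` request. A minimal sketch in Python: the helper name `build_tool_call` and the request id are illustrative, and only the tool and argument names come from the list above.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` request (JSON-RPC 2.0) as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "gemini-query",
    {
        "prompt": "Explain quantum entanglement",
        "model": "flash",
        "thinkingLevel": "low",  # optional; leave it out to use the default
    },
)
print(json.dumps(request, indent=2))
```

The same envelope should work for the other tools by swapping the name and arguments; how the request reaches the server (stdio framing, session setup) is left to your MCP client library.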

gemini-generate-image

Generate images with Nano Banana Pro (Claude can SEE them!):

prompt: "a futuristic city at sunset"
style: "cyberpunk" (optional)
aspectRatio: "16:9" (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)
imageSize: "2K" (1K, 2K, 4K)
useGoogleSearch: false (ground in real-world info)

gemini-start-image-edit

Start a multi-turn image editing session:

prompt: "a cozy cabin in the mountains"
aspectRatio: "16:9"
imageSize: "2K"
useGoogleSearch: false

Returns a session ID for iterative editing.

gemini-continue-image-edit

Continue refining an image:

sessionId: "edit-123456789"
prompt: "add snow on the roof and make it nighttime"

gemini-end-image-edit

Close an editing session:

sessionId: "edit-123456789"
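
The three session tools form a start, continue, end cycle. The sketch below shows one way to sequence them so the session is always closed. It assumes a hypothetical `call_tool(name, arguments)` function supplied by your MCP client, and it assumes the start result exposes the session ID under a `sessionId` key; neither detail is confirmed by this README.

```python
def refine_image(call_tool, first_prompt, follow_ups):
    """Run one start -> continue* -> end editing cycle and return the session ID.

    `call_tool(name, arguments)` is a hypothetical function that invokes an MCP
    tool and returns a dict; the "sessionId" key in the start result is assumed.
    """
    started = call_tool(
        "gemini-start-image-edit",
        {"prompt": first_prompt, "aspectRatio": "16:9", "imageSize": "2K"},
    )
    session_id = started["sessionId"]
    try:
        for prompt in follow_ups:
            call_tool(
                "gemini-continue-image-edit",
                {"sessionId": session_id, "prompt": prompt},
            )
    finally:
        # Close the session even if a refinement step raises.
        call_tool("gemini-end-image-edit", {"sessionId": session_id})
    return session_id


# Stand-in for a real client: records tool names and returns a fixed session ID.
log = []


def fake_call_tool(name, arguments):
    log.append(name)
    return {"sessionId": "edit-123456789"}


session = refine_image(
    fake_call_tool,
    "a cozy cabin in the mountains",
    ["add snow on the roof and make it nighttime"],
)
print(session, log)
```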

gemini-list-image-sessions

List all active editing sessions.

gemini-generate-video

Generate videos using Veo:

prompt: "a cat playing piano"
aspectRatio: "16:9" (optional)
negativePrompt: "blurry, text" (optional)

Video generation is async (takes 1-5 minutes). Use gemini-check-video to poll.

gemini-check-video

Check video generation status and download when complete:

operationId: "operations/xxx-xxx-xxx"
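
Because generation is asynchronous, a caller has to poll until the operation finishes. The loop below is a sketch of that pattern, not the server's own implementation: `check_status` is a hypothetical callable standing in for a gemini-check-video call, and the boolean `done` key in its result is an assumption.

```python
import time


def wait_for_video(check_status, operation_id, interval_s=15.0,
                   timeout_s=600.0, sleep=time.sleep):
    """Poll `check_status(operation_id)` until it reports completion.

    Raises TimeoutError once roughly `timeout_s` seconds have been spent waiting.
    """
    waited = 0.0
    while True:
        status = check_status(operation_id)
        if status.get("done"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"{operation_id} not finished after {timeout_s:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for the real tool call: reports completion on the third check.
calls = {"n": 0}


def fake_check(operation_id):
    calls["n"] += 1
    return {"done": calls["n"] >= 3, "operationId": operation_id}


result = wait_for_video(
    fake_check, "operations/xxx-xxx-xxx", interval_s=0.0, sleep=lambda s: None
)
print(result)
```

A 15-second interval with a 10-minute ceiling is only a starting point chosen to cover the 1-5 minute range mentioned above; adjust both to taste.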

gemini-analyze-code

Analyze code for issues:

code: "function foo() { ... }"
language: "typescript" (optional)
focus: "quality" | "security" | "performance" | "bugs" | "general"

gemini-analyze-text

Analyze text content:

text: "Your text here..."
type: "sentiment" | "summary" | "entities" | "key-points" | "general"

gemini-brainstorm

Collaborative brainstorming:

prompt: "How could we implement real-time collaboration?"
claudeThoughts: "I think we should use WebSockets..."
maxRounds: 3 (optional)

gemini-summarize

Summarize content:

content: "Long text to summarize..."
length: "brief" | "moderate" | "detailed"
format: "paragraph" | "bullet-points" | "outline"

gemini-run-code

Let Gemini write and execute Python code:

prompt: "Calculate the first 50 prime numbers and plot them"
data: "optional CSV data to analyze" (optional)

Supports libraries: numpy, pandas, matplotlib, scipy, scikit-learn, tensorflow, and more. Generated charts are saved to the output directory and returned as images.

gemini-search

Real-time web search with citations:

query: "What happened in tech news this week?"
returnCitations: true (default)

Returns grounded responses with inline citations and source URLs.

gemini-structured

Get JSON responses matching a schema:

prompt: "Extract the meeting details from this email..."
schema: '{"type":"object","properties":{"date":{"type":"string"},"attendees":{"type":"array"}}}'
useGoogleSearch: false (optional)
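
Since `schema` is passed as a JSON string, hand-writing the quotes is error-prone. One option is to build the schema as a dict and serialize it, then run a shallow check on whatever comes back. This is a sketch only: the helper `matches_top_level_types` is hypothetical, the sample reply is made up to exercise the check, and the tool's actual response format is not documented here.

```python
import json

# The schema from the example above, written as a dict so json.dumps does the quoting.
schema = {
    "type": "object",
    "properties": {
        "date": {"type": "string"},
        "attendees": {"type": "array"},
    },
}
schema_param = json.dumps(schema, separators=(",", ":"))


def matches_top_level_types(data, schema):
    """Shallow check: each property present in `data` has its declared JSON type."""
    json_types = {"string": str, "array": list, "object": dict, "boolean": bool}
    if not isinstance(data, dict):
        return False
    for key, spec in schema.get("properties", {}).items():
        expected = json_types.get(spec.get("type"))
        if key in data and expected is not None and not isinstance(data[key], expected):
            return False
    return True


# Made-up reply text, used only to exercise the check.
sample_reply = '{"date": "2025-01-15", "attendees": ["Ana", "Ben"]}'
is_valid = matches_top_level_types(json.loads(sample_reply), schema)
print(schema_param)
print(is_valid)
```

For nested schemas or stricter rules, a full JSON Schema validator would be a better fit than this top-level check.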

gemini-extract

Convenience tool for common extraction patterns:

text: "Your text to analyze..."
extractType: "entities" | "facts" | "summary" | "keywords" | "sentiment" | "custom"
customFields: "name, date, amount" (for custom extraction)

gemini-youtube

Analyze YouTube videos directly:

url: "https://www.youtube.com/watch?v=..."
question: "What happens at 2:30?"
startTime: "1m30s" (optional, for clipping)
endTime: "5m00s" (optional, for clipping)
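
If your offsets are stored in seconds, a small helper can produce strings in the same shape as the `startTime` and `endTime` examples above. This sketch assumes the minutes-and-seconds form shown is what the tool expects; the helper name `format_offset` is illustrative, and clips longer than an hour come out as plain minutes (for example `75m00s`), which the tool may or may not accept.

```python
def format_offset(total_seconds):
    """Format a non-negative offset in seconds as minutes and seconds."""
    if total_seconds < 0:
        raise ValueError("offset must be non-negative")
    minutes, seconds = divmod(int(total_seconds), 60)
    return f"{minutes}m{seconds:02d}s"


start = format_offset(90)
end = format_offset(300)
print(start, end)
```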

gemini-youtube-summary

Quick video summarization:

url: "https://www.youtube.com/watch?v=..."
style: "brief" | "detailed" | "bullet-points" | "chapters"

gemini-analyze-document

Analyze PDFs and documents:

filePath: "/path/to/document.pdf"
question: "Summarize the key findings"
mediaResolution: "low" | "medium" | "high"

gemini-summarize-pdf

Quick PDF summarization:

filePath: "/path/to/document.pdf"
style: "brief" | "detailed" | "outline" | "key-points"

gemini-extract-tables

Extract tables from documents:

filePath: "/path/to/document.pdf"
outputFormat: "markdown" | "csv" | "json"

Workflow: Claude + Gemini

The killer combination for development:

| Claude | Gemini |
| --- | --- |
| Complex logic | Frontend/UI |
| Architecture | Visual components |
| Backend code | Image generation |
| Integration | React/CSS styling |
| Reasoning | Creative generation |

Example workflow:

  1. Ask Claude to design the backend API
  2. Use gemini-generate-image for UI mockups
  3. Ask Gemini to generate React components via gemini-query
  4. Use multi-turn editing to refine visuals
  5. Let Claude wire everything together

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| GEMINI_API_KEY | Yes | - | Your Google Gemini API key |
| GEMINI_OUTPUT_DIR | No | ./gemini-output | Where to save generated files |
| GEMINI_MODEL | No | - | Override model for init test |
| GEMINI_PRO_MODEL | No | gemini-3-pro-preview | Pro model (Gemini 3) |
| GEMINI_FLASH_MODEL | No | gemini-3-flash-preview | Flash model (Gemini 3) |
| GEMINI_IMAGE_MODEL | No | gemini-3-pro-image-preview | Image model (Nano Banana Pro) |
| GEMINI_VIDEO_MODEL | No | veo-2.0-generate-001 | Video model |
| VERBOSE | No | false | Enable verbose logging |
| QUIET | No | false | Minimize logging |
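
A launch script can check these variables before starting the server. The sketch below mirrors the defaults listed above in Python; it is an illustration of how the defaults apply, not the server's own code, and the function name `resolve_settings` is hypothetical.

```python
import os

# Defaults copied from the list above. GEMINI_MODEL has no default and
# GEMINI_API_KEY is required, so neither appears here.
DEFAULTS = {
    "GEMINI_OUTPUT_DIR": "./gemini-output",
    "GEMINI_PRO_MODEL": "gemini-3-pro-preview",
    "GEMINI_FLASH_MODEL": "gemini-3-flash-preview",
    "GEMINI_IMAGE_MODEL": "gemini-3-pro-image-preview",
    "GEMINI_VIDEO_MODEL": "veo-2.0-generate-001",
    "VERBOSE": "false",
    "QUIET": "false",
}


def resolve_settings(env=None):
    """Return the effective settings, failing early if the API key is missing."""
    env = os.environ if env is None else env
    if not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is required")
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    settings["GEMINI_API_KEY"] = env["GEMINI_API_KEY"]
    if env.get("GEMINI_MODEL"):
        settings["GEMINI_MODEL"] = env["GEMINI_MODEL"]
    return settings


settings = resolve_settings({"GEMINI_API_KEY": "YOUR_KEY", "VERBOSE": "true"})
print(settings["GEMINI_VIDEO_MODEL"], settings["VERBOSE"])
```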

Manual Installation

Global Install

# Using npm
npm install -g @rlabs-inc/gemini-mcp

# Using bun
bun install -g @rlabs-inc/gemini-mcp

Claude Code Configuration

{
  "gemini": {
    "command": "npx",
    "args": ["-y", "@rlabs-inc/gemini-mcp"],
    "env": {
      "GEMINI_API_KEY": "your-api-key",
      "GEMINI_OUTPUT_DIR": "/path/to/save/files"
    }
  }
}

Troubleshooting

Rate Limits (429 Errors)

If you're hitting rate limits on the free tier:

  • Set GEMINI_MODEL=gemini-3-flash-preview so the init test uses Flash, which has higher rate limits
  • Or upgrade to a paid plan
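
If you call Gemini from your own scripts, retrying with exponential backoff is a common complement to the two options above. The sketch below is generic and not part of this package: `RateLimitError` is a hypothetical exception that your client wrapper would raise on an HTTP 429, and the delay values are arbitrary starting points.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by a client wrapper when the API answers 429."""


def call_with_backoff(fn, max_attempts=5, base_delay_s=1.0, sleep=time.sleep):
    """Call `fn`, retrying on RateLimitError with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay_s * (2 ** attempt) + random.uniform(0, base_delay_s)
            sleep(delay)


# Stand-in request that is rate limited twice before succeeding.
attempts = {"n": 0}
delays = []


def flaky_request():
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise RateLimitError("429")
    return "ok"


result = call_with_backoff(flaky_request, sleep=delays.append)
print(result, len(delays))
```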

Connection Issues

  1. Verify your API key at Google AI Studio
  2. Check server status: claude mcp list
  3. Try with verbose logging: VERBOSE=true

Image/Video Issues

  • Ensure your API key has access to image/video generation
  • Check output directory permissions
  • Files save to GEMINI_OUTPUT_DIR (default: ./gemini-output)
  • For 4K images, generation takes longer

Previous Versions

0.7.2

Beautiful CLI with Themes! Use Gemini directly from your terminal:

# Install globally
npm install -g @rlabs-inc/gemini-mcp

# Set your API key once
gcli config set api-key YOUR_KEY

# Generate images, videos, search, research, and more!
gcli image "a cat astronaut" --size 4K
gcli search "latest AI news"
gcli research "quantum computing applications" --wait
gcli speak "Hello world" --voice Puck

5 Beautiful Themes: terminal, neon, ocean, forest, minimal

CLI Commands:

  • gcli query - Direct Gemini queries with thinking levels
  • gcli search - Real-time web search with citations
  • gcli research - Deep research agent
  • gcli image - Generate images (up to 4K)
  • gcli video - Generate videos with Veo
  • gcli speak - Text-to-speech with 30 voices
  • gcli tokens - Count tokens and estimate costs
  • gcli config - Manage settings

  • v0.6.x: Deep Research, Token Counting, TTS, URL analysis, Context Caching
  • v0.5.x: 30+ tools, YouTube analysis, Document analysis
  • v0.4.x: Code execution, Google Search
  • v0.3.x: Thinking levels, Structured output, 4K images
  • v0.2.x: Image/Video generation with Veo


Development

git clone https://github.com/rlabs-inc/gemini-mcp.git
cd gemini-mcp
bun install
bun run build
bun run dev -- --verbose

Scripts

| Command | Description |
| --- | --- |
| bun run build | Build for production |
| bun run dev | Development mode with watch |
| bun run typecheck | Type check without emitting |
| bun run format | Format with Prettier |
| bun run lint | Lint with ESLint |

License

MIT License


Made with Claude + Gemini working together
