Google AI Studio MCP Server

Created by eternnoir, 5 months ago
Google AI Studio MCP Server: Gemini API integration for the Model Context Protocol, with multi-modal file processing, PDF-to-Markdown conversion, image analysis, and audio transcription. Supports all Gemini 2.5 models and a wide range of file formats.

AI Studio MCP Server

A Model Context Protocol (MCP) server that integrates with Google AI Studio / Gemini API, providing content generation capabilities with support for files, conversation history, and system prompts.

Installation and Usage

Prerequisites

  • Node.js 20.0.0 or higher
  • Google AI Studio API key

Run directly with npx, without installing:

GEMINI_API_KEY=your_api_key npx -y aistudio-mcp-server

Local Installation

npm install -g aistudio-mcp-server
GEMINI_API_KEY=your_api_key aistudio-mcp-server

Configuration

Set your Google AI Studio API key as an environment variable:

export GEMINI_API_KEY=your_api_key_here

Optional Configuration

  • GEMINI_MODEL: Gemini model to use (default: gemini-2.5-flash)
  • GEMINI_TIMEOUT: Request timeout in milliseconds (default: 300000 = 5 minutes)
  • GEMINI_MAX_OUTPUT_TOKENS: Maximum output tokens (default: 8192)
  • GEMINI_MAX_FILES: Maximum number of files per request (default: 10)
  • GEMINI_MAX_TOTAL_FILE_SIZE: Maximum total file size in MB (default: 50)
  • GEMINI_TEMPERATURE: Temperature for generation (0-2, default: 0.2)

Example:

export GEMINI_API_KEY=your_api_key_here
export GEMINI_MODEL=gemini-2.5-flash
export GEMINI_TIMEOUT=600000  # 10 minutes
export GEMINI_MAX_OUTPUT_TOKENS=16384  # More output tokens
export GEMINI_MAX_FILES=5  # Limit to 5 files per request
export GEMINI_MAX_TOTAL_FILE_SIZE=100  # 100MB limit
export GEMINI_TEMPERATURE=0.7  # More creative responses
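
The variables above could be read into a typed configuration object along the following lines. This is a minimal sketch, not the server's actual code: the `ServerConfig` shape and the `loadConfig` and `numberFromEnv` helpers are hypothetical names, and only the variable names, defaults, and the 0-2 temperature range come from the list above.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ServerConfig {
  apiKey: string;
  model: string;
  timeoutMs: number;
  maxOutputTokens: number;
  maxFiles: number;
  maxTotalFileSizeMb: number;
  temperature: number;
}

type Env = Record<string, string | undefined>;

// Parse a numeric variable, falling back to the documented default when unset.
function numberFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!isFinite(value)) {
    throw new Error(name + " must be a number, got: " + raw);
  }
  return value;
}

// Build the config from an environment-like object (for example process.env).
function loadConfig(env: Env): ServerConfig {
  const apiKey = env["GEMINI_API_KEY"];
  if (!apiKey) throw new Error("GEMINI_API_KEY is required");

  const temperature = numberFromEnv(env, "GEMINI_TEMPERATURE", 0.2);
  if (temperature < 0 || temperature > 2) {
    throw new Error("GEMINI_TEMPERATURE must be between 0 and 2");
  }

  return {
    apiKey: apiKey,
    model: env["GEMINI_MODEL"] || "gemini-2.5-flash",
    timeoutMs: numberFromEnv(env, "GEMINI_TIMEOUT", 300000),
    maxOutputTokens: numberFromEnv(env, "GEMINI_MAX_OUTPUT_TOKENS", 8192),
    maxFiles: numberFromEnv(env, "GEMINI_MAX_FILES", 10),
    maxTotalFileSizeMb: numberFromEnv(env, "GEMINI_MAX_TOTAL_FILE_SIZE", 50),
    temperature: temperature,
  };
}
```

Taking the environment as a parameter rather than reading a global keeps the function easy to test; in Node.js it would typically be called as `loadConfig(process.env)`.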

Available Tools

generate_content

Generates content using Gemini, with support for files, conversation history, and system prompts. Accepted inputs include images, video, audio, PDFs, and text files (see the supported file types below).

Parameters:

  • user_prompt (string, required): User prompt for generation
  • system_prompt (string, optional): System prompt to guide AI behavior
  • files (array, optional): Array of files to include in generation
    • Each file object must have either path or content
    • path (string): Path to file
    • content (string): Base64 encoded file content
    • type (string, optional): MIME type (auto-detected from the file extension if omitted)
  • model (string, optional): Gemini model to use (default: gemini-2.5-flash)
  • temperature (number, optional): Temperature for generation (0-2, default: 0.2). Lower values produce more focused responses, higher values more creative ones
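
For callers written in TypeScript, the parameters above can be captured as types with a small pre-flight check. This is a sketch under assumptions: the interface names and `validateArgs` are hypothetical, and the check only flags a file with neither `path` nor `content`, since the list above does not say whether supplying both is allowed.

```typescript
// Hypothetical types mirroring the documented parameters.
interface FileInput {
  path?: string;
  content?: string; // Base64 encoded file content
  type?: string;    // MIME type
}

interface GenerateContentArgs {
  user_prompt: string;
  system_prompt?: string;
  files?: FileInput[];
  model?: string;
  temperature?: number;
}

// Return a list of human-readable problems; an empty list means the
// arguments satisfy the rules stated in the parameter list.
function validateArgs(args: GenerateContentArgs): string[] {
  const errors: string[] = [];
  if (args.user_prompt.trim() === "") {
    errors.push("user_prompt is required");
  }
  if (args.temperature !== undefined &&
      (args.temperature < 0 || args.temperature > 2)) {
    errors.push("temperature must be between 0 and 2");
  }
  (args.files || []).forEach((file, index) => {
    if (!file.path && !file.content) {
      errors.push("files[" + index + "] needs either path or content");
    }
  });
  return errors;
}
```

Returning all problems at once, rather than throwing on the first, lets a client report every issue in a single round trip.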

Supported file types (Gemini 2.5 models):

  • Images: JPG, JPEG, PNG, GIF, WebP, SVG, BMP, TIFF
  • Video: MP4, AVI, MOV, WEBM, FLV, MPG, WMV (up to 10 files per request)
  • Audio: MP3, WAV, AIFF, AAC, OGG, FLAC (up to 15MB per file)
  • Documents: PDF (treated as images, one page = one image)
  • Text: TXT, MD, JSON, XML, CSV, HTML
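
Extension-based MIME detection, as mentioned for the type parameter, might look like the sketch below. The table is a deliberately partial assumption covering a few of the formats listed above with their conventional MIME strings; the server's real lookup table and the exact strings it sends to Gemini are not shown in this document. `guessMimeType` is a hypothetical helper name.

```typescript
// Partial, illustrative mapping from file extension to MIME type.
const MIME_BY_EXTENSION: Record<string, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  gif: "image/gif",
  webp: "image/webp",
  pdf: "application/pdf",
  mp4: "video/mp4",
  wav: "audio/wav",
  txt: "text/plain",
  md: "text/markdown",
  json: "application/json",
  csv: "text/csv",
  html: "text/html",
};

// Look up the MIME type from the text after the last dot, case-insensitively.
// Returns undefined when there is no extension or it is not in the table.
function guessMimeType(filePath: string): string | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1 || dot === filePath.length - 1) return undefined;
  const ext = filePath.slice(dot + 1).toLowerCase();
  // Own-property check avoids matching inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(MIME_BY_EXTENSION, ext)) {
    return undefined;
  }
  return MIME_BY_EXTENSION[ext];
}
```

When the lookup returns undefined, or when a file is passed as Base64 content with no path, setting type explicitly is the safer choice.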

File limitations:

  • Maximum file size: 15MB per audio/video/document file
  • Maximum total request size: 20MB (2GB when using Cloud Storage)
  • Video files: Up to 10 per request
  • PDF files follow image pricing (one page = one image)
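
A client can check these limits before sending a request. The following is a rough sketch: `FileInfo` and `checkLimits` are hypothetical names, the caller is assumed to know each file's size and kind, and 1MB is assumed to mean 1024 × 1024 bytes, which the limits above do not specify.

```typescript
type FileKind = "image" | "video" | "audio" | "document" | "text";

// Hypothetical description of a file about to be sent.
interface FileInfo {
  name: string;
  sizeBytes: number;
  kind: FileKind;
}

const MB = 1024 * 1024; // assumption: binary megabytes

// Return the documented limits that the given files would exceed.
function checkLimits(files: FileInfo[]): string[] {
  const problems: string[] = [];
  let totalBytes = 0;
  let videoCount = 0;

  files.forEach((file) => {
    totalBytes += file.sizeBytes;
    if (file.kind === "video") videoCount += 1;
    // The 15MB cap is stated for audio, video, and document files.
    const capped =
      file.kind === "audio" || file.kind === "video" || file.kind === "document";
    if (capped && file.sizeBytes > 15 * MB) {
      problems.push(file.name + " exceeds 15MB");
    }
  });

  if (totalBytes > 20 * MB) problems.push("total request size exceeds 20MB");
  if (videoCount > 10) problems.push("more than 10 video files");
  return problems;
}
```

The sketch ignores the 2GB Cloud Storage allowance and the server-side GEMINI_MAX_FILES and GEMINI_MAX_TOTAL_FILE_SIZE settings, which may be stricter or looser than these values.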

Basic example:

{
  "user_prompt": "Analyze this image and describe what you see",
  "files": [
    {
      "path": "/path/to/image.jpg"
    }
  ]
}

PDF to Markdown conversion:

{
  "user_prompt": "Convert this PDF to well-formatted Markdown, preserving structure and formatting. Return only the Markdown content.",
  "files": [
    {
      "path": "/path/to/document.pdf"
    }
  ]
}

With system prompt:

{
  "system_prompt": "You are a helpful document analyst specialized in technical documentation",
  "user_prompt": "Please provide a detailed explanation of the authentication methods shown in this document",
  "files": [
    {"path": "/api-docs.pdf"}
  ]
}

Multiple files example:

{
  "user_prompt": "Compare these documents and images",
  "files": [
    {"path": "/document.pdf"},
    {"path": "/chart.png"},
    {"content": "base64encodedcontent", "type": "image/jpeg"}
  ]
}
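
The content field in the last entry above carries Base64 text. One way to produce such an entry from raw bytes is sketched below; `toBase64` and `inlineFile` are hypothetical helpers, and `btoa` is the standard global available in browsers and in Node.js 16 or later. In Node.js, `Buffer.from(bytes).toString("base64")` gives the same result.

```typescript
// Encode raw bytes as Base64 using the standard btoa global.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  // btoa expects a string in which each character holds one byte.
  bytes.forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  return btoa(binary);
}

// Build a files entry that carries the data inline instead of a path.
function inlineFile(
  bytes: Uint8Array,
  mimeType: string
): { content: string; type: string } {
  return { content: toBase64(bytes), type: mimeType };
}
```

Inline content has no file extension to inspect, so the sketch always sets type explicitly, as the example above does.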

Common Use Cases

PDF to Markdown Conversion

To convert PDF files to Markdown format, use the generate_content tool with an appropriate prompt:

{
  "user_prompt": "Convert this PDF to well-formatted Markdown, preserving structure, headings, lists, and formatting. Include table of contents if the document has sections.",
  "files": [
    {
      "path": "/path/to/document.pdf"
    }
  ]
}

Image Analysis

Analyze images, charts, diagrams, or photos with detailed descriptions:

{
  "system_prompt": "You are an expert image analyst. Provide detailed, accurate descriptions of visual content.",
  "user_prompt": "Analyze this image and describe what you see. Include details about objects, people, text, colors, and composition.",
  "files": [
    {
      "path": "/path/to/image.jpg"
    }
  ]
}

For screenshots or technical diagrams:

{
  "user_prompt": "Describe this system architecture diagram. Explain the components and their relationships.",
  "files": [
    {
      "path": "/architecture-diagram.png"
    }
  ]
}

Audio Transcription

Generate transcripts from audio files:

{
  "system_prompt": "You are a professional transcription service. Provide accurate, well-formatted transcripts.",
  "user_prompt": "Please transcribe this audio file. Include speaker identification if multiple speakers are present, and format it with proper punctuation and paragraphs.",
  "files": [
    {
      "path": "/meeting-recording.mp3"
    }
  ]
}

For interview or meeting transcripts:

{
  "user_prompt": "Transcribe this interview and provide a summary of key points discussed.",
  "files": [
    {
      "path": "/interview.wav"
    }
  ]
}

MCP Client Configuration

Add this server to your MCP client configuration:

{
  "mcpServers": {
    "aistudio": {
      "command": "npx",
      "args": ["-y", "aistudio-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here",
        "GEMINI_MODEL": "gemini-2.5-flash",
        "GEMINI_TIMEOUT": "600000",
        "GEMINI_MAX_OUTPUT_TOKENS": "16384",
        "GEMINI_MAX_FILES": "10",
        "GEMINI_MAX_TOTAL_FILE_SIZE": "50",
        "GEMINI_TEMPERATURE": "0.2"
      }
    }
  }
}
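
Once the client has launched the server with this configuration, it invokes the tool through the MCP tools/call method, which is a JSON-RPC 2.0 request. Most MCP clients build this message for you; the sketch below only illustrates its shape, and `buildToolCall` is a hypothetical helper rather than part of any SDK.

```typescript
// Shape of an MCP tools/call request as a JSON-RPC 2.0 message.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Wrap generate_content arguments in a tools/call request.
function buildToolCall(
  id: number,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: "generate_content", arguments: args },
  };
}

// Example: the image analysis request from the basic example.
const exampleRequest = buildToolCall(1, {
  user_prompt: "Analyze this image and describe what you see",
  files: [{ path: "/path/to/image.jpg" }],
});
const exampleWire = JSON.stringify(exampleRequest);
```

The serialized `exampleWire` string is what would travel over the client's transport to the server.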

Development

Setup

Make sure you have Node.js 20.0.0 or higher installed.

npm install
npm run build

Running locally

GEMINI_API_KEY=your_api_key npm run dev

License

MIT
