
VidMCP — Smart MCP Server for AI Video Generation

Created by aparna162
The missing layer between your AI assistant and every major video tool. Describe what you want — VidMCP picks the best model automatically, runs the full pipeline, and delivers your video. BYOK — bring your own API keys for Kling, Runway, Fal.ai and ElevenLabs. Free fallbacks included for everything. Works with Claude Desktop, ChatGPT, Cursor and Windsurf.
Overview

VidMCP

The missing layer between your AI assistant and every major video tool.

Describe what you want. VidMCP picks the best model, runs the pipeline, and delivers your video. No switching between tools. No manual API setup. Just results.

What it does

  • Smart routing — automatically picks Kling, Runway, or Fal.ai based on your prompt
  • Full pipeline — generate, transcribe, add audio, merge, quality check in one flow
  • Memory — remembers your style and preferences across sessions
  • Raw video processing — upload footage, auto-transcribe, detect filler words
  • File support — animate images, restyle footage, add voiceover
  • BYOK — bring your own API keys, pay providers directly, zero markup
  • Works everywhere — Claude Desktop, ChatGPT, Cursor, Windsurf

BYOK — Bring Your Own Keys

VidMCP follows a BYOK model. You connect your own API keys from Kling, Runway, Fal.ai, and ElevenLabs. You pay those providers directly at their standard rates. VidMCP charges only for the intelligence layer — smart routing, pipeline management, and memory — not for the generation itself.

This means:

  • No hidden markup on video generation costs
  • Full control over which providers you use
  • Switch providers any time without changing your workflow
  • Use free tiers where available

No keys at all? VidMCP still works using free fallbacks automatically.

Install

Make sure you have Python 3.11+ and uv installed.

Install uv if you don't have it:

Windows:

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Mac:

curl -LsSf https://astral.sh/uv/install.sh | sh

Clone and install:

git clone https://github.com/aparna162/vidmcp.git
cd vidmcp
uv sync

Add your API keys

Create a .env file in the project folder and paste this:

FAL_API_KEY=your_key
KLING_API_KEY=your_key
RUNWAY_API_KEY=your_key
ELEVENLABS_API_KEY=your_key
ANTHROPIC_API_KEY=your_key
QUALITY_THRESHOLD=3.5
MAX_RETRIES=3

Don't have all keys? No problem. VidMCP auto-detects what you have, uses the best available provider, and falls back to free options automatically.

Priority order VidMCP follows:

Video:   Kling → Fal.ai → Pollinations (free)
Audio:   ElevenLabs → Edge TTS (free)
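The priority chains above amount to "first provider whose key is present, otherwise the free fallback". A minimal sketch of that selection logic; the chain layout and function name are illustrative, not VidMCP's actual internals:

```python
import os

# Priority chains as listed above; a provider with env_var=None is a
# free fallback that needs no key.
VIDEO_CHAIN = [
    ("kling", "KLING_API_KEY"),
    ("fal", "FAL_API_KEY"),
    ("pollinations", None),  # free, always available
]
AUDIO_CHAIN = [
    ("elevenlabs", "ELEVENLABS_API_KEY"),
    ("edge_tts", None),  # free, always available
]

def pick_provider(chain):
    """Return the first provider whose API key is set, else the free fallback."""
    for name, env_var in chain:
        if env_var is None or os.environ.get(env_var):
            return name
    raise RuntimeError("no provider available")
```

With no keys set, `pick_provider(VIDEO_CHAIN)` returns `"pollinations"`; adding `FAL_API_KEY` promotes the choice to `"fal"`, and so on up the chain.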

Get free credits:

  • Fal.ai → fal.ai — free $1 on signup, no credit card
  • ElevenLabs → elevenlabs.io — free tier available
  • Pollinations — zero signup, always free, used as fallback

Connect to Claude Desktop

Step 1 — Find your config file:

Windows:

C:\Users\YOUR_NAME\AppData\Roaming\Claude\claude_desktop_config.json

Mac:

~/Library/Application Support/Claude/claude_desktop_config.json

Step 2 — Add this to the file, replacing YOUR_NAME with your Windows username:

{
  "mcpServers": {
    "vidmcp": {
      "command": "C:\\Users\\YOUR_NAME\\vidmcp\\.venv\\Scripts\\python.exe",
      "args": ["C:\\Users\\YOUR_NAME\\vidmcp\\server.py"],
      "cwd": "C:\\Users\\YOUR_NAME\\vidmcp"
    }
  }
}

Mac version:

{
  "mcpServers": {
    "vidmcp": {
      "command": "/Users/YOUR_NAME/vidmcp/.venv/bin/python",
      "args": ["/Users/YOUR_NAME/vidmcp/server.py"],
      "cwd": "/Users/YOUR_NAME/vidmcp"
    }
  }
}

Step 3 — Fully quit Claude Desktop and reopen it. VidMCP will appear in the tools menu.
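If VidMCP does not appear after restarting, a JSON typo in the config file is the most common cause. One quick check before retrying, assuming python3 is on your PATH (Mac path shown; use the Windows path from Step 1 on Windows):

```shell
# Sanity-check the config file's JSON syntax before restarting Claude Desktop
CONFIG="$HOME/Library/Application Support/Claude/claude_desktop_config.json"
if [ -f "$CONFIG" ]; then
  python3 -m json.tool "$CONFIG" > /dev/null && echo "config OK"
else
  echo "config not found: $CONFIG"
fi
```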

Connect to other AI tools

Works with any MCP-compatible client using the same config format above.

  • Cursor — paste config into Cursor MCP settings
  • Windsurf — paste config into Windsurf MCP settings
  • ChatGPT Desktop — paste config into ChatGPT MCP settings

How to use

Just talk naturally. No commands needed.

"make a cinematic video of a dancer in rain"
"clean up this recording and remove filler words"
"animate this product photo"
"remember that I prefer warm colour grades"
"make a 10 second video for my Instagram reel"
"upload this footage and transcribe it"

VidMCP figures out which tools to call automatically.
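At its simplest, this kind of dispatch can be sketched as a keyword heuristic that maps a request to one of the tools listed below. The mapping here is purely illustrative and not VidMCP's actual router:

```python
# Toy keyword-to-tool heuristic; keywords and defaults are assumptions
# for illustration, not VidMCP's real routing logic.
ROUTES = {
    "transcribe": "process_video",
    "filler": "process_video",
    "animate": "animate_image",
    "voiceover": "generate_audio",
    "remember": "remember",
}

def route(prompt: str) -> str:
    """Map a natural-language request to a tool name; default to video generation."""
    words = prompt.lower()
    for keyword, tool in ROUTES.items():
        if keyword in words:
            return tool
    return "generate_video"
```

So "animate this product photo" routes to animate_image, while an unmatched request like "make a cinematic video of a dancer in rain" falls through to generate_video.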

All 14 tools

Tool                 What it does
route_generation     Picks best model for your prompt
generate_video       Creates video via best available API
create_pipeline      Defines a multi-step workflow
run_pipeline         Executes pipeline async
get_pipeline_status  Tracks progress
quality_check        Scores generated output
process_video        Transcribes raw footage, finds filler words
generate_audio       Voiceover via ElevenLabs or free Edge TTS
upload_asset         Uploads local image, video, or audio
animate_image        Turns a photo into video
restyle_video        Applies style transfer to footage
remember             Stores your preferences
recall               Finds relevant past generations
learn_from           Rate outputs to improve routing
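The create/run/status trio above follows a common async job pattern: create registers the steps, run executes them in the background, and status reports progress. A minimal sketch of that pattern, with names and fields assumed for illustration:

```python
import asyncio

# In-memory job registry; VidMCP persists this in SQLite, but the
# create/run/status shape is the same.
PIPELINES: dict[str, dict] = {}

def create_pipeline(pipeline_id: str, steps: list[str]) -> None:
    """Register a multi-step workflow without running it."""
    PIPELINES[pipeline_id] = {"steps": steps, "done": 0, "status": "created"}

async def run_pipeline(pipeline_id: str) -> None:
    """Execute each step in order, updating progress as it goes."""
    job = PIPELINES[pipeline_id]
    job["status"] = "running"
    for _ in job["steps"]:
        await asyncio.sleep(0)  # stand-in for a real async step (generate, merge, ...)
        job["done"] += 1
    job["status"] = "complete"

def get_pipeline_status(pipeline_id: str) -> str:
    """Report status and step progress, e.g. 'running (1/3)'."""
    job = PIPELINES[pipeline_id]
    return f'{job["status"]} ({job["done"]}/{len(job["steps"])})'
```

Because run_pipeline is a coroutine, the server can launch it as a background task and answer get_pipeline_status calls while generation is still in flight.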

Models supported

Model            Best for                      Requires
Kling 3.0        Human motion, cinematic       KLING_API_KEY
Runway Gen-4     Style transfer, camera moves  RUNWAY_API_KEY
Wan 2.2 via Fal  General video, fast           FAL_API_KEY
Pollinations     Testing, always free          Nothing

Optional — raw video processing

Install Whisper for transcription and filler word detection:

uv add openai-whisper

Install ffmpeg:

Windows:

winget install ffmpeg

Mac:

brew install ffmpeg
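Once Whisper has produced a transcript, filler-word detection needs nothing beyond the standard library: a word-boundary scan over the text. A sketch, with an illustrative filler list rather than VidMCP's actual one:

```python
import re

# Illustrative filler list; a real detector would likely be configurable.
FILLERS = {"um", "uh", "like", "you know", "basically"}

def find_fillers(transcript: str) -> list[str]:
    """Return each filler-word occurrence, in order, from a transcript."""
    # Longest phrases first so "you know" wins over any shorter overlap.
    alternation = "|".join(re.escape(f) for f in sorted(FILLERS, key=len, reverse=True))
    pattern = r"\b(" + alternation + r")\b"
    return [m.group(1) for m in re.finditer(pattern, transcript.lower())]
```

The `\b` word boundaries keep "um" from matching inside words such as "album"; timestamps from Whisper segments would let the pipeline map each hit back to a cut point in the footage.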

Test it works

Run this before connecting to Claude Desktop:

uv run python server.py

If it starts silently with no errors, it is working. That is normal: the server produces no output while it waits for a client to connect.

Project structure

vidmcp/
├── server.py          — MCP server, all 14 tools
├── router.py          — smart model picker
├── pipeline.py        — async job manager
├── memory.py          — preferences and history
├── assets.py          — file handling
├── config.py          — settings and benchmarks
├── providers/
│   ├── kling.py       — Kling API
│   ├── runway.py      — Runway API
│   ├── fal_provider.py — Fal.ai + auto fallback
│   ├── elevenlabs.py  — audio + Edge TTS fallback
│   └── whisper_provider.py — transcription
└── .env               — your API keys (never committed)

Built with

  • Anthropic MCP SDK
  • Fal.ai — Wan 2.2 video generation
  • Kling AI — cinematic video
  • Runway — style transfer
  • ElevenLabs — voiceover
  • OpenAI Whisper — transcription
  • Microsoft Edge TTS — free audio fallback
  • SQLite — pipeline state and memory
  • Pollinations — free image fallback

License

MIT — free to use, modify, and build on top of.

Questions or feedback

Open an issue on GitHub or find me on LinkedIn.

Server Config

{
  "mcpServers": {
    "vidmcp": {
      "command": "python",
      "args": [
        "server.py"
      ],
      "cwd": "/path/to/vidmcp",
      "env": {
        "FAL_API_KEY": "<YOUR_FAL_API_KEY>",
        "KLING_API_KEY": "<YOUR_KLING_API_KEY>",
        "RUNWAY_API_KEY": "<YOUR_RUNWAY_API_KEY>",
        "ELEVENLABS_API_KEY": "<YOUR_ELEVENLABS_API_KEY>"
      }
    }
  }
}