# mcp-rag-server

mcp-rag-server is a Model Context Protocol (MCP) server that enables Retrieval Augmented Generation (RAG). It indexes your documents and serves relevant context to Large Language Models (LLMs) via the MCP protocol, empowering LLMs to answer questions based on your document content.
## Integration Examples

### Generic MCP Client Configuration

```json
{
  "mcpServers": {
    "rag": {
      "command": "npx",
      "args": ["-y", "mcp-rag-server"],
      "env": {
        "BASE_LLM_API": "http://localhost:11434/v1",
        "EMBEDDING_MODEL": "nomic-embed-text",
        "VECTOR_STORE_PATH": "./vector_store",
        "CHUNK_SIZE": "500"
      }
    }
  }
}
```
### Example Interaction

```text
# Index documents
>> tool:embedding_documents {"path":"./docs"}

# Check status
>> resource:embedding-status
<< rag://embedding/status
Current Path: ./docs/file1.md
Completed: 10
Failed: 0
Total chunks: 15
Failed Reason:
```
## Table of Contents

- [Integration Examples](#integration-examples)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Configuration](#configuration)
- [Usage](#usage)
- [How RAG Works](#how-rag-works)
- [Development](#development)
- [Contributing](#contributing)
- [License](#license)
## Features

- Index documents in `.txt`, `.md`, `.json`, `.jsonl`, and `.csv` formats
- Customizable chunk size for splitting text
- Local vector store powered by SQLite (via LangChain's LibSQLVectorStore)
- Supports multiple embedding providers (OpenAI, Ollama, Granite, Nomic)
- Exposes MCP tools and resources over stdio for seamless integration with MCP clients
## Installation

### From npm

```bash
npm install -g mcp-rag-server
```

### From Source

```bash
git clone https://github.com/kwanLeeFrmVi/mcp-rag-server.git
cd mcp-rag-server
npm install
npm run build
npm start
```
## Quick Start

```bash
export BASE_LLM_API=http://localhost:11434/v1
export EMBEDDING_MODEL=granite-embedding-278m-multilingual-Q6_K-1743674737397:latest
export VECTOR_STORE_PATH=./vector_store
export CHUNK_SIZE=500

# Run (global install)
mcp-rag-server

# Or via npx
npx mcp-rag-server
```

💡 Tip: We recommend using Ollama for embedding. Install Ollama, then pull the `nomic-embed-text` model:

```bash
ollama pull nomic-embed-text
export EMBEDDING_MODEL=nomic-embed-text
```
## Configuration

| Variable | Description | Default |
|---|---|---|
| `BASE_LLM_API` | Base URL for the embedding API | `http://localhost:11434/v1` |
| `LLM_API_KEY` | API key for your LLM provider | (empty) |
| `EMBEDDING_MODEL` | Embedding model identifier | `nomic-embed-text` |
| `VECTOR_STORE_PATH` | Directory for the local vector store | `./vector_store` |
| `CHUNK_SIZE` | Characters per text chunk (number) | `500` |

💡 Recommendation: Use Ollama embedding models such as `nomic-embed-text` for best performance.
## Usage

### MCP Tools

Once running, the server exposes these tools via MCP:

- `embedding_documents(path: string)`: Index documents under the given path
- `query_documents(query: string, k?: number)`: Retrieve the top `k` chunks (default 15)
- `remove_document(path: string)`: Remove a specific document
- `remove_all_documents(confirm: boolean)`: Clear the entire index (requires `confirm=true`)
- `list_documents()`: List all indexed document paths
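On the wire, MCP tool invocations travel as JSON-RPC `tools/call` requests. As a rough sketch (the envelope comes from the MCP specification; the argument values here are made up for illustration), a `query_documents` call looks like:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "query_documents",
    "arguments": { "query": "How do I configure chunk size?", "k": 5 }
  }
}
```

Most MCP clients construct this message for you; it is shown only to clarify what the tool names and parameters above map to.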
### MCP Resources

Clients can also read resources via URIs:

- `rag://documents`: List all document URIs
- `rag://document/{path}`: Fetch the full content of a document
- `rag://query-document/{numberOfChunks}/{query}`: Query documents as a resource
- `rag://embedding/status`: Check current indexing status (completed, failed, total)
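A client might compose these URIs with small helpers like the following sketch. The URI shapes come from the list above; the use of `encodeURIComponent` for the free-text query segment is an assumption about what the server expects, not documented behavior.

```typescript
// Hypothetical helpers for building rag:// resource URIs.
// URI templates are taken from this README; query encoding is an assumption.

function documentUri(path: string): string {
  // e.g. a document previously indexed from ./docs
  return `rag://document/${path}`;
}

function queryDocumentUri(numberOfChunks: number, query: string): string {
  // Percent-encode the query so spaces and slashes survive the URI.
  return `rag://query-document/${numberOfChunks}/${encodeURIComponent(query)}`;
}
```

For example, `queryDocumentUri(5, "hello world")` yields `rag://query-document/5/hello%20world`.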
## How RAG Works

1. **Indexing:** Reads files, splits text into chunks based on `CHUNK_SIZE`, and queues them for embedding.
2. **Embedding:** Processes each chunk sequentially against the embedding API, storing vectors in SQLite.
3. **Querying:** Embeds the query and retrieves the nearest text chunks from the vector store, returning them to the client.
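The pipeline above can be sketched in a few lines. This is a simplified illustration, not the server's actual code: the real server calls an embedding API and stores vectors in SQLite, whereas here the vector store is an in-memory array, the similarity metric (cosine) is an assumption, and chunking is plain fixed-size character splitting.

```typescript
// A stored chunk: the original text plus its embedding vector.
type Stored = { chunk: string; vector: number[] };

// Step 1 (Indexing): split text into CHUNK_SIZE-character chunks.
function chunkText(text: string, chunkSize: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Step 3 (Querying): rank stored chunks by similarity to the query
// vector and return the top k, like query_documents(query, k).
function nearestChunks(store: Stored[], queryVector: number[], k: number): string[] {
  return [...store]
    .sort((x, y) => cosine(y.vector, queryVector) - cosine(x.vector, queryVector))
    .slice(0, k)
    .map((s) => s.chunk);
}
```

Step 2 (embedding each chunk) is elided: in practice it is one API call per chunk against `BASE_LLM_API` using `EMBEDDING_MODEL`.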
## Development

```bash
npm install
npm run build  # Compile TypeScript
npm start      # Run server
npm run watch  # Watch for changes
```
## Contributing

Contributions are welcome! Please open issues or pull requests on GitHub.
## License

MIT © 2025 Quan Le