Rag Starter

Created By

cstamigo-droid

Overview

rag-starter — chat with your documents (RAG), with citations

A production-ready starter that turns a folder of documents into a cited Q&A service. Drop in your PDFs / Markdown / text, ask questions, get answers grounded in the source — every claim traceable to the exact passage it came from.

Exposed two ways from one codebase:

MCP server — plug it into Claude Desktop / Claude Code / any MCP host and chat with your docs.
HTTP API (FastAPI) — call it from any app.

Built once, reskinned per client. Swap the data/ folder, tweak config.py, ship.

Why it's different

Keyless by default. Embeddings run locally (ONNX MiniLM) — no API key, no per-query cost, runs offline. Demo it anywhere in seconds.
Citations, not hallucinations. Retrieval returns ranked passages tagged [source#chunk] / [file.pdf p3]. A missing answer returns "Not found in the documents" — never a fabricated one.
Answer synthesis is optional. With an ANTHROPIC_API_KEY set, the server writes a cited answer for you; without one, it returns the passages for the host LLM to answer from. Retrieval works either way.
Idempotent ingestion. Re-ingesting a file updates it in place (no duplicates).
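To make the idempotent-ingestion point concrete, here is a minimal sketch of one way to get that behavior: derive each chunk's ID from its source and position, and drop a source's old chunks before writing the new ones. It is an illustration, not the project's ingest.py. The plain dict stands in for the vector store, and `chunk_text`, `ingest`, and the default sizes are hypothetical names and values chosen for the example.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def ingest(store: dict[str, dict], source: str, text: str) -> int:
    """Index `text` under `source`; calling it again replaces the old chunks."""
    # Remove chunks left over from an earlier version of the same source,
    # so a file that shrank does not keep stale trailing chunks.
    for key in [k for k, v in store.items() if v["source"] == source]:
        del store[key]
    chunks = chunk_text(text)
    for index, chunk in enumerate(chunks):
        # Deterministic ID in the [source#chunk] shape used for citations.
        store[f"{source}#{index}"] = {"source": source, "text": chunk}
    return len(chunks)
```

Because the IDs are deterministic, ingesting the same file twice leaves the store with the same keys rather than a second copy of every chunk.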

Quickstart

pip install -r requirements.txt          # or: pip install -e .
PYTHONUTF8=1 python tests/test_smoke.py   # proves retrieval works on the sample docs

As an HTTP API

rag-starter-api            # uvicorn on 127.0.0.1:8000
# then:
curl -X POST localhost:8000/ingest  -H "content-type: application/json" -d '{"path":"./data"}'
curl -X POST localhost:8000/search  -H "content-type: application/json" -d '{"query":"refund policy"}'
curl -X POST localhost:8000/answer  -H "content-type: application/json" -d '{"query":"how long is the refund window?"}'
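The same calls can be made from Python using only the standard library. This is a sketch under two assumptions: the API is running on 127.0.0.1:8000 as above, and the endpoints reply with JSON (the response shape is not documented here, so the sketch just returns whatever is decoded). `build_request` and `call` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # matches the uvicorn address above


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the /ingest, /search, /answer endpoints."""
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(endpoint: str, payload: dict) -> object:
    """Send the request and decode the body, assuming the server replies with JSON."""
    with urllib.request.urlopen(build_request(endpoint, payload), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the API running:
#   call("ingest", {"path": "./data"})
#   call("search", {"query": "refund policy"})
#   call("answer", {"query": "how long is the refund window?"})
```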

As an MCP server (Claude Desktop)

Add to claude_desktop_config.json → mcpServers:

{
  "rag-starter": {
    "command": "python",
    "args": ["-m", "rag_starter"],
    "cwd": "C:/path/to/rag-starter"
  }
}

Restart Claude Desktop, then ask it to "Ingest the folder ./data", followed by "What's the refund policy?"
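If the config file already lists other servers, the entry has to be merged in rather than pasted over them. The helper below is a small sketch of that merge on an in-memory dict; `add_server` is a hypothetical name, and reading and writing the actual file (whose location depends on your OS) is left to the caller.

```python
import json


def add_server(config: dict, name: str, project_dir: str) -> dict:
    """Return a copy of `config` with the rag-starter entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any servers already configured
    servers[name] = {
        "command": "python",
        "args": ["-m", "rag_starter"],
        "cwd": project_dir,
    }
    merged["mcpServers"] = servers
    return merged


# Example: print the merged config for an otherwise empty file.
print(json.dumps(add_server({}, "rag-starter", "C:/path/to/rag-starter"), indent=2))
```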

Tools / endpoints

| MCP tool | HTTP | Purpose |
| --- | --- | --- |
| `rag_ingest(path)` | `POST /ingest` | Index a file or folder (txt/md/pdf). |
| `rag_search(query, k)` | `POST /search` | Top-k cited passages (keyless). |
| `rag_answer(query, k)` | `POST /answer` | Cited answer (with key) or passages (without). |
| (none) | `GET /health` | Liveness, indexed-chunk count, and embedding backend. |
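The two modes of `rag_answer` can be pictured as a small dispatch over the retrieved passages. This is a sketch of the behavior described in this README, not the code in answer.py: the `answer` function, the `mode` field, and the passage dict layout are assumptions made for illustration, and the synthesizer is passed in as a plain callable so the example needs no API key.

```python
from typing import Callable, Optional

NOT_FOUND = "Not found in the documents"


def format_passages(passages: list[dict]) -> str:
    """Render passages one per line, each prefixed with its citation tag."""
    return "\n".join(f"[{p['citation']}] {p['text']}" for p in passages)


def answer(
    query: str,
    passages: list[dict],
    synthesize: Optional[Callable[[str, str], str]] = None,
) -> dict:
    """Return a synthesized answer when a synthesizer is available, else the passages."""
    if not passages:
        # Nothing retrieved: say so instead of inventing an answer.
        return {"mode": "not_found", "text": NOT_FOUND}
    context = format_passages(passages)
    if synthesize is None:
        # Keyless mode: hand the cited passages to the host LLM.
        return {"mode": "passages", "text": context}
    return {"mode": "answer", "text": synthesize(query, context)}
```

Keeping the "no passages" check ahead of the synthesizer call means an empty retrieval never reaches the model, which is one simple way to honor the "never a fabricated one" promise.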

Configuration

Everything lives in config.py (or environment variables / .env; see .env.example): chunk size and overlap, top-k, embedding backend (local by default, or openai), and the answer model. A client reskin is usually just: replace data/, set RAG_COLLECTION, re-ingest.
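As a rough picture of how such settings might be read from the environment, here is a sketch of a settings loader. Only RAG_COLLECTION is a name taken from this README; the other variable names (RAG_CHUNK_SIZE, RAG_CHUNK_OVERLAP, RAG_TOP_K, RAG_EMBEDDING_BACKEND) and all default values are hypothetical, so check .env.example for the real ones.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults below are placeholders for illustration, not the project's values.
    collection: str = "default"
    chunk_size: int = 800
    chunk_overlap: int = 100
    top_k: int = 5
    embedding_backend: str = "local"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment-style variables, falling back to defaults."""
    defaults = Settings()
    backend = env.get("RAG_EMBEDDING_BACKEND", defaults.embedding_backend)
    if backend not in ("local", "openai"):
        raise ValueError(f"unknown embedding backend: {backend!r}")
    return Settings(
        collection=env.get("RAG_COLLECTION", defaults.collection),
        chunk_size=int(env.get("RAG_CHUNK_SIZE", defaults.chunk_size)),
        chunk_overlap=int(env.get("RAG_CHUNK_OVERLAP", defaults.chunk_overlap)),
        top_k=int(env.get("RAG_TOP_K", defaults.top_k)),
        embedding_backend=backend,
    )
```

With a loader like this, the reskin described above comes down to pointing RAG_COLLECTION at a new collection name and re-ingesting the new data/ folder.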

Architecture

documents ──ingest.py──> chunks(+citations) ──store.py──> Chroma (local, persistent)
                                                              │
question ─────────────────retrieve.py──> top-k cited passages┤
                                                              ├─> rag_search  (host LLM answers)
                                                  answer.py ──┴─> rag_answer  (server synthesizes, optional)
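The retrieval step in the diagram amounts to ranking stored chunks by similarity to the question and returning the best k with their citations. The sketch below shows that idea in plain Python with hand-written toy vectors. In the real project the embeddings come from the local model and the search is done by Chroma; cosine similarity, the `top_k` function, and the item layout are assumptions made for this example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: list[dict], k: int = 3) -> list[dict]:
    """Return the k most similar chunks, best first, each tagged source#chunk."""
    scored = [(cosine(query_vec, item["vector"]), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {
            "citation": f"{item['source']}#{item['chunk']}",
            "score": score,
            "text": item["text"],
        }
        for score, item in scored[:k]
    ]


# Toy index: two-dimensional vectors stand in for real embeddings.
INDEX = [
    {"source": "policy.md", "chunk": 0, "vector": [1.0, 0.0], "text": "Refund rules."},
    {"source": "policy.md", "chunk": 1, "vector": [0.7, 0.7], "text": "Shipping rules."},
    {"source": "faq.md", "chunk": 0, "vector": [0.0, 1.0], "text": "Contact details."},
]
```

Calling `top_k([1.0, 0.1], INDEX, k=2)` ranks the policy.md chunks first, since their toy vectors point closest to the query vector.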

Reuses the mcp-factory contract (Result, formatting, cache) so it composes with the rest of the catalog.

License

MIT.

Server Config

{
  "mcpServers": {
    "rag-starter": {
      "command": "python",
      "args": [
        "-m",
        "rag_starter"
      ],
      "cwd": "C:/path/to/rag-starter"
    }
  }
}
