
Local Rag

Created by TheWinci
Semantic code search for AI agents — hybrid vector + BM25 with cross-encoder reranking, AST-aware chunking for 24 languages, conversation memory, code annotations, and search analytics. Zero config, zero API keys. Just bunx.
Overview

MIMIRS

Named after Mímir, the figure in Norse mythology renowned for wisdom and knowledge.

Persistent project memory for AI coding agents. One command to set up, nothing to maintain.


Your agent starts every session blind — guessing filenames, grepping for keywords, burning context on irrelevant files, and forgetting everything you discussed yesterday.

On a real project, that costs 380K tokens per prompt and 12-second response times.

After indexing with mimirs: 91K tokens, 3 seconds. A 76% reduction — depending on your model and usage, that's hundreds to thousands of dollars in monthly API savings.

No API keys. No cloud. No Docker. Just bun and SQLite.

Works with: Claude Code  ·  Cursor  ·  Windsurf  ·  JetBrains (Junie)  ·  GitHub Copilot  ·  any MCP client

Auto-generated project wiki

One command turns your codebase into a structured, cross-linked markdown wiki — architecture docs, module pages, entity pages, guides, and Mermaid diagrams — all built from the semantic index. See the wiki generated for this project →

generated wiki example

Search quality

90–98% Recall@10. Benchmarked on four real codebases across three languages (120 queries total) — from 97 files to 8,553 — with known expected results per query. Full methodology in BENCHMARKS.md.

| Codebase | Language | Files | Queries | Recall@10 | MRR | Zero-miss |
|---|---|---|---|---|---|---|
| mimirs | TypeScript | 97 | 30 | 98.3% | 0.683 | 0.0% |
| Excalidraw | TypeScript | 693 | 30 | 96.7% | 0.442 | 3.3% |
| Django | Python | 3,090 | 30 | 93.3% | 0.688 | 6.7% |
| Kubernetes | Go | 8,553 | 30 | 90.0% | 0.589 | 10.0% |

Kubernetes excludes test files and demotes generated files. With searchTopK: 15, recall reaches 100%. See Kubernetes benchmarks for details.

How it compares

| | mimirs | No tool (grep + Read) | Context stuffing | Cloud RAG services |
|---|---|---|---|---|
| Setup | One command | Nothing | Nothing | API keys, accounts |
| Token cost | ~91K/prompt | ~380K/prompt | Entire codebase | Varies |
| Search quality | 90–98% Recall@10 | Depends on keywords | N/A (everything loaded) | Varies |
| Code understanding | AST-aware (24 langs) | Line-level | None | Usually line-level |
| Cross-session memory | Conversations + checkpoints | None | None | Some |
| Privacy | Fully local | Local | Local | Data leaves your machine |
| Price | Free | Free | High token bills | $10–50/mo + tokens |

What it gives your agent

Find code by meaning, not filename. "Where do we handle authentication errors?" → mimirs finds middleware/session-guard.ts. Hybrid vector + BM25 search, boosted by dependency graph centrality.
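The hybrid ranking described above could be sketched roughly as follows. The function names, default weights, and score normalization here are illustrative assumptions, not mimirs' actual internals:

```typescript
// Hypothetical sketch of hybrid score blending; not mimirs' real code.
interface Hit { path: string; vectorScore: number; bm25Score: number }

// Normalize BM25 scores to [0, 1] so they are comparable with cosine similarity.
function normalize(scores: number[]): number[] {
  const max = Math.max(...scores, 1e-9);
  return scores.map((s) => s / max);
}

// Blend vector and BM25 scores with a configurable weight, then boost by
// dependency-graph centrality (assumed precomputed, in [0, 1]).
function blend(
  hits: Hit[],
  centrality: Map<string, number>,
  vectorWeight = 0.6,
  centralityBoost = 0.1,
): { path: string; score: number }[] {
  const bm25 = normalize(hits.map((h) => h.bm25Score));
  return hits
    .map((h, i) => ({
      path: h.path,
      score:
        vectorWeight * h.vectorScore +
        (1 - vectorWeight) * bm25[i] +
        centralityBoost * (centrality.get(h.path) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```

The centrality boost is what lets a heavily imported file like `middleware/session-guard.ts` outrank a peripheral file with a similar text match.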

Remember past sessions. Conversation transcripts are indexed in real time. Three days later, your agent can search for "why did we switch to JWT?" and get the exact discussion.

Know what changed since last time. git_context shows uncommitted changes and recent commits in one call, so agents don't propose edits that conflict with in-progress work.

Leave notes for future sessions. annotate attaches persistent caveats to files or symbols — "known race condition", "blocked on auth rewrite" — that surface automatically in search results.

Mark decisions, not just code. Checkpoints capture milestones, direction changes, and blockers. Searchable across sessions so context doesn't evaporate.

Understand codebase structure. Dependency graphs, reverse-dependency lookups, and find_usages show the blast radius before any refactor.
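A blast-radius query over a reverse-dependency map amounts to a transitive-importers walk. This is a minimal sketch under assumed data shapes, not mimirs' implementation:

```typescript
// Hypothetical sketch: given a reverse-dependency map (file -> list of files
// that import it), collect every file transitively affected by changing `target`.
function blastRadius(
  reverseDeps: Map<string, string[]>,
  target: string,
): Set<string> {
  const affected = new Set<string>();
  const queue = [target];
  while (queue.length > 0) {
    const file = queue.shift()!;
    for (const importer of reverseDeps.get(file) ?? []) {
      if (!affected.has(importer)) {
        affected.add(importer); // importer is transitively affected
        queue.push(importer);
      }
    }
  }
  return affected;
}
```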

Generate a project wiki. generate_wiki produces a structured, cross-linked markdown wiki — architecture docs, module pages, entity pages, guides, and Mermaid diagrams — all built from the semantic index.

Expose documentation gaps. Analytics log every query locally — nothing leaves your machine. Zero-result and low-relevance queries reveal what's missing from your docs.
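Surfacing gaps from a local query log is essentially a filter-and-count over misses. A sketch, with an assumed log shape and an assumed relevance threshold:

```typescript
// Hypothetical sketch of gap detection over a local query log.
interface QueryLog { query: string; resultCount: number; topScore: number }

// Collect zero-result and low-relevance queries so recurring gaps float to the top.
function documentationGaps(
  log: QueryLog[],
  minScore = 0.3,
): { query: string; misses: number }[] {
  const misses = new Map<string, number>();
  for (const entry of log) {
    if (entry.resultCount === 0 || entry.topScore < minScore) {
      misses.set(entry.query, (misses.get(entry.query) ?? 0) + 1);
    }
  }
  return [...misses.entries()]
    .map(([query, n]) => ({ query, misses: n }))
    .sort((a, b) => b.misses - a.misses);
}
```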

Quick start

1. Install SQLite (macOS)

Apple's bundled SQLite doesn't support extensions:

```shell
brew install sqlite
```

2. Set up your editor

```shell
bunx mimirs init --ide claude   # or: cursor, windsurf, copilot, jetbrains, all
```

This creates the MCP server config, editor rules, .mimirs/config.json, and .gitignore entry. Run with --ide all to set up every supported editor at once.
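The full schema of `.mimirs/config.json` isn't documented here; the only setting this README mentions is `searchTopK` (see the Kubernetes benchmark note). A minimal, hypothetical example that widens search results:

```json
{
  "searchTopK": 15
}
```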

3. Try the demo (optional)

```shell
bunx mimirs demo
```

Claude Code plugin

For deeper integration, mimirs is also available as a Claude Code plugin. In a Claude Code session:

```
/plugin marketplace add https://github.com/TheWinci/mimirs.git
/plugin install mimirs
```

The plugin adds SessionStart (context summary), PostToolUse (auto-reindex on edit), and SessionEnd (auto-checkpoint) hooks. No CLAUDE.md instructions needed — the plugin's built-in skill handles tool usage.

How it works

  1. Parse & chunk — Splits content using type-matched strategies: function/class boundaries for code (via tree-sitter across 24 languages), headings for markdown, top-level keys for YAML/JSON. Chunks that exceed the embedding model's token limit are windowed and merged.

  2. Embed — Each chunk becomes a 384-dimensional vector using all-MiniLM-L6-v2 (in-process via Transformers.js + ONNX, no API calls). Vectors are stored in sqlite-vec.

  3. Build dependency graph — Import specifiers and exported symbols are captured during AST chunking, then resolved to build a file-level dependency graph.

  4. Hybrid search — Queries run vector similarity and BM25 in parallel, blended by configurable weight. Results are boosted by dependency graph centrality and path heuristics. read_relevant returns individual chunks with entity names and exact line ranges (path:start-end).

  5. Watch & re-index — File changes are detected with a 2-second debounce. Changed files are re-indexed; deleted files are pruned.

  6. Conversation & checkpoints — Tails Claude Code's JSONL transcripts in real time. Agents can create checkpoints at important moments for future sessions to search.

  7. Annotations — Notes attached to files or symbols surface as [NOTE] blocks inline in read_relevant results.

  8. Analytics — Every query is logged. Analytics surface zero-result queries, low-relevance queries, and period-over-period trends.
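The windowing in step 1 can be illustrated like this, using word count as a stand-in for the embedding model's tokenizer (the real tokenizer and limits are not shown in this README):

```typescript
// Illustrative sketch of step 1's windowing: an oversized chunk is split into
// overlapping windows so each piece fits the embedding model's token limit.
// `overlap` must be smaller than `maxTokens` for the loop to advance.
function windowChunk(
  text: string,
  maxTokens: number,
  overlap: number,
): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  if (words.length <= maxTokens) return [text]; // already fits, no windowing
  const windows: string[] = [];
  const step = maxTokens - overlap;
  for (let start = 0; start < words.length; start += step) {
    windows.push(words.slice(start, start + maxTokens).join(" "));
    if (start + maxTokens >= words.length) break; // last window reached the end
  }
  return windows;
}
```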

Supported languages

AST-aware chunking via bun-chunk with tree-sitter grammars:

TypeScript/JavaScript, Python, Go, Rust, Java, C, C++, C#, Ruby, PHP, Scala, Kotlin, Lua, Zig, Elixir, Haskell, OCaml, Dart, Bash/Zsh, TOML, YAML, HTML, CSS/SCSS/LESS

Also indexes: Markdown, JSON, XML, SQL, GraphQL, Protobuf, Terraform, Dockerfiles, Makefiles, and more. Files without a known extension fall back to paragraph splitting.
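The paragraph-splitting fallback for unrecognized extensions is conceptually simple; a sketch (illustrative only, not the actual splitter):

```typescript
// Sketch of the fallback chunker for unknown file types: split on blank
// lines and drop empty fragments, so each paragraph becomes one chunk.
function splitParagraphs(text: string): string[] {
  return text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}
```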

Documentation

Stack

| Layer | Choice |
|---|---|
| Runtime | Bun (built-in SQLite, fast TS) |
| AST chunking | bun-chunk — tree-sitter grammars for 24 languages |
| Embeddings | Transformers.js + ONNX (in-process, no daemon) |
| Embedding model | all-MiniLM-L6-v2 (~23 MB, 384 dimensions) — configurable |
| Vector store | sqlite-vec (single .db file) |
| MCP | @modelcontextprotocol/sdk (stdio transport) |
| Plugin | Claude Code plugin with skills + hooks |

All data lives in .mimirs/ inside your project — add it to .gitignore.

Server Config

```json
{
  "mcpServers": {
    "local-rag": {
      "command": "bunx",
      "args": [
        "@winci/local-rag@latest",
        "serve"
      ],
      "env": {
        "RAG_PROJECT_DIR": "/path/to/your/project"
      }
    }
  }
}
```