# Transform any MCP-compatible LLM into a codebase expert through semantic intelligence
A blazingly fast GraphRAG implementation, written in 100% Rust, for indexing and querying large codebases with natural language.
Supports multiple embedding providers via three modes: `cpu` (no graph, just AST parsing), `onnx` (blazingly fast, medium-quality embeddings with Qdrant/all-MiniLM-L6-v2-onnx), and `ollama` (slower but SOTA embeddings with hf.co/nomic-ai/nomic-embed-code-GGUF:Q4_K_M).
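For a concrete picture, selecting a mode could look roughly like the sketch below. The `codegraph` binary name, the `--mode` flag, and the environment variable are illustrative assumptions, not the documented interface; see the installation instructions for the real one.

```bash
# Hypothetical invocations -- flag and variable names are assumptions.

# AST parsing only, no graph or embeddings:
codegraph index . --mode cpu

# Fast, medium-quality embeddings via ONNX:
codegraph index . --mode onnx

# Slower, SOTA embeddings via a local Ollama server:
CODEGRAPH_EMBEDDING_MODEL="hf.co/nomic-ai/nomic-embed-code-GGUF:Q4_K_M" \
  codegraph index . --mode ollama
```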
I would argue this is the fastest codebase indexer on GitHub at the moment.
Includes a stdio MCP server built with the Rust SDK, so your agents can query the indexed code graph with natural language and get deep insights from your codebase before starting development or making changes. Currently supports TypeScript, JavaScript, Rust, Go, Python, and C++ codebases.
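As a sketch of how an agent picks this up: any MCP client that speaks stdio registers the server by its launch command. With Claude Code, for example (the `codegraph serve --mcp` command below is an assumption; substitute the actual command from the installation instructions):

```bash
# Register the stdio MCP server with an MCP-compatible client.
# "codegraph serve --mcp" is an assumed launch command, not the documented one.
claude mcp add codegraph -- codegraph serve --mcp
```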
## 📊 Performance Benchmarking (M4 Max, 128 GB)
### Production Codebase Results (1,505 files, 2.5M lines; Python, JavaScript, TypeScript, and Go)
```
🎉 INDEXING COMPLETE!

📊 Performance Summary
┌──────────────────────────────────────┐
│ 📁 Files:      1,505 indexed         │
│ 📄 Lines:      2,477,824 processed   │
│ 🔧 Functions:  30,669 extracted      │
│ 🏛️ Classes:    880 extracted         │
│ 💾 Embeddings: 538,972 generated     │
└──────────────────────────────────────┘
```
### Embedding Provider Performance Comparison
| Provider | Time | Quality | Use Case |
| --- | --- | --- | --- |
| 🧠 Ollama (nomic-embed-code) | ~15-18 h | SOTA retrieval accuracy | Production, smaller codebases |
| ⚡ ONNX (all-MiniLM-L6-v2) | 32 m 22 s | Good general embeddings | Large codebases, lunch-break indexing |
| LEANN | ~4 h | — | The next best thing I could find on GitHub |
### CodeGraph Advantages
- ✅ **Incremental Updates**: only reprocess changed files (LEANN can't do this); see the sketch after this list
- ✅ **Provider Choice**: speed vs. quality optimization based on your needs
- ✅ **Memory Optimization**: automatic tuning based on your system
- ✅ **Production Ready**: index 2.5M lines while having lunch
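To make the incremental-update point concrete, here is a minimal sketch of the general technique (content-hash each file, reprocess only files whose hash changed). This is illustrative only, not CodeGraph's actual implementation:

```bash
# Minimal sketch of hash-based incremental indexing -- the general
# technique, not CodeGraph's actual code.
MANIFEST=".index/manifest.sha256"
mkdir -p .index && touch "$MANIFEST"

find src -type f -name '*.rs' -print0 | while IFS= read -r -d '' file; do
  new_hash=$(shasum -a 256 "$file" | cut -d' ' -f1)
  old_hash=$(grep -F -- "$file" "$MANIFEST" | awk '{print $1}')
  if [ "$new_hash" != "$old_hash" ]; then
    # Only changed (or new) files would be re-parsed and re-embedded.
    echo "changed: $file"
  fi
done
# A real indexer would then rewrite the manifest with the fresh hashes.
```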
Read the README.md carefully: the installation is complex. It requires downloading the embedding model in ONNX format, installing Ollama, and setting multiple environment variables (I would recommend setting these in your bash configuration).
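As an illustration, that shell setup could look roughly like this. The `ollama pull` and `huggingface-cli` commands and the model IDs come from above; the `CODEGRAPH_*` variable names are placeholders for whatever the installation instructions actually specify:

```bash
# Pull the embedding models referenced above.
ollama pull hf.co/nomic-ai/nomic-embed-code-GGUF:Q4_K_M
huggingface-cli download Qdrant/all-MiniLM-L6-v2-onnx \
  --local-dir "$HOME/models/all-MiniLM-L6-v2-onnx"

# Add to ~/.bashrc (or ~/.zshrc). The CODEGRAPH_* names below are
# placeholders -- check the installation instructions for the real ones.
export CODEGRAPH_MODE=onnx
export CODEGRAPH_ONNX_MODEL_DIR="$HOME/models/all-MiniLM-L6-v2-onnx"
export OLLAMA_HOST="http://127.0.0.1:11434"   # Ollama's default endpoint
```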