🚀 Obsidian-Milvus-FastMCP

Created by jayjeo, 8 months ago

A powerful, production-ready system that connects your Obsidian vault to Claude Desktop via FastMCP, leveraging Milvus vector database for intelligent document search and retrieval.

Conventional note-app MCP servers (Notion MCP, Obsidian MCP) compute embeddings quickly and return convenient question-answering results, but those results are only suited to lightweight, everyday use. They are not suitable when you have built up a large volume of material in Obsidian and want a single question to draw on many notes and PDFs for in-depth analysis. I created obsidian-milvus-FastMCP to address this need. As a Ph.D. in Economics, I store a large amount of research material in Obsidian and need comprehensive analytical results that answer my questions.

Warning! This program is heavy on memory (RAM) because the Milvus server must stay running constantly, even when no embedding task is in progress. It is therefore not suitable for PCs with limited memory or for laptops where power consumption needs to be conserved. Additionally, this program supports only Windows 10 and 11.

✨ Features

📝 Video Showcase

https://youtu.be/wPFiG9mC7e8?si=uF-TJrgG-guC33JG

Capabilities

🔍 Core Search Capabilities

🟢 Intelligent Semantic Search: Converts documents and queries into vector embeddings to retrieve highly relevant documents based on semantic similarity.
🟢 Multi-modal Search: Indexes both Markdown notes and PDF attachments so that all textual content can be searched together.
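
To make the idea of ranking by semantic similarity concrete, here is a minimal sketch that uses toy 3-dimensional vectors in place of real embeddings. The function names, file names, and vectors are illustrative assumptions, not the project's code; in the actual system the vectors would come from a Sentence-Transformers model and the search would run inside Milvus.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy "embeddings" (hypothetical): Markdown notes and a PDF share one vector space.
docs = {
    "inflation.md": [0.9, 0.1, 0.0],
    "recipes.md": [0.0, 0.2, 0.9],
    "labor-market.pdf": [0.7, 0.6, 0.1],
}
results = rank_by_similarity([1.0, 0.2, 0.0], docs, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because notes and PDFs are embedded into the same vector space, a single query can rank both kinds of file together.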

🧠 Advanced AI & RAG Features

🏷️ Advanced Metadata Filtering

🟢 Basic Tag-Based Filtering: Restricts search results based on a single tag, quickly finding documents labeled with that tag.

⚡ Performance Optimization

🟢 HNSW Index Optimization: Automatically adjusts HNSW index parameters (e.g., ef, M) based on collection size to optimize search speed and accuracy.
🟢 Batch Processing Optimization: Processes document embeddings and Milvus insertions in batches to ensure high throughput for large-scale operations.
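
As a rough illustration of size-based HNSW tuning, the sketch below picks index and search parameters from the number of vectors. The thresholds, the chosen values, the COSINE metric, and the function names are assumptions made for this example; the project's actual rules may differ. The returned dictionaries follow the shape that pymilvus accepts for index and search parameters.

```python
def choose_hnsw_index_params(num_vectors: int) -> dict:
    """Return HNSW index parameters scaled to collection size (illustrative thresholds)."""
    if num_vectors < 10_000:
        m, ef_construction = 8, 64      # small collection: cheap index, fast build
    elif num_vectors < 1_000_000:
        m, ef_construction = 16, 200    # medium collection: balanced
    else:
        m, ef_construction = 32, 400    # large collection: denser graph for recall
    return {
        "index_type": "HNSW",
        "metric_type": "COSINE",        # assumption: cosine similarity on text embeddings
        "params": {"M": m, "efConstruction": ef_construction},
    }

def choose_search_params(top_k: int, accuracy_boost: int = 4) -> dict:
    """Return search parameters; Milvus requires ef to be at least top_k."""
    return {
        "metric_type": "COSINE",
        "params": {"ef": max(top_k * accuracy_boost, 64)},
    }

print(choose_hnsw_index_params(250_000))
print(choose_search_params(top_k=10))
```

Larger `M` and `efConstruction` values generally improve recall at the cost of memory and build time, while `ef` trades query latency for accuracy at search time.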

📊 System Intelligence

🟢 Auto-tuning: Detects collection size and hardware resources (CPU/GPU) to automatically optimize index building and search parameters.
🟢 Resource Management: Monitors GPU availability and system memory in real time to offload embedding computations to GPU or adjust batch sizes for efficient resource usage.
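
The following sketch shows one way batch sizes could be adapted to available resources and then applied to a list of text chunks. The memory thresholds, base sizes, and function names are hypothetical; they only illustrate the technique and do not describe the project's real tuning logic.

```python
from typing import Iterator, List, Sequence

def pick_batch_size(free_memory_gb: float, has_gpu: bool) -> int:
    """Choose an embedding batch size from free memory and GPU availability."""
    base = 64 if has_gpu else 16        # assumption: GPUs handle larger batches
    if free_memory_gb < 4:
        return max(base // 4, 1)
    if free_memory_gb < 8:
        return base // 2
    return base

def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

chunks = [f"chunk {i}" for i in range(10)]
size = pick_batch_size(free_memory_gb=6.0, has_gpu=False)
for batch in batched(chunks, size):
    print(len(batch), "chunks in this batch")
```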

🌐 Integration & Connectivity

🟢 Claude Desktop Integration: Connects to Claude AI via the FastMCP protocol, allowing Claude to query the Obsidian vault directly and retrieve results.
🟢 Real-time File Monitoring: Watches the Obsidian vault folder in real time and updates the index automatically when files change, ensuring the index is always up-to-date.
🟢 Multilingual Support: Uses multilingual Sentence-Transformers embedding models to search content in Korean, English, and other languages equally.
🟢 Container-based Deployment: Deploys Milvus as a Podman container for easy setup and automatically manages starting, stopping, and restarting the Milvus service.

📋 Requirements

Python 3.12+ (All Python packages are compatible with Python version 3.9 or higher, but the MCP package requires Python 3.12. Therefore, the minimum requirement is Python 3.12.)
Podman (for Milvus containers)
CUDA-compatible GPU (optional, for acceleration)
Obsidian vault with markdown files

📋 OpenWebUI Instructions

See README-OpenWebUI.md for instructions on how to connect to OpenWebUI.

🔧 Installation

Recommended Installation

Make sure git is installed
Download an installer from here, and run it
```
Obsidian_Milvus_Installer_AMD64.exe
```
🚨 GPU not detected issue even though my GPU exists and supports CUDA

If you do not have a CUDA-capable GPU, you may skip this step.
PyTorch ships in two builds: one for CPU and one for GPU (CUDA). Make sure you have the correct build installed.

If you aren't sure, follow the instructions below:

Install CUDA Toolkit

Run the following commands:

pip uninstall torch torchvision torchaudio -y
pip install torch==2.7.0+cu118 torchvision==0.22.0+cu118 torchaudio==2.7.0+cu118 --index-url https://download.pytorch.org/whl/cu118

Manual Installation

Make sure git is installed

Clone the repository

cd to the parent folder of your desired directory.
For example, if your desired location is `G:\JJ Dropbox\J J\PythonWorks\milvus\obsidian-milvus-FastMCP`,
then run cd "G:\JJ Dropbox\J J\PythonWorks\milvus"
git clone https://github.com/jayjeo/obsidian-milvus-FastMCP
cd obsidian-milvus-fastmcp

Install dependencies
- Do the following:
```
cd to your desired directory
pip uninstall -y numpy sentence-transformers
pip install "numpy>2.0.0"
pip install --no-deps transformers==4.52.3
pip install sentence-transformers==4.1.0 tqdm filelock fsspec
pip install PyPDF2 markdown beautifulsoup4 python-dotenv watchdog psutil colorama pyyaml tqdm requests pymilvus mcp fastmcp torch nvidia-ml-py 
```
- Sentence Transformers library (sentence-transformers): The v3.1.1 release (September 2024) removed the numpy<2 constraint that had been set to prevent conflicts in Windows environments. Versions 3.1.1 and later therefore officially support NumPy 2.x, so you can freely choose between NumPy 1.x and 2.x.
- As a result, the paraphrase-multilingual-mpnet-base-v2 model provided through the Sentence Transformers framework operates normally in NumPy 2.x environments. The model is built on Hugging Face Transformers and PyTorch, and the combination of a recent Sentence Transformers version and PyTorch 2.5.1 has resolved the compatibility issues with NumPy 2.x.
- Because Sentence Transformers officially supports NumPy 2.x and PyTorch 2.5.1 was also built to accommodate it, embedding extraction and other inference operations proceed without additional errors. No differences based on NumPy version have been reported in either CPU or CUDA (GPU) environments. NumPy 2.x itself only affects CPU operations, so CUDA usage is irrelevant to compatibility.
- One caveat: the official requirements of the Hugging Face Transformers package still point to NumPy 1.x, which may cause warnings or conflicts during pip installation. For example, installing transformers via pip while NumPy 2.x is already installed may produce dependency conflict warnings. This is only an installation constraint; it does not mean that the paraphrase-multilingual-mpnet-base-v2 model malfunctions at runtime under NumPy 2.x. The model operates normally with the latest compatible library combinations, and there are no reports of embedding quality or accuracy changing with the NumPy version.
- These warnings can be bypassed. If you run pip install transformers while keeping NumPy 2.x, warnings will appear, but the --no-deps option skips the dependency checks. This is why the commands above install transformers with --no-deps.
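
After installing, it can help to confirm which versions actually ended up in the environment. The sketch below is a small, assumed helper (not part of the repository) that reads installed versions through the standard library's `importlib.metadata` and reports whether NumPy 2.x is active.

```python
from importlib import metadata

# Packages discussed in the notes above.
PACKAGES = ("numpy", "transformers", "sentence-transformers", "torch")

def major_version(version: str) -> int:
    """Extract the major component from a version string such as '2.1.3' or '2.7.0+cu118'."""
    return int(version.split(".")[0])

def report_versions(packages=PACKAGES) -> dict:
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    found = report_versions()
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
    numpy_version = found["numpy"]
    if numpy_version and major_version(numpy_version) >= 2:
        print("NumPy 2.x is active.")
```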
Install Podman
- From CMD: winget install RedHat.Podman
- Open PowerShell as Administrator and run (Enable Virtual Machine)
```
dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart
```
- Open PowerShell as Administrator and run (Linux kernel update package)
```
wsl.exe --install
```
- Set WSL 2 as your default version
```
wsl --set-default-version 2
```
- Install your Linux distribution of choice
  - Ubuntu 18.04 LTS
  - Ubuntu 20.04 LTS
  - Ubuntu 22.04 LTS
  - and so on
- Restart after setting up WSL Linux system
- From CMD in your desired directory, run pip install podman-compose
Configure paths
- cd to your desired directory
- Edit .env and set your Obsidian vault path
- Edit .env and set your podman path
  - Find podman path using find_podman_path.bat
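
Before continuing, you may want to sanity-check the values you entered. The sketch below parses simple KEY=VALUE lines and lists keys whose paths are empty or do not exist. The key names `OBSIDIAN_VAULT_PATH` and `PODMAN_PATH` are hypothetical placeholders used only for illustration; use the names that actually appear in the project's .env.example.

```python
from pathlib import Path

def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def missing_paths(values: dict, keys) -> list:
    """Return the keys that are unset, empty, or point to a path that does not exist."""
    return [k for k in keys if not values.get(k) or not Path(values[k]).exists()]

if __name__ == "__main__":
    env_file = Path(".env")
    settings = parse_env_text(env_file.read_text(encoding="utf-8")) if env_file.exists() else {}
    # Hypothetical key names; replace with the ones used in .env.example.
    problems = missing_paths(settings, ["OBSIDIAN_VAULT_PATH", "PODMAN_PATH"])
    print("Missing or invalid:", problems or "none")
```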
Initialize Podman Container
```
complete-podman-reset.bat
```
Initialize Milvus Server
```
start_mcp_with_encoding_fix.bat
```
Podman auto launch at startup
- Follow instructions in Podman auto launch.md
- When Windows starts, nothing will pop up unless there is an error
  - If you want to figure out the error, see podman_startup.log
Milvus Server auto launch at startup

Follow instructions in Milvus auto launch.md
When Windows starts, nothing will pop up unless there is an error
- If you want to figure out the error, see auto_startup_mcp.log and vbs_startup.log
- Launching the server takes approximately 5 to 7 minutes, depending on your PC's performance. Wait for it to finish before you start Claude Desktop
You can also start the server manually, although this is more cumbersome
- Run run-main.bat and select option 1
- Keep this CMD window open. Otherwise, the server will be terminated

Interactive Setup & Testing

run-setup.bat

(1) Package Installation - Installs required dependencies
(2) Milvus Setup - Automatically deploys Milvus using Podman with persistent data storage
(3) Collection Testing - Verifies database operations
(4) MCP Server Validation - Checks server files
(5) Claude Desktop Integration - Configures Claude Desktop automatically
(8) Safe Server Restart (Preserve All Data. Use this if MCP server has launching issues)
(9) Emergency: Complete Data Reset (DELETE All Data)

🚨 GPU not detected issue even though my GPU exists and supports CUDA

If you do not have a CUDA-capable GPU, you may skip this step.
PyTorch ships in two builds: one for CPU and one for GPU (CUDA). Make sure you have the correct build installed.

If you aren't sure, follow the instructions below:

Install CUDA Toolkit

Run the following commands:

pip uninstall torch torchvision torchaudio -y
pip install torch==2.7.0+cu118 torchvision==0.22.0+cu118 torchaudio==2.7.0+cu118 --index-url https://download.pytorch.org/whl/cu118

🎮 Daily Use

Once setup is complete:

Wait for the server to start: You should have followed the instructions above to start the server automatically when Windows starts. Startup takes approximately 3 to 7 minutes, depending on your PC's performance. Wait for it to finish before you do anything below
Start Embedding (Indexing) your Obsidian vault: Run run-main.bat and select option 2 or 3
Open Claude Desktop: Your Obsidian vault is now searchable
Start Searching: Use natural language queries in Claude Desktop
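
Instead of guessing when the server is ready, you could poll its port. The sketch below is an assumed helper, not something shipped with the project; it retries a TCP connection until it succeeds or a timeout passes. Milvus listens on port 19530 by default, but whether this project's container configuration keeps that port is an assumption you should verify.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout_s: float = 600.0, interval_s: float = 5.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False if timeout_s elapses first."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows the port is open, not that Milvus is fully loaded.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For example, `wait_for_port("localhost", 19530)` would block for up to ten minutes and return `True` as soon as the port accepts connections.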

Troubleshooting

MCP Server disconnection issue

If you configured autostart through Windows Task Scheduler, the Milvus server should establish a connection to FastMCP (or Claude Desktop) within 4 to 6 minutes after your PC starts, depending on your computer's performance.
If the connection still does not work, close Claude Desktop and restart it
- Clicking [X] (top-right button) in Claude does not exit Claude
- You have to click [≡] (top-left button) → File → Exit
If the issue persists, try the following:
- The Claude connection requires the three components below
  - 1. Podman running
  - 2. Milvus server running
  - 3. MCP server running
- You can check the running status of all components with check_all_status.bat
- You can check the current Milvus data storage with check_milvus_data.bat
- Try re-initializing these in reverse order (3→2→1)
  - To restart the MCP server, run start-mcp-server-manual.bat
  - To restart the Milvus server, run smart-start-milvus.bat
  - To restart the Podman server, run complete-podman-reset.bat
    - at CMD, podman machine reset
    - at CMD, podman machine init
    - at CMD, podman machine start

If you want to check logs, open the following:

"your_desired_directory\logs" (this folder contains many log files)

Backup & Restore Milvus Data (Embedding Data)

🟢 Backup : Run backup-all-data.bat
🟢 Restore : Run restore-backup.bat

Available MCP Search Tools

🟢 comprehensive_search : Comprehensive search across entire collection (no limit restrictions)

Role: Searches all documents in batches to find all query-related results
Suitable situations:
- When you need to find all relevant documents on a specific topic
- When completeness of search results is critical
- When comprehensive analysis is needed on large datasets

🟢 multi_query_fusion : Advanced search leveraging Milvus's powerful metadata filtering

Role: Search with various filters including time range, tags, file size, content quality
Suitable situations:
- When searching for documents from specific time periods
- When filtering by specific tags or file types is needed
- When precise search with complex condition combinations is required

🟢 milvus_power_search : Power search utilizing all Milvus optimization features

Role: High-performance search using GPU acceleration, HNSW optimization, adaptive parameters
Suitable situations:
- When fast response time is critical
- When large-scale search with GPU acceleration is needed
- When you want to adjust search modes (fast/balanced/precise) based on situation

🟢 get_document_content : Retrieve complete content of specific document

Role: Query full document content and metadata via file path
Suitable situations:
- When you want to read the full content of a specific document
- When checking document metadata (tags, creation date, etc.)
- When analyzing document chunk structure

🟢 get_similar_documents : Find documents similar to specific document

Role: Search for other documents semantically similar to specified document
Suitable situations:
- When finding other documents related to current document
- When expanding search for related materials on specific topic
- When finding duplicate or similar content documents

🟢 knowledge_graph_builder : Build knowledge graph based on Milvus vector similarity

Role: Construct network of related documents from starting document
Suitable situations:
- When visualizing relationships between documents
- When exploring knowledge network of specific topic
- When identifying clusters of related documents

🟢 performance_analysis : Milvus performance optimization analysis and recommendations

Role: Analyze current system performance metrics and suggest optimizations
Suitable situations:
- When improving search performance
- When understanding system resource usage
- When adjusting optimization settings

Use Project feature in Claude Desktop for better search results

For example, I use a project called "obsidian" to search my Obsidian vault.
My project instructions are as follows.
- Use only information from the obsidian assistant MCP
- For general searches, always use the auto_search_mode_decision function
- Tell me how many md or pdf files were searched and provide the list
- Summarize the content comprehensively
- Only save md files to obsidian when specifically requested
- If saving, the location should be G:\jayjeo
- When saving, create the note list in a clickable format for obsidian
- Create a list of referenced md and pdf files in clickable obsidian format at the top of the note

Use Filesystem MCP to save markdown notes to Obsidian

This MCP is amazingly helpful for saving and reading any files in your system.
This includes saving markdown notes to your Obsidian vault
https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem

🔴 EMERGENCY RESET

If you encounter container conflicts or system issues, you can use the emergency reset script:

complete-reset.bat  # Windows

🔴 CRITICAL WARNING: This script will:

Kill ALL Podman containers (not just Milvus)
Remove ALL Podman containers, pods, volumes, and networks
Delete local MilvusData and volumes directories
Permanently destroy all container data system-wide

🔴 Use ONLY if

You have container name conflicts
Milvus services fail to start properly
You need a complete clean state
You don't have other important Podman containers running

Before running: Make sure you don't have other Podman projects running that you want to preserve. This reset affects the entire Podman system, not just this project.

🎯 Claude Desktop Integration

The Claude Desktop configuration for this program is as follows.
It should already have been set automatically if you ran run-setup.bat (option 5).

{
  "mcpServers": {
    "obsidian-assistant": {
      "command": "python",
      "args": ["path/to/mcp_server.py"],
      "env": {}
    }
  }
}
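
If you ever need to add this entry by hand without overwriting other MCP servers, a small merge script can help. The sketch below is an illustrative helper and not the logic used by run-setup.bat; the function names are assumptions. On Windows the configuration file is typically found at `%APPDATA%\Claude\claude_desktop_config.json`, which you should confirm on your own machine.

```python
import json
from pathlib import Path

def merge_mcp_server(config: dict, name: str, script_path: str, python_cmd: str = "python") -> dict:
    """Return a copy of config with one MCP server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))   # copy so the input is not mutated
    servers[name] = {"command": python_cmd, "args": [script_path], "env": {}}
    merged["mcpServers"] = servers
    return merged

def update_config_file(config_path: Path, name: str, script_path: str) -> None:
    """Read the JSON config if it exists, merge the entry, and write it back."""
    existing = {}
    if config_path.exists():
        existing = json.loads(config_path.read_text(encoding="utf-8"))
    updated = merge_mcp_server(existing, name, script_path)
    config_path.write_text(json.dumps(updated, indent=2), encoding="utf-8")

# Example with an in-memory config; "path/to/mcp_server.py" is the placeholder from above.
example = merge_mcp_server({"mcpServers": {}}, "obsidian-assistant", "path/to/mcp_server.py")
print(json.dumps(example, indent=2))
```

Restart Claude Desktop completely after changing the file so that the new entry is loaded.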

Obsidian-Milvus-FastMCP Project Structure and Module Descriptions

Project Dependency Tree

📁 obsidian-milvus-FastMCP/
│
├── 🎯 main.py (Core Entry Point #1)
│   ├── logger.py ← Centralized logging system
│   ├── config.py ← Configuration management
│   ├── milvus_manager.py ← Milvus database operations
│   │   ├── logger.py (reused)
│   │   ├── config.py (reused)
│   │   └── embeddings.py ← AI embedding model management
│   │       ├── warning_suppressor.py ← Warning suppression utility
│   │       └── config.py (reused)
│   ├── obsidian_processor.py ← Obsidian note processing
│   │   ├── logger.py (reused)
│   │   ├── config.py (reused)
│   │   ├── embeddings.py (reused)
│   │   └── progress_monitor_cmd.py ← Progress monitoring for CLI
│   ├── watcher.py ← File system monitoring
│   │   ├── logger.py (reused)
│   │   └── config.py (reused)
│   └── robust_incremental_embedding.py ← Incremental embedding processing
│       ├── logger.py (reused)
│       └── config.py (reused)
│
├── 🔧 setup.py (Core Entry Point #2 - Testing & Configuration)
│   └── config.py (reused)
│
├── 📦 installer/installer_ui.py (Core Entry Point #3 - Windows Installer)
│   └── (PyQt5-based GUI installer)
│
├── 🌐 mcp_server.py ← MCP Server for Claude Desktop integration
│   ├── mcp_server_helpers.py ← Helper functions for MCP
│   ├── config.py (reused)
│   ├── milvus_manager.py (reused)
│   ├── obsidian_processor.py (reused)
│   ├── enhanced_search_engine.py ← Advanced search functionality
│   │   ├── embeddings.py (reused)
│   │   └── config.py (reused)
│   └── search_engine.py ← Basic search functionality
│       ├── embeddings.py (reused)
│       └── config.py (reused)
│
├── 🚀 Startup Scripts
│   ├── run-main.py ← Python wrapper for main.py
│   ├── run-main.bat ← Windows batch launcher
│   ├── auto_start_mcp_server.vbs ← Windows auto-startup script
│   ├── start-milvus.bat ← Milvus container startup
│   └── Various other .bat/.vbs startup utilities
│
└── 📄 Configuration Files
    ├── .env / .env.example ← Environment variables
    ├── requirements.txt ← Python dependencies
    ├── milvus-podman-compose.yml ← Podman container configuration
    └── milvus-docker-compose.yml ← Docker container configuration

Module Descriptions

Core Entry Points

main.py - Primary Application Entry

Purpose: Main command-line interface for the Obsidian-Milvus integration
Key Functions:

Initializes the entire system (Milvus connection, Obsidian processor)
Provides interactive menu for operations:

Start MCP Server for Claude Desktop
Full embedding (reindex all files)
Incremental embedding with cleanup
Cleanup deleted files


Manages embedding progress and monitoring
Handles system resource management

setup.py - Interactive Test & Configuration Tool

Purpose: System setup, testing, and troubleshooting
Key Functions:

Tests Milvus connection and operations
Manages Podman/Docker containers
Configures Claude Desktop integration
Provides safe server restart functionality
Handles auto-startup configuration

installer/installer_ui.py - Windows GUI Installer

Purpose: Automated installation wizard for Windows users
Key Functions:

PyQt5-based graphical interface
Clones repository from GitHub
Installs Python dependencies
Sets up Podman and WSL
Configures system for first use

Core Modules

config.py - Configuration Management

Purpose: Central configuration hub for all settings
Key Functions:

Manages paths (Obsidian vault, storage, Podman)
Sets embedding model parameters
Configures batch sizes and performance limits
Handles GPU/CPU settings
Auto-detects system paths

logger.py - Centralized Logging System

Purpose: Unified logging across all modules
Key Functions:

Module-specific loggers with consistent formatting
File and console output
Log rotation and size management
Error tracking and debugging support

milvus_manager.py - Milvus Database Interface

Purpose: Manages all interactions with Milvus vector database
Key Functions:

Connection management and health checking
Collection creation and management
Vector insertion and deletion
Search operations (with GPU optimization)
Container lifecycle management
Batch operations with intelligent sizing

obsidian_processor.py - Document Processing Engine

Purpose: Processes Obsidian notes for embedding
Key Functions:

Extracts text from Markdown and PDF files
Chunks documents intelligently
Manages embedding generation with batch optimization
Tracks processing progress
Handles incremental updates
Cleans up deleted files
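
To illustrate the chunking step, here is a minimal fixed-size chunker with overlap. This is a simplified stand-in: the module's actual strategy is described only as "intelligent" chunking, so the character-window approach, the default sizes, and the name `chunk_text` are assumptions made for this example.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    """Split text into windows of chunk_size characters that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break                      # the last window already reached the end of the text
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

The overlap keeps sentences that straddle a boundary present in both neighboring chunks, which usually helps retrieval at the cost of some duplicated storage.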

embeddings.py - AI Embedding Model Management

Purpose: Manages sentence transformer models for text embeddings
Key Features:

Hardware profiling and optimization
Dynamic batch size optimization
GPU/CPU automatic selection
Memory management and caching
Support for multiple embedding models
Performance monitoring

watcher.py - File System Monitor

Purpose: Monitors Obsidian vault for changes
Key Functions:

Real-time file change detection
Triggers incremental processing
Handles file creation, modification, deletion
Efficient directory watching

mcp_server.py - MCP Server for Claude Desktop

Purpose: Provides search interface for Claude Desktop
Key Functions:

FastMCP-based server implementation
Multiple search tools exposed to Claude
Document retrieval and content access
Advanced search with metadata filtering

Helper Modules

enhanced_search_engine.py - Advanced Search Features

Purpose: Provides sophisticated search capabilities
Key Functions:

Intelligent search with context expansion
Multi-query fusion search
Knowledge graph exploration
Performance optimization analysis

search_engine.py - Basic Search Implementation

Purpose: Core search functionality
Key Functions:

Vector similarity search
Metadata filtering
Result ranking and formatting

progress_monitor_cmd.py - CLI Progress Display

Purpose: Real-time progress monitoring in terminal
Key Functions:

Live progress bars and statistics
System resource monitoring
Error logging display
ETA calculations

robust_incremental_embedding.py - Incremental Processing

Purpose: Handles incremental updates efficiently
Key Functions:

Detects changed files
Processes only updates
Maintains consistency
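
One common way to detect changed files is to compare content hashes against a previously stored snapshot. The sketch below shows that approach under stated assumptions: the function names, the SHA-256 choice, and the `.md`/`.pdf` filter are illustrative, and the module's real detection method may differ (for example, it could rely on modification times).

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def snapshot(vault: Path, suffixes=(".md", ".pdf")) -> dict:
    """Map each matching file's vault-relative path to its content digest."""
    return {
        p.relative_to(vault).as_posix(): file_digest(p)
        for p in sorted(vault.rglob("*"))
        if p.is_file() and p.suffix.lower() in suffixes
    }

def diff_snapshots(old: dict, new: dict) -> dict:
    """Classify paths as added, modified, or deleted between two snapshots."""
    added = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    modified = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return {"added": added, "modified": modified, "deleted": deleted}
```

Only the files listed under "added" and "modified" would need re-embedding, and "deleted" entries would be removed from the collection.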

Utility Scripts

warning_suppressor.py - Warning Management

Purpose: Suppresses unnecessary warnings from libraries
Key Functions:

Filters TensorFlow/PyTorch warnings
Cleans console output

Data Flow

Initialization: main.py → config.py → milvus_manager.py → obsidian_processor.py
File Processing: watcher.py detects changes → obsidian_processor.py → embeddings.py → milvus_manager.py
Search Operations: mcp_server.py → enhanced_search_engine.py → milvus_manager.py
Progress Monitoring: All operations → progress_monitor_cmd.py → Terminal display

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

🔵 Milvus: High-performance vector database with advanced indexing
🔵 Claude: Advanced AI assistant with MCP protocol support
🔵 FastMCP: Efficient MCP implementation framework
🔵 Sentence Transformers: State-of-the-art multilingual embedding models
🔵 HNSW Algorithm: Hierarchical Navigable Small World graphs for fast similarity search

🟢 Author: Jay Jeong, Ph.D. in Economics, Research Fellow at KCTDI, acubens555@gmail.com
