Oboyu (覚ゆ)
Lightning-fast semantic search for your local documents with best-in-class Japanese support.

What is Oboyu?
Oboyu (覚ゆ - "to remember" in ancient Japanese) is a powerful local semantic search engine that helps you instantly find information in your documents using natural language queries. Unlike traditional keyword search, Oboyu understands the meaning behind your questions, making it perfect for finding relevant content even when you don't know the exact terms.
Why Oboyu?
- 🚀 Fast: Indexes thousands of documents in seconds, searches in milliseconds
- 🎯 Accurate: Semantic search finds what you mean, not just what you type
- 🇯🇵 Japanese Excellence: First-class support with automatic encoding detection
- 🔒 Private: Everything runs locally - your documents never leave your machine
- 🤖 AI-Ready: Built-in MCP server for Claude, Cursor, and other AI assistants
Quick Start
Get up and running in under 5 minutes:
# Install Oboyu
pip install oboyu
# Index your documents
oboyu index ~/Documents
# Search interactively
oboyu query --interactive
That's it! See our Quick Start Guide for more examples.
Key Features
🔍 Advanced Search Capabilities
- Hybrid Search: Combines semantic understanding with keyword matching for best results
- Multiple Modes: Switch between semantic, keyword, or hybrid search modes
- Smart Reranking: Built-in AI reranker improves result accuracy
- Interactive Mode: Real-time search with command history and auto-suggestions
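To give a feel for what "hybrid" means in practice, here is a minimal sketch of one common way to merge a semantic ranking with a keyword ranking: reciprocal rank fusion (RRF). This README does not say which fusion method Oboyu uses internally, so treat this as an illustration of the general idea only. The function name, the constant `k=60`, and the sample file names are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Illustrative only: not taken from Oboyu's source code.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list receive a larger share.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two separate searches over the same index.
semantic_hits = ["notes.md", "design.md", "todo.txt"]
keyword_hits = ["design.md", "api.md", "notes.md"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the behaviour hybrid search is meant to provide.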
📚 Document Support
- Wide Format Support: Plain text, Markdown, code files, PDFs, Jupyter notebooks, and more
- Incremental Indexing: Only process new or changed files for lightning-fast updates
- Smart Chunking: Intelligent document splitting for optimal search results
- Automatic Encoding: Handles various text encodings seamlessly
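Incremental indexing can be pictured as a change-detection pass before any embedding work is done. The sketch below compares a content hash for each file against the hash recorded on the previous run. It is an assumption-laden illustration: Oboyu may well use timestamps, a database table, or another mechanism, and `file_digest`, `files_to_reindex`, and the `known` dictionary are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return a SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_reindex(root: Path, known: dict[str, str]) -> list[Path]:
    """List files under root that are new or changed since the last run.

    `known` maps file paths to the digest seen last time and is updated
    in place. Hypothetical helper, not part of Oboyu's API.
    """
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = file_digest(path)
        if known.get(str(path)) != digest:
            changed.append(path)
            known[str(path)] = digest
    return changed
```

Calling `files_to_reindex` twice in a row on an unchanged directory returns an empty list the second time, so only edited or newly added files would be processed again.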
🇯🇵 Japanese Language Excellence
- Native Support: Purpose-built for Japanese text processing
- Automatic Detection: Detects and handles Shift-JIS, EUC-JP, and UTF-8
- Specialized Models: Optimized embedding models for Japanese content
- Mixed Language: Seamlessly handles Japanese and English in the same document
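As a rough illustration of encoding detection for the three encodings listed above, the sketch below tries strict decoding with each candidate in turn. This is a simple trial-decoding heuristic, not Oboyu's actual detector, which this README does not describe. The function name and the candidate order are assumptions. Short or unusual byte sequences can be valid in more than one encoding, so a real detector would likely add statistical checks.

```python
# Candidate encodings, strictest first. UTF-8 and EUC-JP reject most
# foreign byte sequences, so Shift-JIS is tried last.
CANDIDATES = ("utf-8", "euc_jp", "shift_jis")


def decode_japanese(data: bytes) -> tuple[str, str]:
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper for illustration; raises ValueError if none fit.
    """
    for encoding in CANDIDATES:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode with any candidate encoding")


sample = "日本語のテキスト"
for encoding in CANDIDATES:
    text, detected = decode_japanese(sample.encode(encoding))
    print(encoding, "->", detected, text == sample)
```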
🚀 Performance & Integration
- ONNX Acceleration: 2-4x faster with automatic model optimization
- MCP Server: Direct integration with Claude Desktop and AI coding assistants
- Rich CLI: Beautiful terminal interface with progress tracking
- Low Memory: Efficient processing even on modest hardware
Installation
Using UV (Recommended)
uv tool install oboyu
Using pip
pip install oboyu
From Source
git clone https://github.com/sonesuke/oboyu.git
cd oboyu
pip install -e .
System Requirements
- Python: 3.10 or higher
- OS: macOS, Linux (Windows via WSL)
- Memory: 2GB RAM minimum
- Storage: 1GB for models and index
Note: Models are automatically downloaded on first use (~90MB).
Usage Examples
Basic Usage
# Index a directory
oboyu index ~/Documents/notes
# Search your documents
oboyu query "machine learning optimization techniques"
# Interactive mode (recommended!)
oboyu query --interactive
Advanced Examples
# Index only specific file types
oboyu index ~/projects --include "*.md,*.txt"
# Search with filters
oboyu query "API design" --filter "docs/"
# Use semantic search mode
oboyu query "concepts similar to dependency injection" --mode semantic
# Enable reranking for better accuracy
oboyu query "complex technical topic" --rerank
MCP Server for AI Assistants
# Start MCP server
oboyu mcp
# Or configure in Claude Desktop's settings
See our MCP Integration Guide for detailed setup instructions.
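For orientation, Claude Desktop reads MCP servers from an `mcpServers` map in its JSON configuration file. The sketch below builds a minimal entry that launches the `oboyu mcp` command shown above and prints it as JSON. It assumes `oboyu` is on your `PATH` and that no extra arguments are needed. The server name `"oboyu"` and the helper function are illustrative choices, and the MCP Integration Guide remains the authoritative reference.

```python
import json


def claude_desktop_entry(name: str = "oboyu") -> dict:
    """Build a minimal Claude Desktop config fragment (illustrative only)."""
    return {
        "mcpServers": {
            name: {
                # Assumes the `oboyu` executable is discoverable on PATH.
                "command": "oboyu",
                "args": ["mcp"],
            }
        }
    }


# Print the fragment so it can be merged into an existing config by hand.
print(json.dumps(claude_desktop_entry(), indent=2))
```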
Documentation
🚀 Getting Started
- Quick Start Guide - Get up and running in 5 minutes
- Troubleshooting - Solutions to common issues
- CLI Reference - Complete command-line interface documentation
📖 User Guides
- Configuration - Customize Oboyu for your needs
- Japanese Support - Working with Japanese documents
- MCP Integration - Setup for AI assistants
- Reranker Guide - Improve search accuracy
🔧 Technical Documentation
- Architecture - System design and components
- Query Engine - How search works
- Indexer - Document processing details
- Crawler - File discovery system
Common Use Cases
📚 Academic Research
Index and search through research papers, notes, and references:
oboyu index ~/research --include "*.pdf,*.md,*.txt"
oboyu query "transformer architecture improvements"
💻 Code Documentation
Search through project documentation and code comments:
oboyu index ~/projects/myapp --include "*.md,*.py"
oboyu query "authentication implementation"
📝 Personal Knowledge Base
Organize and search your notes and documents:
oboyu index ~/Documents/notes
oboyu query "meeting notes from last week"
🌏 Multilingual Documents
Perfect for mixed Japanese and English content:
oboyu index ~/Documents/bilingual
oboyu query "プロジェクト管理 best practices"
Testing
Unit and Integration Tests
# Run fast tests (recommended for development)
uv run pytest -m "not slow"
# Run all tests with coverage
uv run pytest --cov=src
E2E Display Testing
Oboyu includes comprehensive E2E display testing using the Claude Code SDK:
# Run all E2E display tests
python tests/e2e/run_display_tests.py
# Run specific test category
python tests/e2e/run_display_tests.py --test search
See our E2E Display Testing Guide for details.
Contributing
We welcome contributions! See our Contributing Guidelines for details.
# Quick start for contributors
git clone https://github.com/YOUR_USERNAME/oboyu.git
cd oboyu
uv sync
uv run pytest -m "not slow"
Support
- 📋 GitHub Issues - Report bugs or request features
- 📖 Documentation - Comprehensive guides and references
- 💬 Discussions - Ask questions and share ideas
License
This project is licensed under the MIT License - see the LICENSE.md file for details.
Acknowledgments
- The name "Oboyu" (覚ゆ) comes from ancient Japanese, meaning "to remember"
- Built with ❤️ for the Japanese NLP community
- Inspired by the goal of making knowledge accessible across languages
Made with 🇯🇵 by sonesuke