Mistral OCR MCP Server
A local OCR server using Mistral OCR, compliant with MCP principles. This server allows you to perform OCR on uploaded images and interact with it via a Model Context Protocol (MCP) interface.
Features
- FastAPI Backend: Built with the modern, fast (high-performance) FastAPI framework.
- OCR Processing: Extracts text from uploaded images using Mistral OCR.
- Standard OCR Endpoint: Provides a regular HTTP endpoint (/v1/ocr) for direct file uploads and OCR.
- Configuration Management: Uses Pydantic settings for easy configuration.
- Logging: Integrated logging for monitoring and debugging.
- Health Check: A simple /health endpoint to verify server status.
Project Structure
.
├── .gitignore
├── main.py # FastAPI application and MCP server setup
├── pyproject.toml # Project metadata and dependencies
├── README.md # This file
├── ocr/
│   ├── __init__.py
│   ├── config.py # Configuration settings
│   ├── router.py # Handles the OCR model routing and processing logic
│   ├── schemas.py # Pydantic models for API requests and responses
│   └── adapters/ # (If you have model-specific adapter logic)
│       └── __init__.py
└── tests/ # (If you have tests)
    └── ...
Prerequisites
- Python 3.10 or higher
- An environment with pip or uv for package management.
Installation
- Clone the repository (if applicable):
  # git clone <your-repository-url>
  # cd mistral-ocr-mcp-server
- Create and activate a virtual environment (recommended):
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies: The project uses pyproject.toml to manage dependencies. You can install them using pip:
  pip install .
  For development, including tools for testing, linting, and formatting, install the dev extras (the quotes keep shells such as zsh from expanding the brackets):
  pip install ".[dev]"
Configuration
Configuration for the application (like supported file types, model settings, etc.) is managed in ocr/config.py using Pydantic settings. You can override these settings using environment variables. Refer to ocr/config.py for available settings and their corresponding environment variable names.
Default supported file types usually include common image formats (e.g., image/jpeg, image/png).
A default maximum upload file size is also applied (e.g., 1MB, set in the lifespan function in main.py).
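The override behavior described above can be sketched with the standard library alone. This is an illustration of the pattern, not the project's actual code: the class name SketchSettings, the field names, and the environment variable names SUPPORTED_FILE_TYPES and MAX_UPLOAD_BYTES are all assumptions, and the real names should be taken from ocr/config.py. The JSON encoding of the list value mirrors how Pydantic settings reads complex types from the environment.

```python
import json
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical stand-in for the real settings class in ocr/config.py."""

    supported_file_types: tuple = ("image/jpeg", "image/png")
    max_upload_bytes: int = 1024 * 1024  # assumed reading of "1MB"


def load_settings(env=None) -> SketchSettings:
    """Start from the defaults, then let environment variables override them."""
    env = os.environ if env is None else env
    defaults = SketchSettings()
    raw_types = env.get("SUPPORTED_FILE_TYPES")  # assumed variable name
    raw_max = env.get("MAX_UPLOAD_BYTES")  # assumed variable name
    return SketchSettings(
        supported_file_types=(
            tuple(json.loads(raw_types)) if raw_types else defaults.supported_file_types
        ),
        max_upload_bytes=int(raw_max) if raw_max else defaults.max_upload_bytes,
    )


def is_upload_allowed(content_type: str, size_bytes: int, settings: SketchSettings) -> bool:
    """Accept an upload only if its type is supported and it fits the size limit."""
    return (
        content_type in settings.supported_file_types
        and size_bytes <= settings.max_upload_bytes
    )
```

Passing a plain dict as env (instead of relying on os.environ) makes the override logic easy to exercise in tests without touching the real environment.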
Running the Server
You can run the FastAPI server using Uvicorn:
uvicorn main:app --reload
This will typically start the server on http://127.0.0.1:8000. The --reload flag enables auto-reloading when code changes are detected, which is useful for development.
API Endpoints
1. OCR Endpoint
- POST /v1/ocr
  - Description: Upload an image file to perform OCR.
  - Request Body: multipart/form-data with a file field containing the image.
  - Content-Type Header for file: Must be one of the settings.supported_file_types.
  - Response: OCRResult (JSON object containing the extracted text and other relevant information as defined in ocr.schemas.OCRResult).
  - Example using cURL:
curl -X POST -F "file=@/path/to/your/image.png" http://127.0.0.1:8000/v1/ocr
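The same upload can be made from Python. The following is a minimal sketch using only the standard library; the helper names build_multipart and ocr_request are hypothetical, and the base URL assumes the default Uvicorn address used elsewhere in this README. The per-file Content-Type is guessed from the file extension, so it should match one of the supported file types.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field_name: str, filename: str, payload: bytes, content_type: str):
    """Encode a single file as a multipart/form-data body.

    Returns the body bytes and the Content-Type header value for the request.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def ocr_request(image_path: str, base_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build (but do not send) a POST request for the /v1/ocr endpoint."""
    path = Path(image_path)
    # Guess the per-file content type from the extension, e.g. image/png.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body, header = build_multipart("file", path.name, path.read_bytes(), content_type)
    return urllib.request.Request(
        f"{base_url}/v1/ocr",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
```

With the server running, the request can be sent with urllib.request.urlopen(ocr_request("/path/to/your/image.png")) and the JSON response decoded with json.load.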
2. MCP Endpoint
- BASE_URL/mcp
  - Description: Provides MCP-compliant tools for interacting with the OCR service. The available tools and resources can be discovered by MCP clients.
  - This endpoint is automatically managed by fastapi-mcp.
3. Health Check
- GET /health
  - Description: A simple endpoint to check if the server is running and healthy.
  - Response:
{ "status": "ok" }
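A script or deployment probe could poll this endpoint before sending OCR requests. Below is a small sketch with the standard library; the function names parse_health and check_health are hypothetical, and the base URL and timeout are assumed defaults.

```python
import json
import urllib.error
import urllib.request


def parse_health(body: bytes) -> bool:
    """Return True only if the body is a JSON object with status "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return False


def check_health(base_url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Call GET /health and report whether the server answered as healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and parse_health(resp.read())
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-success HTTP status.
        return False
```

Keeping the parsing separate from the network call lets the response check be tested without a running server.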
Development
Running Type Checker
To check types with Mypy:
mypy .
(Ensure Mypy is configured in pyproject.toml to scan the correct paths, or pass the paths explicitly, e.g., mypy ocr main.py)
Running Tests
If you have tests in a tests/ directory, you can run them using Pytest:
pytest
Coverage reports can also be generated if configured (see tool.pytest.ini_options in pyproject.toml).
Contributing
Contributions are welcome! Please feel free to submit a Pull Request or open an issue.