Mistral OCR MCP Server

Created By

sathizz78 months ago

MCP OCR server

Content

A local OCR server built on Mistral OCR that follows the Model Context Protocol (MCP). It runs OCR on uploaded images and exposes the same functionality to MCP clients through an MCP interface.

Features

FastAPI Backend: Built on FastAPI, a high-performance Python web framework.
OCR Processing: Extracts text from uploaded images using Mistral OCR. (Supported languages, image formats, and accuracy are not yet documented here.)
Standard OCR Endpoint: Provides a regular HTTP endpoint (/v1/ocr) for direct file uploads and OCR.
Configuration Management: Uses Pydantic settings for easy configuration.
Logging: Integrated logging for monitoring and debugging.
Health Check: A simple /health endpoint to verify server status.

Project Structure

.
├── .gitignore
├── main.py                 # FastAPI application and MCP server setup
├── pyproject.toml          # Project metadata and dependencies
├── README.md               # This file
├── ocr/
│   ├── __init__.py
│   ├── config.py           # Configuration settings
│   ├── router.py           # Handles the OCR model routing and processing logic
│   ├── schemas.py          # Pydantic models for API requests and responses
│   └── adapters/           # (If you have model-specific adapter logic)
│       └── __init__.py
└── tests/                  # (If you have tests)
    └── ...

Prerequisites

Python 3.10 or higher
An environment with pip or uv for package management.
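
To confirm the interpreter meets the version requirement before installing, a small check like the one below can help. This is only a convenience sketch; the helper name python_is_supported is hypothetical and not part of the project.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def python_is_supported(version=None):
    """Return True if a (major, minor, ...) version tuple meets the minimum.

    Defaults to the running interpreter's version when none is given.
    """
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


# Example: report whether the current interpreter is new enough.
print("Python version OK" if python_is_supported() else "Python 3.10+ is required")
```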

Installation

Clone the repository (if applicable):

# git clone <your-repository-url>
# cd mistral-ocr-mcp-server

Create and activate a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies: The project uses pyproject.toml to manage dependencies. You can install them using pip:
```
pip install .
```
For development, including tools for testing, linting, and formatting, install the dev extras:
```
pip install ".[dev]"  # quotes keep shells such as zsh from expanding the brackets
```

Configuration

Configuration for the application (like supported file types, model settings, etc.) is managed in ocr/config.py using Pydantic settings. You can override these settings using environment variables. Refer to ocr/config.py for available settings and their corresponding environment variable names.
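
The override pattern can be illustrated with a minimal, standard-library-only sketch. It is not the project's ocr/config.py (which uses Pydantic settings); the class SketchSettings, the environment variable names OCR_SUPPORTED_FILE_TYPES and OCR_MAX_UPLOAD_BYTES, and the default values are all assumptions made for illustration. Check ocr/config.py for the real names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSettings:
    # Illustrative defaults only; the real defaults live in the project code.
    supported_file_types: tuple = ("image/jpeg", "image/png")
    max_upload_bytes: int = 1024 * 1024


def load_settings(env=None):
    """Build settings from defaults, letting environment variables override them.

    The variable names below are hypothetical stand-ins.
    """
    env = os.environ if env is None else env
    defaults = SketchSettings()

    types_raw = env.get("OCR_SUPPORTED_FILE_TYPES")  # comma-separated MIME types
    size_raw = env.get("OCR_MAX_UPLOAD_BYTES")       # integer number of bytes

    if types_raw:
        types = tuple(t.strip() for t in types_raw.split(",") if t.strip())
    else:
        types = defaults.supported_file_types
    size = int(size_raw) if size_raw else defaults.max_upload_bytes
    return SketchSettings(supported_file_types=types, max_upload_bytes=size)
```

With no variables set, load_settings() returns the defaults; setting a variable in the shell before starting the server changes only that one field.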

The default supported file types are common image MIME types (e.g., image/jpeg, image/png).

A default maximum upload size also applies (e.g., 1 MB, set in the lifespan function in main.py).
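
How the file type and size limits might be applied to an incoming upload can be sketched as follows. This is a guess at the general shape, not the project's actual validation code: the function name validate_upload, the default values, and the choice of status codes 415 and 413 are assumptions.

```python
def validate_upload(content_type, size_bytes,
                    supported_types=("image/jpeg", "image/png"),
                    max_bytes=1024 * 1024):
    """Return an HTTP-style (status_code, message) pair for an upload.

    415 = Unsupported Media Type, 413 = Content Too Large.
    The server's real status codes and messages may differ.
    """
    if content_type not in supported_types:
        return 415, f"Unsupported file type: {content_type}"
    if size_bytes > max_bytes:
        return 413, f"File too large: {size_bytes} bytes (limit {max_bytes})"
    return 200, "OK"


# Example: a 20 KB PNG passes, a PDF is rejected by type.
print(validate_upload("image/png", 20_000))
print(validate_upload("application/pdf", 20_000))
```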

Running the Server

You can run the FastAPI server using Uvicorn:

uvicorn main:app --reload

This will typically start the server on http://127.0.0.1:8000. The --reload flag enables auto-reloading when code changes are detected, which is useful for development.
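
Once the server is up, the /health endpoint can be probed from Python. The sketch below uses only the standard library and treats any HTTP 200 response as healthy; the response body format of /health is not documented here, so it is deliberately ignored. The helper names are hypothetical.

```python
import urllib.error
import urllib.request


def health_url(base_url):
    """Join the base URL and the /health path without doubling slashes."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url="http://127.0.0.1:8000", timeout=2.0):
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False
```

Calling is_healthy() in a short retry loop is one way to wait for the server to finish starting before sending OCR requests.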

API Endpoints

1. OCR Endpoint

POST /v1/ocr
- Description: Upload an image file to perform OCR.
- Request Body: multipart/form-data with a file field containing the image.
- Content-Type header for the file part: Must be one of the types listed in settings.supported_file_types.
- Response: OCRResult (JSON object containing the extracted text and other relevant information as defined in ocr.schemas.OCRResult).
- Example using cURL:
```
curl -X POST -F "file=@/path/to/your/image.png" http://127.0.0.1:8000/v1/ocr
```
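
The same request can be made from Python. The sketch below builds the multipart/form-data body by hand with the standard library so that it has no extra dependencies; the helper names build_multipart and post_ocr are hypothetical, and the response is returned as a plain dict because the exact fields of OCRResult are defined in ocr/schemas.py rather than here.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field, filename, content, content_type):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def post_ocr(path, url="http://127.0.0.1:8000/v1/ocr"):
    """Upload the image at `path` to the OCR endpoint and return the parsed JSON."""
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as fh:
        body, header = build_multipart(
            "file", os.path.basename(path), fh.read(), content_type
        )
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, post_ocr("/path/to/your/image.png") sends the same upload as the cURL command above.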

2. MCP Endpoint

BASE_URL/mcp (e.g., http://127.0.0.1:8000/mcp when running locally)
- Description: Provides MCP-compliant tools for interacting with the OCR service. The available tools and resources can be discovered by MCP clients.
- This endpoint is automatically managed by fastapi-mcp.
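
MCP messages are JSON-RPC 2.0, so tool discovery and invocation come down to requests like the ones sketched below. This only shows the message shapes; in practice an MCP client library handles the transport and the initialization handshake. The helper jsonrpc_request is hypothetical, and "<tool-name>" is a placeholder because the tool names this server exposes are not listed here.

```python
import json


def jsonrpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list", request_id=1)

# Invoke one tool by name; the name and arguments here are placeholders.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "<tool-name>", "arguments": {}},
    request_id=2,
)

print(json.dumps(list_tools))
print(json.dumps(call_tool))
```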