xCOMET MCP Server

Created by shuji-bonji, 5 months ago



⚠️ This is an unofficial community project, not affiliated with Unbabel.

Translation quality evaluation MCP Server powered by xCOMET (eXplainable COMET).

🎯 Overview

xCOMET MCP Server provides AI agents with the ability to evaluate machine translation quality. It integrates with the xCOMET model from Unbabel to provide:

Quality Scoring: Scores between 0 and 1 indicating translation quality (higher is better)
Error Detection: Identifies error spans with severity levels (minor/major/critical)
Batch Processing: Evaluate multiple translation pairs efficiently (optimized single model load)
GPU Support: Optional GPU acceleration for faster inference

graph LR
    A[AI Agent] --> B[Node.js MCP Server]
    B --> C[Python FastAPI Server]
    C --> D[xCOMET Model<br/>Persistent in Memory]
    D --> C
    C --> B
    B --> A

    style D fill:#9f9

🔧 Prerequisites

Python Environment

xCOMET requires Python with the following packages:

pip install "unbabel-comet>=2.2.0" fastapi uvicorn

Model Download

The first run will download the xCOMET model (~14GB for XL, ~42GB for XXL):

# Download the model ahead of time (this also confirms it is available)
python -c "from comet import download_model; download_model('Unbabel/XCOMET-XL')"

Node.js

Node.js >= 18.0.0
npm or yarn

📦 Installation

# Clone the repository
git clone https://github.com/shuji-bonji/xcomet-mcp-server.git
cd xcomet-mcp-server

# Install dependencies
npm install

# Build
npm run build

🚀 Usage

With Claude Desktop (npx)

Add to your Claude Desktop configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "xcomet": {
      "command": "npx",
      "args": ["-y", "xcomet-mcp-server"]
    }
  }
}

With Claude Code

claude mcp add xcomet -- npx -y xcomet-mcp-server

Local Installation

If you prefer a local installation:

npm install -g xcomet-mcp-server

Then configure:

{
  "mcpServers": {
    "xcomet": {
      "command": "xcomet-mcp-server"
    }
  }
}

HTTP Mode (Remote Access)

TRANSPORT=http PORT=3000 npm start

Then connect to http://localhost:3000/mcp

🛠️ Available Tools

`xcomet_evaluate`

Evaluate translation quality for a single source-translation pair.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `source` | string | ✅ | Original source text |
| `translation` | string | ✅ | Translated text to evaluate |
| `reference` | string | ❌ | Reference translation |
| `source_lang` | string | ❌ | Source language code (ISO 639-1) |
| `target_lang` | string | ❌ | Target language code (ISO 639-1) |
| `response_format` | "json" \| "markdown" | ❌ | Output format (default: "json") |
| `use_gpu` | boolean | ❌ | Use GPU for inference (default: false) |

Example:

{
  "source": "The quick brown fox jumps over the lazy dog.",
  "translation": "素早い茶色のキツネが怠惰な犬を飛び越える。",
  "source_lang": "en",
  "target_lang": "ja",
  "use_gpu": true
}

Response:

{
  "score": 0.847,
  "errors": [],
  "summary": "Good quality (score: 0.847) with 0 error(s) detected."
}
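
As a rough illustration of the parameter table above, the following Python sketch assembles an argument object for `xcomet_evaluate`, leaving out optional fields the caller did not set. The helper name `build_evaluate_args` is hypothetical and not part of the server; only the field names and defaults are taken from the table.

```python
import json
from typing import Any, Dict, Optional


def build_evaluate_args(
    source: str,
    translation: str,
    reference: Optional[str] = None,
    source_lang: Optional[str] = None,
    target_lang: Optional[str] = None,
    response_format: str = "json",
    use_gpu: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: build the argument object for xcomet_evaluate."""
    if not source or not translation:
        raise ValueError("source and translation are required")
    if response_format not in ("json", "markdown"):
        raise ValueError("response_format must be 'json' or 'markdown'")

    args: Dict[str, Any] = {
        "source": source,
        "translation": translation,
        "response_format": response_format,
        "use_gpu": use_gpu,
    }
    # Optional fields are only included when the caller supplied them.
    optional = {
        "reference": reference,
        "source_lang": source_lang,
        "target_lang": target_lang,
    }
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


if __name__ == "__main__":
    example = build_evaluate_args("Hello", "こんにちは", source_lang="en", target_lang="ja")
    print(json.dumps(example, ensure_ascii=False, indent=2))
```

How the resulting object is sent depends on the MCP client in use; the sketch only covers building and validating the arguments.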

`xcomet_detect_errors`

Focus on detecting and categorizing translation errors.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `source` | string | ✅ | Original source text |
| `translation` | string | ✅ | Translated text to analyze |
| `reference` | string | ❌ | Reference translation |
| `min_severity` | "minor" \| "major" \| "critical" | ❌ | Minimum severity (default: "minor") |
| `response_format` | "json" \| "markdown" | ❌ | Output format |
| `use_gpu` | boolean | ❌ | Use GPU for inference (default: false) |
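
To make the effect of `min_severity` concrete, here is a minimal Python sketch of severity filtering. The shape of an error object (a dict with a `severity` key and a `text` key) is an assumption for illustration, since this README does not show a non-empty `errors` array; the three severity names come from the table above.

```python
from typing import Dict, List

# Severity levels named in this README, ordered from least to most severe.
SEVERITY_ORDER = {"minor": 0, "major": 1, "critical": 2}


def filter_by_min_severity(errors: List[Dict[str, str]], min_severity: str = "minor") -> List[Dict[str, str]]:
    """Keep only errors at or above `min_severity`.

    Assumes each error is a dict with a "severity" key (hypothetical shape).
    """
    if min_severity not in SEVERITY_ORDER:
        raise ValueError(f"unknown severity: {min_severity!r}")
    threshold = SEVERITY_ORDER[min_severity]
    return [error for error in errors if SEVERITY_ORDER[error["severity"]] >= threshold]


if __name__ == "__main__":
    sample = [
        {"text": "span A", "severity": "minor"},
        {"text": "span B", "severity": "major"},
        {"text": "span C", "severity": "critical"},
    ]
    print(filter_by_min_severity(sample, "major"))
```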

`xcomet_batch_evaluate`

Evaluate multiple translation pairs in a single request.

Performance Note: With the persistent server architecture (v0.3.0+), the model stays loaded in memory. Batch evaluation processes all pairs efficiently without reloading the model.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `pairs` | array | ✅ | Array of {source, translation, reference?} (max 500) |
| `source_lang` | string | ❌ | Source language code |
| `target_lang` | string | ❌ | Target language code |
| `response_format` | "json" \| "markdown" | ❌ | Output format |
| `use_gpu` | boolean | ❌ | Use GPU for inference (default: false) |
| `batch_size` | number | ❌ | Batch size 1-64 (default: 8). Larger = faster but uses more memory |

Example:

{
  "pairs": [
    {"source": "Hello", "translation": "こんにちは"},
    {"source": "Goodbye", "translation": "さようなら"}
  ],
  "use_gpu": true,
  "batch_size": 16
}
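
The documented limits for `xcomet_batch_evaluate` (at most 500 pairs, `batch_size` between 1 and 64) can be checked on the client side before a request is sent. The sketch below is one way to do that; `validate_batch_request` is a hypothetical helper, not part of the server, and the server may apply its own validation.

```python
from typing import Dict, List

MAX_PAIRS = 500         # documented upper bound for `pairs`
MIN_BATCH_SIZE = 1      # documented range for `batch_size`
MAX_BATCH_SIZE = 64
DEFAULT_BATCH_SIZE = 8  # documented default


def validate_batch_request(pairs: List[Dict[str, str]], batch_size: int = DEFAULT_BATCH_SIZE) -> None:
    """Raise ValueError if the request violates the documented limits."""
    if len(pairs) > MAX_PAIRS:
        raise ValueError(f"too many pairs: {len(pairs)} > {MAX_PAIRS}")
    if not MIN_BATCH_SIZE <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between {MIN_BATCH_SIZE} and {MAX_BATCH_SIZE}")
    for index, pair in enumerate(pairs):
        # `source` and `translation` are required; `reference` is optional.
        for key in ("source", "translation"):
            if not pair.get(key):
                raise ValueError(f"pair {index} is missing '{key}'")


if __name__ == "__main__":
    validate_batch_request(
        [
            {"source": "Hello", "translation": "こんにちは"},
            {"source": "Goodbye", "translation": "さようなら"},
        ],
        batch_size=16,
    )
    print("request is within the documented limits")
```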

🔗 Integration with Other MCP Servers

xCOMET MCP Server is designed to work alongside other MCP servers for complete translation workflows:

sequenceDiagram
    participant Agent as AI Agent
    participant DeepL as DeepL MCP Server
    participant xCOMET as xCOMET MCP Server
    
    Agent->>DeepL: Translate text
    DeepL-->>Agent: Translation result
    Agent->>xCOMET: Evaluate quality
    xCOMET-->>Agent: Score + Errors
    Agent->>Agent: Decide: Accept or retry?

Recommended Workflow

1. Translate using DeepL MCP Server (official)
2. Evaluate using xCOMET MCP Server
3. Iterate if the score is below your threshold (for example, 0.8)
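
The workflow above amounts to a translate, evaluate, retry loop. The Python sketch below shows that control flow with the two MCP calls abstracted as plain callables. The names `translate`, `evaluate`, and `max_attempts` are hypothetical; in practice the agent itself would make the DeepL and xCOMET tool calls. The 0.8 threshold mirrors the example prompt later in this section.

```python
from typing import Callable, Optional, Tuple


def translate_with_quality_gate(
    text: str,
    translate: Callable[[str], str],
    evaluate: Callable[[str, str], float],
    threshold: float = 0.8,
    max_attempts: int = 3,
) -> Tuple[Optional[str], float, int]:
    """Retry translation until the score reaches `threshold`.

    Returns (best translation, its score, number of attempts made).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    best_translation: Optional[str] = None
    best_score = float("-inf")
    for attempt in range(1, max_attempts + 1):
        candidate = translate(text)        # stands in for the DeepL tool call
        score = evaluate(text, candidate)  # stands in for xcomet_evaluate
        if score > best_score:
            best_translation, best_score = candidate, score
        if score >= threshold:
            return candidate, score, attempt
    # Threshold never reached: return the best candidate seen.
    return best_translation, best_score, max_attempts


if __name__ == "__main__":
    canned = iter(["draft translation", "improved translation"])
    scores = {"draft translation": 0.62, "improved translation": 0.91}
    print(translate_with_quality_gate("source text", lambda t: next(canned), lambda s, t: scores[t]))
```

Retrying only helps if the translation step can produce a different result on each attempt, for example because the agent revises its input or instructions between attempts.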

Example: DeepL + xCOMET Integration

Configure both servers in Claude Desktop:

{
  "mcpServers": {
    "deepl": {
      "command": "npx",
      "args": ["-y", "@anthropic/deepl-mcp-server"],
      "env": {
        "DEEPL_API_KEY": "your-api-key"
      }
    },
    "xcomet": {
      "command": "npx",
      "args": ["-y", "xcomet-mcp-server"]
    }
  }
}

Then ask Claude:

"Translate this text to Japanese using DeepL, then evaluate the translation quality with xCOMET. If the score is below 0.8, suggest improvements."

⚙️ Configuration

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `TRANSPORT` | `stdio` | Transport mode: `stdio` or `http` |
| `PORT` | `3000` | HTTP server port (when TRANSPORT=http) |
| `XCOMET_MODEL` | `Unbabel/XCOMET-XL` | xCOMET model to use |
| `XCOMET_PYTHON_PATH` | (auto-detect) | Python executable path (see below) |
| `XCOMET_PRELOAD` | `false` | Pre-load model at startup (v0.3.1+) |
| `XCOMET_DEBUG` | `false` | Enable verbose debug logging (v0.3.1+) |
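
For readers who want to see how these variables and defaults fit together, the sketch below reads them into a small config object. It is an illustration in Python rather than the server's actual (Node.js) code. The `ServerConfig` and `load_config` names are hypothetical, as is the assumption that only the literal string `true` enables a boolean flag.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    transport: str
    port: int
    model: str
    python_path: Optional[str]  # None means "auto-detect"
    preload: bool
    debug: bool


def _as_bool(value: str) -> bool:
    # Assumption: only "true" (case-insensitive) counts as enabled.
    return value.strip().lower() == "true"


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the documented environment variables, applying the documented defaults."""
    transport = env.get("TRANSPORT", "stdio")
    if transport not in ("stdio", "http"):
        raise ValueError(f"TRANSPORT must be 'stdio' or 'http', got {transport!r}")
    return ServerConfig(
        transport=transport,
        port=int(env.get("PORT", "3000")),
        model=env.get("XCOMET_MODEL", "Unbabel/XCOMET-XL"),
        python_path=env.get("XCOMET_PYTHON_PATH") or None,
        preload=_as_bool(env.get("XCOMET_PRELOAD", "false")),
        debug=_as_bool(env.get("XCOMET_DEBUG", "false")),
    )


if __name__ == "__main__":
    print(load_config())
```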

Model Selection

Choose the model based on your quality/performance needs:

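| Model | Parameters | Download size | Memory (RAM) | Reference translation | Quality | Use case |
|-------|------------|---------------|--------------|-----------------------|---------|----------|
| `Unbabel/XCOMET-XL` | 3.5B | ~14GB | ~8-10GB | Optional | ⭐⭐⭐⭐ | Recommended for most use cases |
| `Unbabel/XCOMET-XXL` | 10.7B | ~42GB | ~20GB | Optional | ⭐⭐⭐⭐⭐ | Highest quality, requires more resources |
| `Unbabel/wmt22-comet-da` | 580M | ~2GB | ~3GB | Required | ⭐⭐⭐ | Lightweight, faster loading |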

Important: wmt22-comet-da requires a reference translation for evaluation. XCOMET models support referenceless evaluation.

Tip: If you experience memory issues or slow model loading, try Unbabel/wmt22-comet-da for faster performance with slightly lower accuracy (but remember to provide reference translations).
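
Because the reference requirement differs by model, a client may want to catch a missing reference before sending a request. The following Python sketch encodes the "Reference" column of the table above. `check_reference` is a hypothetical helper, and treating unlisted models as "not checked" is an assumption.

```python
from typing import Optional

# Taken from the model table: only wmt22-comet-da requires a reference.
REFERENCE_REQUIRED = {
    "Unbabel/XCOMET-XL": False,
    "Unbabel/XCOMET-XXL": False,
    "Unbabel/wmt22-comet-da": True,
}


def check_reference(model: str, reference: Optional[str]) -> None:
    """Raise ValueError when `model` needs a reference and none was given.

    Models not listed in the table are not checked (assumption).
    """
    if REFERENCE_REQUIRED.get(model, False) and not reference:
        raise ValueError(f"{model} requires a reference translation")


if __name__ == "__main__":
    check_reference("Unbabel/XCOMET-XL", None)  # fine: referenceless evaluation
    check_reference("Unbabel/wmt22-comet-da", "こんにちは")  # fine: reference supplied
    print("reference checks passed")
```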

To use a different model, set the XCOMET_MODEL environment variable:

{
  "mcpServers": {
    "xcomet": {
      "command": "npx",
      "args": ["-y", "xcomet-mcp-server"],
      "env": {
        "XCOMET_MODEL": "Unbabel/XCOMET-XXL"
      }
    }
  }
}

Python Path Auto-Detection

The server automatically detects a Python environment with unbabel-comet installed, checking the following in order:

1. XCOMET_PYTHON_PATH environment variable (if set)
2. pyenv versions (~/.pyenv/versions/*/bin/python3), checking each for the comet module
3. Homebrew Python (/opt/homebrew/bin/python3, /usr/local/bin/python3)
4. Fallback: the python3 command

This ensures the server works correctly even when the MCP host (e.g., Claude Desktop) uses a different Python than your terminal.
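
The detection order described above could be approximated as follows. This is a Python sketch of the idea, not the server's actual implementation. In particular, it assumes that an explicit `XCOMET_PYTHON_PATH` is used without further checks and that both pyenv and Homebrew candidates are tested by trying to import `comet`; the function names are hypothetical.

```python
import glob
import os
import subprocess
from typing import Callable, Mapping

HOMEBREW_PYTHONS = ["/opt/homebrew/bin/python3", "/usr/local/bin/python3"]


def has_comet(python: str) -> bool:
    """Return True if the given interpreter can import the comet module."""
    try:
        result = subprocess.run(
            [python, "-c", "import comet"],
            capture_output=True,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def detect_python(
    env: Mapping[str, str] = os.environ,
    check: Callable[[str], bool] = has_comet,
) -> str:
    """Pick a Python executable following the documented priority order."""
    explicit = env.get("XCOMET_PYTHON_PATH")
    if explicit:
        return explicit  # 1. explicit override (assumed to be trusted as-is)

    pyenv = sorted(glob.glob(os.path.expanduser("~/.pyenv/versions/*/bin/python3")))
    for candidate in pyenv + HOMEBREW_PYTHONS:  # 2. pyenv, then 3. Homebrew
        if check(candidate):
            return candidate
    return "python3"  # 4. fallback


if __name__ == "__main__":
    print(detect_python())
```

Passing the `check` function as a parameter keeps the ordering logic testable without spawning real interpreters.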

Example: Explicit Python path configuration

{
  "mcpServers": {
    "xcomet": {
      "command": "npx",
      "args": ["-y", "xcomet-mcp-server"],
      "env": {
        "XCOMET_PYTHON_PATH": "/Users/you/.pyenv/versions/3.11.0/bin/python3"
      }
    }
  }
}

⚡ Performance

Persistent Server Architecture (v0.3.0+)

The server uses a persistent Python FastAPI server that keeps the xCOMET model loaded in memory:

| Request | Time | Notes |
|---------|------|-------|
| First request | ~25-90s | Model loading (varies by model size) |
| Subsequent requests | ~500ms | Model already loaded |

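This provides up to a 177x speedup for consecutive evaluations compared to reloading the model on every request, in line with the figures above (~90s for a cold request versus ~500ms once the model is loaded).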
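
The core of the persistent-server idea is "load once, reuse many times". The Python sketch below shows that pattern with a generic loader so it runs without the model installed. `ModelHolder` is a hypothetical name and the real server's internals may differ.

```python
import threading
from typing import Any, Callable


class ModelHolder:
    """Loads a model on first use and keeps it in memory afterwards."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._model: Any = None
        self._loaded = False
        self._lock = threading.Lock()
        self.load_count = 0  # exposed only to make the behaviour observable

    def get(self) -> Any:
        # The lock prevents two concurrent first requests from loading twice.
        with self._lock:
            if not self._loaded:
                self._model = self._loader()
                self._loaded = True
                self.load_count += 1
            return self._model


if __name__ == "__main__":
    # With unbabel-comet installed, the loader could be something like:
    #   from comet import download_model, load_from_checkpoint
    #   loader = lambda: load_from_checkpoint(download_model("Unbabel/XCOMET-XL"))
    holder = ModelHolder(loader=lambda: "stand-in model object")
    holder.get()  # slow path: the loader runs
    holder.get()  # fast path: the cached object is returned
    print(holder.load_count)
```

Calling `get()` once at startup corresponds to the eager-loading behaviour that `XCOMET_PRELOAD=true` enables.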

Eager Loading (v0.3.1+)

Enable XCOMET_PRELOAD=true to pre-load the model at server startup:

{
  "mcpServers": {
    "xcomet": {
      "command": "npx",
      "args": ["-y", "xcomet-mcp-server"],
      "env": {
        "XCOMET_PRELOAD": "true"
      }
    }
  }
}

With preload enabled, the model-loading cost moves to server startup, so all requests are fast (~500ms), including the first one.

graph LR
    A[MCP Request] --> B[Node.js Server]
    B --> C[Python FastAPI Server]
    C --> D[xCOMET Model<br/>in Memory]
    D --> C
    C --> B
    B --> A

    style D fill:#9f9

Batch Processing Optimization

The xcomet_batch_evaluate tool processes all pairs with a single model load:

| Pairs | Estimated Time |
|-------|----------------|
| 10 | ~30-40 sec |
| 50 | ~1-1.5 min |
| 100 | ~2 min |

GPU vs CPU Performance

| Mode | 100 Pairs (Estimated) |
|------|-----------------------|
| CPU (batch_size=8) | ~2 min |
| GPU (batch_size=16) | ~20-30 sec |

Note: GPU requires CUDA-compatible hardware and PyTorch with CUDA support. If GPU is not available, set use_gpu: false (default).

Best Practices

1. Let the persistent server do its job

With v0.3.0+, the model stays in memory. Multiple xcomet_evaluate calls are now efficient:

✅ Fast: First call loads model, subsequent calls reuse it
   xcomet_evaluate(pair1)  # ~90s (model loads)
   xcomet_evaluate(pair2)  # ~500ms (model cached)
   xcomet_evaluate(pair3)  # ~500ms (model cached)

2. For many pairs, use batch evaluation

✅ Even faster: Batch all pairs in one call
   xcomet_batch_evaluate(allPairs)  # Optimal throughput

3. Memory considerations

XCOMET-XL requires ~8-10GB RAM
For large batches (500 pairs), ensure sufficient memory
If memory is limited, split into smaller batches (100-200 pairs)
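
Splitting a large job into smaller batches, as suggested above, can be done with a small chunking helper on the client side. The sketch below is illustrative: `chunked` is a hypothetical helper, and the default of 100 is simply the lower end of the 100-200 range mentioned above.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], chunk_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `chunk_size` elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


if __name__ == "__main__":
    pairs = [{"source": f"s{i}", "translation": f"t{i}"} for i in range(250)]
    for batch in chunked(pairs, chunk_size=100):
        # Each batch would be sent as one xcomet_batch_evaluate call.
        print(len(batch))
```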

Auto-Restart (v0.3.1+)

The server automatically recovers from failures:

Monitors health every 30 seconds
Restarts after 3 consecutive health check failures
Up to 3 restart attempts before giving up
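
The restart policy above can be modelled as a small state machine. This Python sketch is an interpretation of the three rules, not the server's code: `HealthMonitor` is a hypothetical name, and whether the restart budget resets after a period of good health is not stated here, so the sketch assumes it does not.

```python
from typing import Callable


class HealthMonitor:
    """Tracks health-check results and triggers restarts per the documented policy."""

    def __init__(
        self,
        restart: Callable[[], None],
        failure_threshold: int = 3,  # consecutive failures before a restart
        max_restarts: int = 3,       # restart attempts before giving up
    ) -> None:
        self._restart = restart
        self.failure_threshold = failure_threshold
        self.max_restarts = max_restarts
        self.consecutive_failures = 0
        self.restarts = 0
        self.gave_up = False

    def record(self, healthy: bool) -> None:
        """Feed in one health-check result (the server checks every 30 seconds)."""
        if self.gave_up:
            return
        if healthy:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.consecutive_failures = 0
            if self.restarts >= self.max_restarts:
                self.gave_up = True  # restart budget exhausted
            else:
                self.restarts += 1
                self._restart()


if __name__ == "__main__":
    monitor = HealthMonitor(restart=lambda: print("restarting Python server"))
    for result in [True, False, False, False, True]:
        monitor.record(result)
    print(monitor.restarts)
```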

📊 Quality Score Interpretation

| Score Range | Quality | Recommendation |
|-------------|---------|----------------|
| 0.9 - 1.0 | Excellent | Ready for use |
| 0.7 - 0.9 | Good | Minor review recommended |
| 0.5 - 0.7 | Fair | Post-editing needed |
| 0.0 - 0.5 | Poor | Re-translation recommended |
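
The table above maps directly onto a small lookup function. In this Python sketch, `interpret_score` is a hypothetical helper, and treating each range's lower bound as inclusive (so that 0.9 counts as "Excellent") is an assumption, since the table's ranges share their endpoints.

```python
from typing import Tuple


def interpret_score(score: float) -> Tuple[str, str]:
    """Map an xCOMET score in [0, 1] to (quality label, recommendation)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # Lower bounds are treated as inclusive (assumption).
    if score >= 0.9:
        return ("Excellent", "Ready for use")
    if score >= 0.7:
        return ("Good", "Minor review recommended")
    if score >= 0.5:
        return ("Fair", "Post-editing needed")
    return ("Poor", "Re-translation recommended")


if __name__ == "__main__":
    print(interpret_score(0.847))
```

For the example response shown earlier (score 0.847), this yields "Good", matching the summary string in that response.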

🔍 Troubleshooting

Common Issues

"No module named 'comet'"

Cause: Python environment without unbabel-comet installed.

Solution:

# Check which Python is being used
python3 -c "import sys; print(sys.executable)"

# Install all required packages
pip install "unbabel-comet>=2.2.0" fastapi uvicorn

# Or specify Python path explicitly
export XCOMET_PYTHON_PATH=/path/to/python3

Model download fails or times out

Cause: Large model files (~14GB for XL) require a stable internet connection.

Solution:

# Pre-download the model manually
python -c "from comet import download_model; download_model('Unbabel/XCOMET-XL')"

GPU not detected

Cause: PyTorch not installed with CUDA support.

Solution:

# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# If False, reinstall PyTorch with a CUDA build (cu118 shown here; choose the index that matches your CUDA version)
pip install torch --index-url https://download.pytorch.org/whl/cu118

Slow performance on Mac (MPS)

Cause: Mac MPS (Metal Performance Shaders) has compatibility issues with some operations.

Solution: The server automatically uses num_workers=1 for Mac MPS compatibility. For best performance on Mac, use CPU mode (use_gpu: false).

High memory usage or crashes

Cause: XCOMET-XL requires ~8-10GB RAM.

Solutions:

Use the persistent server (v0.3.0+): Model loads once and stays in memory, avoiding repeated memory spikes
Use a lighter model: Set XCOMET_MODEL=Unbabel/wmt22-comet-da for lower memory usage (~3GB)
Reduce batch size: For large batches, process in smaller chunks (100-200 pairs)
Close other applications: Free up RAM before running large evaluations

# Check available memory
free -h  # Linux
vm_stat | head -5  # macOS

VS Code or IDE crashes during evaluation

Cause: High memory usage from the xCOMET model (~8-10GB for XL).

Solution:

With v0.3.0+, the model loads once and stays in memory (no repeated loading)
If memory is still an issue, use a lighter model: XCOMET_MODEL=Unbabel/wmt22-comet-da
Close other memory-intensive applications before evaluation

Getting Help

If you encounter issues:

1. Check the GitHub Issues
2. Enable debug logging (XCOMET_DEBUG=true, v0.3.1+) and check Claude Desktop's Developer Mode logs
3. Open a new issue with:
   - Your OS and Python version
   - The error message
   - Your configuration (without sensitive data)

🧪 Development

# Install dependencies
npm install

# Build TypeScript
npm run build

# Watch mode
npm run dev

# Test with MCP Inspector
npm run inspect

📋 Changelog

See CHANGELOG.md for version history and updates.

📝 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

Unbabel for the xCOMET model
Anthropic for the MCP protocol
Model Context Protocol community
