Dingo MCP Server

Created by DataEval, 9 months ago

An MCP server for Dingo, a comprehensive data quality evaluation tool. The server exposes Dingo's rule-based and LLM-based evaluation capabilities and can list the available rules and prompts. Official GitHub repository: https://github.com/DataEval/dingo

Tags: data-science, data-quality

Installation

Prerequisites: Ensure you have Git and a Python environment (e.g., 3.8+) set up.
Clone the Repository: Clone this repository to your local machine.
```
git clone https://github.com/DataEval/dingo.git
cd dingo
```
Install Dependencies: Install the required dependencies, including FastMCP and other Dingo requirements. It's recommended to use the requirements.txt file.
```
pip install -r requirements.txt
# Alternatively, at minimum: pip install fastmcp
```
Ensure Dingo is Importable: Make sure your Python environment can find the dingo package within the cloned repository when you run the server script.
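
One way to confirm that the package is importable before starting the server is a quick standard-library check. This is a minimal sketch; the helper name `is_importable` is illustrative and not part of Dingo.

```python
import importlib.util


def is_importable(package: str) -> bool:
    """Return True if a top-level package can be located on the current sys.path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if is_importable("dingo"):
        print("dingo is importable")
    else:
        # Typical fixes: run from the repository root, or add it to PYTHONPATH.
        print("dingo not found; run from the repository root or set PYTHONPATH")
```

Run it with the same interpreter and working directory you intend to use for mcp_server.py, since the result depends on that environment's search path.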

Running the Server

Navigate to the directory containing mcp_server.py and run it using Python:

```
python mcp_server.py
```

By default, the server starts using the Server-Sent Events (SSE) transport protocol. You can customize its behavior using arguments within the script's mcp.run() call:

```
# Example customization in mcp_server.py
mcp.run(
    transport="sse",      # Communication protocol (sse is default)
    host="127.0.0.1",     # Network interface to bind to (default: 0.0.0.0)
    port=8888,            # Port to listen on (default: 8000)
    log_level="debug"     # Logging verbosity (default: info)
)
```

Important: Note the host and port the server is running on, as you will need these to configure your MCP client.
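
The client-facing address can be derived from the host and port passed to mcp.run(). The sketch below is illustrative only: `sse_url` is a hypothetical helper, and the `/sse` path is taken from the example URL used in this document.

```python
def sse_url(host: str, port: int) -> str:
    """Build the client-facing SSE URL for a server bound to host:port."""
    # 0.0.0.0 means "all interfaces" on the server side; a client on the
    # same machine should connect through the loopback address instead.
    client_host = "127.0.0.1" if host == "0.0.0.0" else host
    return f"http://{client_host}:{port}/sse"


print(sse_url("127.0.0.1", 8888))  # http://127.0.0.1:8888/sse
print(sse_url("0.0.0.0", 8000))    # http://127.0.0.1:8000/sse
```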

Integration with Cursor

Configuration

To connect Cursor to your running Dingo MCP server, you need to edit Cursor's MCP configuration file (mcp.json). This file is typically located in Cursor's user configuration directory (e.g., ~/.cursor/ or %USERPROFILE%\.cursor\).

Add or modify the entry for your Dingo server within the mcpServers object. Use the url property to specify the address of your running server.

Example mcp.json entry (any other server entries sit alongside dingo_evaluator inside mcpServers):

```
{
  "mcpServers": {
    "dingo_evaluator": {
      "url": "http://127.0.0.1:8888/sse"
    }
  }
}
```

Make sure the url matches the host, port, and transport that mcp_server.py is configured to use (currently only sse is supported for the URL scheme). If you did not customize mcp.run(), the default URL is likely http://127.0.0.1:8000/sse (or http://0.0.0.0:8000/sse, although 0.0.0.0 is a bind address and may not work as a client destination on every platform).
Restart Cursor after saving changes to mcp.json.
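
If you prefer to script the change, the entry can be merged into an existing configuration without disturbing other servers. This is a sketch under assumptions: `add_server` is a hypothetical helper, and it expects the configuration to already be loaded as a plain dictionary (strict JSON, no comments).

```python
import json


def add_server(config: dict, name: str, url: str) -> dict:
    """Return a copy of `config` with an mcpServers entry for `name`."""
    updated = dict(config)
    # Copy the existing server map so other entries are preserved.
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"url": url}
    updated["mcpServers"] = servers
    return updated


existing = {"mcpServers": {"other_server": {"url": "http://127.0.0.1:9000/sse"}}}
merged = add_server(existing, "dingo_evaluator", "http://127.0.0.1:8888/sse")
print(json.dumps(merged, indent=2))
```

The printed JSON can then be saved as mcp.json; back up the existing file first, since the location and contents vary between installations.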

Usage in Cursor

Once configured, you can invoke the Dingo tools within Cursor:

- List components: "Use the dingo_evaluator tool to list available Dingo components."
- Run an evaluation: "Use the dingo_evaluator tool to run a rule evaluation..." or "Use the dingo_evaluator tool to run an LLM evaluation..."

Cursor will prompt you for the necessary arguments.

Tool Reference

`list_dingo_components()`

Lists available Dingo rule groups and registered LLM model identifiers.

Arguments: None
Returns: Dict[str, List[str]] - A dictionary containing rule_groups and llm_models.

Example Cursor Usage:

Use the dingo_evaluator tool to list dingo components.
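
The returned dictionary can be handled like any other mapping of names to lists. The sketch below only illustrates the documented shape; `describe_components` is a hypothetical helper and the sample values are placeholders, not output from a real server.

```python
from typing import Dict, List


def describe_components(components: Dict[str, List[str]]) -> str:
    """Format the dictionary returned by list_dingo_components for display."""
    rule_groups = components.get("rule_groups", [])
    llm_models = components.get("llm_models", [])
    return (
        f"rule groups ({len(rule_groups)}): {', '.join(rule_groups) or 'none'}\n"
        f"LLM models ({len(llm_models)}): {', '.join(llm_models) or 'none'}"
    )


# Hypothetical response; real values depend on the installed Dingo version.
sample = {"rule_groups": ["default", "sft", "pretrain"], "llm_models": ["example_model"]}
print(describe_components(sample))
```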

`run_dingo_evaluation(...)`

Runs a Dingo evaluation (rule-based or LLM-based).

Arguments:
- input_path (str): Path to the input file or directory (relative to the project root or absolute).
- evaluation_type (Literal["rule", "llm"]): Type of evaluation.
- eval_group_name (str): Rule group name for rule type (default: "" which uses 'default'). Only 'default', 'sft', 'pretrain' are validated by the server logic. Ignored for llm type.
- output_dir (Optional[str]): Directory to save outputs. Defaults to a dingo_output_* subdirectory within the parent directory of input_path.
- task_name (Optional[str]): Name for the task (used in output path generation). Defaults to mcp_eval_<uuid>.
- save_data (bool): Whether to save detailed JSONL output (default: True).
- save_correct (bool): Whether to save correct data (default: True).
- kwargs (dict): Dictionary for additional dingo.io.InputArgs. Common uses:
  - dataset (str): Dataset type (e.g., 'local', 'hugging_face'). Defaults to 'local' if input_path is given.
  - data_format (str): Input data format (e.g., 'json', 'jsonl', 'plaintext'). Inferred from input_path extension if possible.
  - column_content (str): Required for formats like JSON/JSONL - specifies the key containing the text to evaluate.
  - column_id, column_prompt, column_image: Other column mappings.
  - custom_config (str | dict): Path to a JSON config file, a JSON string, or a dictionary for LLM evaluation or custom rule settings. API keys for LLMs must be provided here.
  - max_workers, batch_size: Dingo execution parameters (default to 1 in MCP for stability).
Returns: str - The absolute path to the primary output file (e.g., summary.json).
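
The argument list above can be assembled programmatically before handing it to an MCP client. This is a minimal sketch, not the server's own validation: `build_eval_args` is a hypothetical helper, and rejecting rule groups other than 'default', 'sft', and 'pretrain' on the client side is a choice made here for illustration.

```python
from typing import Any, Dict, Optional

VALID_TYPES = {"rule", "llm"}
VALIDATED_GROUPS = {"default", "sft", "pretrain"}  # groups the server is said to validate


def build_eval_args(
    input_path: str,
    evaluation_type: str,
    column_content: Optional[str] = None,
    eval_group_name: str = "",
    custom_config: Optional[Any] = None,
) -> Dict[str, Any]:
    """Assemble the arguments dictionary for run_dingo_evaluation."""
    if evaluation_type not in VALID_TYPES:
        raise ValueError(f"evaluation_type must be one of {sorted(VALID_TYPES)}")
    args: Dict[str, Any] = {"input_path": input_path, "evaluation_type": evaluation_type}
    if evaluation_type == "rule":
        group = eval_group_name or "default"  # "" falls back to 'default'
        if group not in VALIDATED_GROUPS:
            raise ValueError(f"unvalidated rule group: {group!r}")
        args["eval_group_name"] = group
    # eval_group_name is ignored for the llm type, so it is left out there.
    kwargs: Dict[str, Any] = {}
    if column_content is not None:
        kwargs["column_content"] = column_content
    if custom_config is not None:
        kwargs["custom_config"] = custom_config
    if kwargs:
        args["kwargs"] = kwargs
    return args


print(build_eval_args("test/data/test_local_jsonl.jsonl", "rule", column_content="content"))
```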

Example Cursor Usage (Rule-based):

Use the Dingo Evaluator tool to run the default rule evaluation on test/data/test_local_jsonl.jsonl. Make sure to use the 'content' column.

(Cursor should propose a tool call like the one below; data_format "jsonl" and dataset "local" are inferred from the input path.)

```
<use_mcp_tool>
<server_name>dingo_evaluator</server_name>
<tool_name>run_dingo_evaluation</tool_name>
<arguments>
{
  "input_path": "test/data/test_local_jsonl.jsonl",
  "evaluation_type": "rule",
  "eval_group_name": "default",
  "kwargs": {
    "column_content": "content"
  }
}
</arguments>
</use_mcp_tool>
```

Example Cursor Usage (LLM-based):

Use the Dingo Evaluator tool to perform an LLM evaluation on test/data/test_local_jsonl.jsonl. Use the 'content' column. Configure it using the file examples/mcp/config_self_deployed_llm.json.

(Cursor should propose a tool call like the one below. eval_group_name is ignored for LLM evaluations, so it can be omitted when custom_config is used; data_format "jsonl" and dataset "local" are inferred from the input path.)

```
<use_mcp_tool>
<server_name>dingo_evaluator</server_name>
<tool_name>run_dingo_evaluation</tool_name>
<arguments>
{
  "input_path": "test/data/test_local_jsonl.jsonl",
  "evaluation_type": "llm",
  "kwargs": {
    "column_content": "content",
    "custom_config": "examples/mcp/config_self_deployed_llm.json"
  }
}
</arguments>
</use_mcp_tool>
```

Refer to examples/mcp/config_api_llm.json (for API-based LLMs) and examples/mcp/config_self_deployed_llm.json (for self-hosted LLMs) for the structure of the custom_config file, including where to place API keys or URLs.
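
Once an evaluation finishes, the path returned by run_dingo_evaluation can be opened like any JSON file. The sketch below assumes only that the output is valid JSON; it makes no claim about the schema of summary.json, and both helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, List


def load_summary(output_path: str) -> Any:
    """Load the JSON file whose path run_dingo_evaluation returned."""
    return json.loads(Path(output_path).read_text(encoding="utf-8"))


def top_level_keys(summary: Any) -> List[str]:
    """List top-level keys without assuming any particular schema."""
    return sorted(summary) if isinstance(summary, dict) else []
```

Printing `top_level_keys(load_summary(path))` is a quick way to see which fields a given Dingo version writes before relying on any of them.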
