MCP Server for CVDLT (Computer Vision & Deep Learning Tools)

Created by MRonaldo-gif, 8 months ago
The repo is built on the Python SDK for the Model Context Protocol (MCP). It wraps deep learning models for computer vision and exposes their capabilities to LLM or vLLM models.

The repo is based on Ultralytics and the Python SDK for the Model Context Protocol.

Related links:

  • Ultralytics - https://github.com/ultralytics/ultralytics
  • MCP Python SDK - https://github.com/modelcontextprotocol/python-sdk

Python server implementing Model Context Protocol (MCP) for image object detection, segmentation, and pose estimation operations.

Features

  • Detect objects in images using YOLOv10
  • Segment objects in images using YOLOv8
  • Segment entire images using Ultralytics SAM
  • Estimate human poses in images using YOLOv8
  • Support for local and network image inputs
  • MCP tool integration for client interactions
  • Stdio and SSE transport protocols

Note: The server requires valid image paths or URLs and access to the following model files: yolov10b.pt (YOLOv10 detection), yolov8n-seg.pt (YOLOv8 segmentation), yolov8n-pose.pt (YOLOv8 pose estimation), and sam_b.pt (Ultralytics SAM).

Quick Start

Install Dependencies

uv sync
# Optional: use the Tsinghua PyPI mirror
uv sync --index https://pypi.tuna.tsinghua.edu.cn/simple --extra-index-url https://pypi.org/simple

uv pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Start Server

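  1. stdio mode

    python server.py
    

    Output (printed in Chinese; in English: "Starting the MCP server (YOLO) with stdio transport"):

    使用 stdio 传输启动 MCP 服务器(YOLO)
    
  2. SSE mode

    python server.py sse [port]
    

    Example:

    python server.py sse 8080
    

    Output (printed in Chinese; in English: "Starting the MCP server (YOLO) on port 8080 with SSE transport"):

    在端口 8080 上启动 MCP 服务器(YOLO),使用 SSE 传输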
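
The two start commands suggest that server.py picks its transport from the command-line arguments: no argument means stdio, and `sse` followed by an optional port means SSE. The sketch below shows one way such parsing could look. The function name `parse_transport` and the default port are assumptions made for illustration and are not taken from the repo.

```python
DEFAULT_SSE_PORT = 8080  # assumed default; the README only uses 8080 as an example


def parse_transport(argv):
    """Map command-line arguments to ("stdio", None) or ("sse", port).

    argv is the argument list without the script name, e.g. sys.argv[1:].
    """
    if not argv:
        # `python server.py` -> stdio transport
        return ("stdio", None)
    if argv[0] != "sse":
        raise ValueError(f"unknown transport: {argv[0]!r}")
    # `python server.py sse [port]` -> SSE transport on the given port
    port = int(argv[1]) if len(argv) > 1 else DEFAULT_SSE_PORT
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return ("sse", port)
```

For example, `parse_transport(["sse", "8080"])` returns `("sse", 8080)`, and an empty list returns `("stdio", None)`.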
    

Users also need to download the model weights into the ./checkpoints directory. Download links:

  • YOLOv10 - https://docs.ultralytics.com/models/yolov10/
  • YOLOv8 - https://docs.ultralytics.com/models/yolov8/
  • Ultralytics SAM - https://docs.ultralytics.com/models/sam-2/

├── checkpoints
│   ├── sam_b.pt
│   ├── yolov10b.pt
│   ├── yolov8n-pose.pt
│   └── yolov8n-seg.pt
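
Before starting the server, it can help to confirm that all four weight files are in place. The following is a minimal sketch that assumes the ./checkpoints layout shown above. The helper name `missing_weights` is hypothetical and is not part of the repo.

```python
from pathlib import Path

# File names taken from the note under Features.
REQUIRED_WEIGHTS = ("yolov10b.pt", "yolov8n-seg.pt", "yolov8n-pose.pt", "sam_b.pt")


def missing_weights(checkpoint_dir="./checkpoints"):
    """Return the required weight files that are not present in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in REQUIRED_WEIGHTS if not (root / name).is_file()]
```

An empty list means every expected file was found. Otherwise the list names the weights that still need to be downloaded.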

API

Resources

  • image://system: Image processing operations interface

Tools

  • detect_objects
    • Detect objects in an image using YOLOv10
    • Input: image_url (string)
    • Supports local paths (file:// or relative) and network URLs (http:// or https://)
    • Returns JSON array of detected objects with bounding boxes, confidence scores, and class labels
    • Example output: [{"box": [x, y, w, h], "confidence": 0.9, "class": "person"}, ...]
  • segment_objects
    • Segment objects in an image using YOLOv8
    • Input: image_url (string)
    • Supports local paths (file:// or relative) and network URLs (http:// or https://)
    • Returns JSON array of segmented objects with bounding boxes, confidence scores, and class labels
    • Example output: [{"box": [x, y, w, h], "confidence": 0.85, "class": "car"}, ...]
  • segment_image
    • Segment entire image using Ultralytics SAM
    • Input: image_url (string)
    • Supports local paths (file:// or relative) and network URLs (http:// or https://)
    • Returns JSON array of segmented regions with bounding boxes, areas, and confidence scores
    • Example output: [{"bbox": [x, y, w, h], "area": 2500, "confidence": 0.95}, ...]
  • estimate_pose
    • Estimate human poses in an image using YOLOv8
    • Input: image_url (string)
    • Supports local paths (file:// or relative) and network URLs (http:// or https://)
    • Returns JSON array of detected poses with keypoint coordinates and confidence scores
    • Example output: [{"keypoints": [[x1, y1], [x2, y2], ...], "confidence": [0.9, 0.8, ...]}, ...]
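
All four tools accept the same image_url input, which may be a file:// URL, a relative path, or an http(s) URL. The sketch below shows one way to tell these cases apart using only the standard library. The function `classify_image_url` is a hypothetical helper for illustration, and the server's actual handling may differ.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def classify_image_url(image_url):
    """Return ("remote", url) for http(s) URLs, or ("local", Path) for files."""
    parsed = urlparse(image_url)
    if parsed.scheme in ("http", "https"):
        return ("remote", image_url)
    if parsed.scheme == "file":
        # file:///tmp/a.jpg -> /tmp/a.jpg (percent-escapes are decoded)
        return ("local", Path(url2pathname(parsed.path)))
    if len(parsed.scheme) <= 1:
        # No scheme (plain or relative path) or a one-letter Windows drive prefix.
        return ("local", Path(image_url))
    raise ValueError(f"unsupported image_url scheme: {parsed.scheme!r}")
```

A caller could then download remote images and open local ones directly, rejecting any other scheme up front.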

Usage with Claude Desktop

Add this to your claude_desktop_config.json:

Note: You can provide sandboxed directories to the server by mounting them to /projects. Adding the ro flag will make the directory readonly by the server.

SSE

{
  "mcpServers": {
    "server-with-yolo": {
      "url": "http://localhost:8080/sse"
    }
  }
}
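
The same SSE endpoint can also be reached from a script through the MCP Python SDK client. The sketch below assumes the server is running in SSE mode on port 8080 and that the `mcp` package is installed. It also assumes the tool returns its JSON array as the first text content item, which the README does not confirm. The helper names `parse_tool_text` and `detect` are illustrative.

```python
import asyncio
import json

SERVER_URL = "http://localhost:8080/sse"  # same URL as in the config above


def parse_tool_text(text):
    """Decode the JSON array that a tool returns as text."""
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    return items


async def detect(image_url, server_url=SERVER_URL):
    """Call the detect_objects tool over SSE and return the parsed detections."""
    # Imported here so parse_tool_text stays usable without the MCP SDK installed.
    from mcp import ClientSession
    from mcp.client.sse import sse_client

    async with sse_client(server_url) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            result = await session.call_tool("detect_objects", {"image_url": image_url})
    # Assumption: the JSON payload is the first text content item of the result.
    return parse_tool_text(result.content[0].text)


# Usage (requires a running server):
#   print(asyncio.run(detect("https://example.com/street.jpg")))
```

The other tools (segment_objects, segment_image, estimate_pose) take the same image_url argument, so only the tool name would change.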