
Unicrawler


UniCrawler MCP Server

Stop writing selectors. Start describing data.

UniCrawler ships an MCP (Model Context Protocol) server that exposes UniCrawler’s crawling/parsing/storage capabilities to MCP-capable clients.

The server is designed to run over the stdio transport, which is the recommended setup for desktop clients.

Why UniCrawler (as an MCP server)

Traditional crawlers break when a div moves. UniCrawler is built around natural-language intent (“what you want”), so your workflows are more resilient to minor UI changes.

  • Natural Language Driven: no CSS/XPath required—describe the fields you want.
  • AI-Powered Parsing: extract structured data from messy HTML/DOM.
  • Browser Automation: works with dynamic pages through browser automation.

UniCrawler vs. traditional frameworks

Feature       | Traditional (Scrapy, Selenium...) | UniCrawler MCP
Configuration | Complex selectors                 | Natural language descriptions
Maintenance   | Breaks on layout changes          | More resilient intent-driven approach
Parsing       | Regex / brittle rules             | Semantic extraction + normalization
Interface     | Python scripts                    | Tools callable from MCP clients

Core capabilities

UniCrawler MCP is backed by the same core modules as the library:

  1. Intelligent crawling
     • Browser automation for dynamic rendering
     • Strategy-driven crawling (pagination / limits)
  2. Smart parsing
     • Filter structured data, or extract from HTML
     • Dedupe support for stable outputs
  3. One-click storage (PostgreSQL)
     • Write structured results to PostgreSQL
     • Optional dedupe during writes

Install (from PyPI)

Create a clean environment, then install UniCrawler with the MCP extra:

Windows (venv)

python -m venv .venv
.\.venv\Scripts\python -m pip install -U pip
.\.venv\Scripts\pip install "UniCrawler[mcp]"

macOS/Linux (venv)

python -m venv .venv
. .venv/bin/activate
pip install -U pip
pip install "UniCrawler[mcp]"
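
To confirm that the install landed in the environment you expect, a quick check like the one below can help. It is only a sketch: the module name unicrawler is inferred from the `python -m unicrawler.mcp` command in this README, and mcp is assumed to be the import name provided by the [mcp] extra.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool} for whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Both names are assumptions: "unicrawler" from `python -m unicrawler.mcp`,
    # "mcp" from the optional extra.
    for name, found in check_modules(["unicrawler", "mcp"]).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Run it with the venv's Python; a "missing" line usually means the package was installed into a different interpreter.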

Run

After installation, you can start the server with either:

unicrawler-mcp

or:

python -m unicrawler.mcp

The server uses stdio transport by default (recommended for desktop clients).

(Optional) Start Chrome for CDP

If your workflow uses CDP-based crawling, make sure Chrome/Chromium is running with a remote debugging port.

Windows PowerShell:

unicrawler-start-chrome --port 9222 --profile .\chrome_cdp_profile --headless

macOS/Linux:

unicrawler-start-chrome --port 9222 --profile ./chrome_cdp_profile --headless
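
Before pointing a crawl at the browser, it can be useful to check that something is actually listening on the debugging port. The sketch below uses only the standard library and queries the /json/version endpoint that Chrome serves when remote debugging is enabled. The helper name cdp_version is made up for this example and is not part of UniCrawler.

```python
import http.client
import json
import urllib.request


def cdp_version(host="127.0.0.1", port=9222, timeout=2.0):
    """Return Chrome's /json/version payload as a dict, or None if nothing answers."""
    url = f"http://{host}:{port}/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError: refused connection or timeout; ValueError: reply was not JSON;
        # HTTPException: something other than an HTTP server is on the port.
        return None


if __name__ == "__main__":
    info = cdp_version()
    if info is None:
        print("No DevTools endpoint reachable on port 9222")
    else:
        print("Connected to", info.get("Browser"))
```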

Tools

The MCP server exposes the following tools:

crawl_url

Crawl a URL and return structured data.

Parameters:

  • url (str)
  • what (str) – describe which fields you want (e.g. "title, price, image")
  • page_limit (int, default 1)
  • item_limit (int, default 200)
  • llm_config_json (str, optional) – JSON string like { "api_base": "...", "api_key": "..." }
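
MCP clients normally build tool calls for you, but seeing the request shape can make the parameters easier to reason about. The sketch below assembles a JSON-RPC `tools/call` request for crawl_url using only the standard library. The URL, API base, and key are placeholders, and the helper tool_call_request is hypothetical.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# llm_config_json is documented as a JSON *string*, so serialize the dict first.
llm_config = {"api_base": "https://llm.example.invalid/v1", "api_key": "YOUR_KEY"}

crawl_args = {
    "url": "https://example.com/products",   # placeholder target
    "what": "title, price, image",
    "page_limit": 2,
    "item_limit": 50,
    "llm_config_json": json.dumps(llm_config),
}

request = tool_call_request(1, "crawl_url", crawl_args)
print(json.dumps(request, indent=2))
```

Note that llm_config_json ends up as a string nested inside the arguments object, not as a nested object.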

parse_data

Parse/filter either structured data or HTML into structured fields.

Parameters:

  • data (any) – list of dicts or HTML
  • what (str)
  • mode ("auto" | "structured" | "html")
  • dedupe_on (str, optional)
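
To make dedupe_on concrete, here is a minimal sketch of keyed deduplication over a list of dicts, next to a parse_data argument set that asks for the same thing. It illustrates the general idea only. How UniCrawler dedupes internally is not documented here, and dedupe_by_key is a hypothetical helper.

```python
def dedupe_by_key(rows, key):
    """Keep the first row for each distinct value of `key`, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        value = row.get(key)  # rows missing the key all share the value None
        if value in seen:
            continue
        seen.add(value)
        unique.append(row)
    return unique


rows = [
    {"title": "Kettle", "price": "19.99"},
    {"title": "Toaster", "price": "24.50"},
    {"title": "Kettle", "price": "19.99"},
]

# Arguments for parse_data that request the equivalent server-side behaviour.
parse_args = {
    "data": rows,
    "what": "title, price",
    "mode": "structured",
    "dedupe_on": "title",
}

print(dedupe_by_key(rows, "title"))
```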

write_to_database

Write results to PostgreSQL.

Parameters:

  • data (list)
  • host (str)
  • db (str)
  • password (str)
  • table (str)
  • port (int, default 5432)
  • user (str, default "postgres")
  • schema (str, default "public")
  • drop_dup (bool, default False)
  • drop_dup_on (str, optional)
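
Since write_to_database takes a password as a plain argument, it is usually safer to read it from the environment than to hard-code it. The sketch below assembles the arguments with the defaults listed above. The helper build_write_args, the database and table names, and the use of the PGPASSWORD variable are illustrative choices, not something UniCrawler prescribes.

```python
import os


def build_write_args(data, host, db, table, password,
                     port=5432, user="postgres", schema="public",
                     drop_dup=False, drop_dup_on=None):
    """Assemble write_to_database arguments, mirroring the documented defaults."""
    args = {
        "data": data,
        "host": host,
        "db": db,
        "password": password,
        "table": table,
        "port": port,
        "user": user,
        "schema": schema,
        "drop_dup": drop_dup,
    }
    if drop_dup_on is not None:
        # Optional parameter: only sent when the caller provides a value.
        args["drop_dup_on"] = drop_dup_on
    return args


write_args = build_write_args(
    data=[{"title": "Kettle", "price": "19.99"}],
    host="localhost",
    db="shop",            # placeholder database name
    table="products",     # placeholder table name
    password=os.environ.get("PGPASSWORD", ""),
    drop_dup=True,
    drop_dup_on="title",
)
```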

Claude Desktop configuration (stdio)

You can configure Claude Desktop to launch the server via stdio.

Option A: call the venv Python

Use the Python executable inside your venv so the correct packages are used.

Windows example:

{
  "mcpServers": {
    "unicrawler": {
      "command": "C:/path/to/your/project/.venv/Scripts/python.exe",
      "args": ["-m", "unicrawler.mcp"],
      "env": {}
    }
  }
}

macOS/Linux example:

{
  "mcpServers": {
    "unicrawler": {
      "command": "/path/to/your/project/.venv/bin/python",
      "args": ["-m", "unicrawler.mcp"],
      "env": {}
    }
  }
}
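
If you would rather generate this entry than edit paths by hand, something like the following sketch works. It builds the same structure as the examples above for a given venv directory. The helper names venv_python and claude_config are made up for illustration, and writing the result into Claude Desktop's config file is left to you.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def venv_python(venv_dir, windows):
    """Return the interpreter path inside a venv, using forward slashes."""
    if windows:
        # as_posix() avoids backslashes, which would need escaping in JSON.
        return (PureWindowsPath(venv_dir) / "Scripts" / "python.exe").as_posix()
    return str(PurePosixPath(venv_dir) / "bin" / "python")


def claude_config(venv_dir, windows):
    """Build the mcpServers entry that launches `python -m unicrawler.mcp`."""
    return {
        "mcpServers": {
            "unicrawler": {
                "command": venv_python(venv_dir, windows),
                "args": ["-m", "unicrawler.mcp"],
                "env": {},
            }
        }
    }


print(json.dumps(claude_config("/path/to/your/project/.venv", windows=False), indent=2))
```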

Option B: call the console script

If your environment's Scripts/ (Windows) or bin/ (macOS/Linux) directory is on PATH, you can use:

{
  "mcpServers": {
    "unicrawler": {
      "command": "unicrawler-mcp",
      "args": [],
      "env": {}
    }
  }
}

Notes

  • If you run unicrawler-mcp and see a message about the mcp package being missing, reinstall with the extra: pip install "UniCrawler[mcp]".
  • The server is intended to run as a long-lived process controlled by the MCP client.
  • If unicrawler-start-chrome is not found, make sure your environment's Scripts/ (Windows) or bin/ (macOS/Linux) directory is on PATH, or call the launcher through the venv Python (python -m unicrawler.browser.cdp_launcher).

Support & Contact

For issues, feature requests, or questions about UniCrawler MCP:
