# UniCrawler MCP Server
Stop writing selectors. Start describing data.
UniCrawler ships an MCP (Model Context Protocol) server that exposes its crawling, parsing, and storage capabilities to MCP-capable clients.
The server is designed to run over stdio transport (recommended for desktop clients).
## Why UniCrawler (as an MCP server)
Traditional crawlers break when a div moves. UniCrawler is built around natural-language intent (“what you want”), so your workflows are more resilient to minor UI changes.
- Natural Language Driven: describe the fields you want; no CSS or XPath selectors required.
- AI-Powered Parsing: extract structured data from messy HTML/DOM.
- Browser Automation: renders dynamic pages so JavaScript-driven content can be crawled.
## UniCrawler vs. traditional frameworks
| Feature | Traditional (Scrapy, Selenium...) | UniCrawler MCP |
|---|---|---|
| Configuration | Complex selectors | Natural language descriptions |
| Maintenance | Breaks on layout changes | More resilient intent-driven approach |
| Parsing | Regex / brittle rules | Semantic extraction + normalization |
| Interface | Python scripts | Tools callable from MCP clients |
## Core capabilities
UniCrawler MCP is backed by the same core modules as the library:
- Intelligent crawling
  - Browser automation for dynamic rendering
  - Strategy-driven crawling (pagination / limits)
- Smart parsing
  - Filter structured data, or extract from HTML
  - Dedupe support for stable outputs
- One-click storage (PostgreSQL)
  - Write structured results to PostgreSQL
  - Optional dedupe during writes
## Install (from PyPI)

Create a clean environment, then install UniCrawler with the MCP extra:
Windows (venv):

```powershell
python -m venv .venv
.\.venv\Scripts\pip install -U pip
.\.venv\Scripts\pip install "UniCrawler[mcp]"
```
macOS/Linux (venv):

```bash
python -m venv .venv
. .venv/bin/activate
pip install -U pip
pip install "UniCrawler[mcp]"
```
## Run

After installation, you can start the server with either:

```bash
unicrawler-mcp
```

or:

```bash
python -m unicrawler.mcp
```
The server uses stdio transport by default (recommended for desktop clients).
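For a quick smoke test outside of a desktop client, you can drive the server with the MCP Python SDK. The sketch below is illustrative rather than part of UniCrawler: it assumes the `mcp` package's stdio client API (`stdio_client`, `ClientSession`) and uses a placeholder URL and field description.

```python
# Minimal stdio smoke test using the MCP Python SDK (illustrative sketch).
# Assumes `unicrawler-mcp` is on PATH in the active environment.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    server = StdioServerParameters(command="unicrawler-mcp", args=[])

    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # List the tools the server exposes (crawl_url, parse_data, write_to_database).
            tools = await session.list_tools()
            print([t.name for t in tools.tools])

            # Call crawl_url with a placeholder URL and a natural-language field description.
            result = await session.call_tool(
                "crawl_url",
                arguments={"url": "https://example.com/products", "what": "title, price"},
            )
            print(result.content)


asyncio.run(main())
```

In normal use, a desktop MCP client (such as the Claude Desktop configuration below) launches and manages this process for you.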
## (Optional) Start Chrome for CDP
If your workflow uses CDP-based crawling, make sure Chrome/Chromium is running with a remote debugging port.
Windows PowerShell:

```powershell
unicrawler-start-chrome --port 9222 --profile .\chrome_cdp_profile --headless
```

macOS/Linux:

```bash
unicrawler-start-chrome --port 9222 --profile ./chrome_cdp_profile --headless
```
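To confirm the debugging port is actually reachable before crawling, you can query Chrome's DevTools HTTP endpoint. This is a generic CDP check, not a UniCrawler API; it assumes the default port 9222 used above.

```python
# Generic check that Chrome's remote debugging endpoint is up (assumes port 9222).
import json
import urllib.request

with urllib.request.urlopen("http://127.0.0.1:9222/json/version", timeout=5) as resp:
    info = json.load(resp)

print("Browser:", info.get("Browser"))
print("DevTools WebSocket:", info.get("webSocketDebuggerUrl"))
```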
## Tools
The MCP server exposes the following tools:
### crawl_url
Crawl a URL and return structured data.
Parameters:
- `url` (str)
- `what` (str) – describe which fields you want (e.g. `"title, price, image"`)
- `page_limit` (int, default 1)
- `item_limit` (int, default 200)
- `llm_config_json` (str, optional) – JSON string like `{"api_base": "...", "api_key": "..."}`
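For illustration, the arguments for a crawl_url call from an MCP client might look like the following (the URL, field list, and LLM endpoint are placeholders):

```python
# Illustrative crawl_url arguments; URL, fields, and LLM endpoint are placeholders.
crawl_args = {
    "url": "https://example.com/products",  # page to crawl
    "what": "title, price, image",          # natural-language description of the fields
    "page_limit": 2,                         # follow pagination for up to 2 pages
    "item_limit": 100,                       # stop after 100 items
    # Optional: point parsing at your own LLM endpoint (JSON string, as documented above).
    "llm_config_json": '{"api_base": "https://your-llm-endpoint/v1", "api_key": "YOUR_KEY"}',
}
# An MCP client would pass this dict as the tool call's arguments, e.g.
# await session.call_tool("crawl_url", arguments=crawl_args)
```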
### parse_data
Parse/filter either structured data or HTML into structured fields.
Parameters:
- `data` (any) – list of dicts or HTML
- `what` (str)
- `mode` (`"auto"` | `"structured"` | `"html"`)
- `dedupe_on` (str, optional)
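As a sketch (the records below are invented), parse_data can normalize structured rows and drop duplicates in one call:

```python
# Illustrative parse_data arguments; the input records are made up.
parse_args = {
    "data": [
        {"name": "Widget A", "cost": "$9.99"},
        {"name": "Widget A", "cost": "$9.99"},  # duplicate row
        {"name": "Widget B", "cost": "$14.50"},
    ],
    "what": "product name, price as a number",  # fields to extract/normalize
    "mode": "structured",                        # input is already a list of dicts
    "dedupe_on": "name",                         # drop rows that repeat the same name
}
```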
### write_to_database
Write results to PostgreSQL.
Parameters:
- `data` (list)
- `host` (str)
- `db` (str)
- `password` (str)
- `table` (str)
- `port` (int, default 5432)
- `user` (str, default `"postgres"`)
- `schema` (str, default `"public"`)
- `drop_dup` (bool, default False)
- `drop_dup_on` (str, optional)
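For example (the connection details below are placeholders), results from the previous tools can be written to a table with deduplication enabled:

```python
# Illustrative write_to_database arguments; connection values are placeholders.
write_args = {
    "data": [{"title": "Widget A", "price": 9.99}],  # e.g. output of crawl_url / parse_data
    "host": "localhost",
    "port": 5432,
    "db": "crawls",
    "user": "postgres",
    "password": "YOUR_PASSWORD",
    "schema": "public",
    "table": "products",
    "drop_dup": True,        # deduplicate during the write
    "drop_dup_on": "title",  # ...using the title column
}
```

Chaining crawl_url, parse_data, and write_to_database in this order gives the crawl-to-PostgreSQL flow described under Core capabilities.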
## Claude Desktop configuration (stdio)
You can configure Claude Desktop to launch the server via stdio.
### Option A (recommended): point to the venv Python
Use the Python executable inside your venv so the correct packages are used.
Windows example:
```json
{
  "mcpServers": {
    "unicrawler": {
      "command": "C:/path/to/your/project/.venv/Scripts/python.exe",
      "args": ["-m", "unicrawler.mcp"],
      "env": {}
    }
  }
}
```
macOS/Linux example:
```json
{
  "mcpServers": {
    "unicrawler": {
      "command": "/path/to/your/project/.venv/bin/python",
      "args": ["-m", "unicrawler.mcp"],
      "env": {}
    }
  }
}
```
### Option B: call the console script

If your environment's `Scripts/` (Windows) or `bin/` (macOS/Linux) directory is on PATH, you can use:
```json
{
  "mcpServers": {
    "unicrawler": {
      "command": "unicrawler-mcp",
      "args": [],
      "env": {}
    }
  }
}
```
## Notes

- If you run `unicrawler-mcp` and see a message about missing `mcp`, install with `"UniCrawler[mcp]"`.
- The server is intended to run as a long-lived process controlled by the MCP client.
- If `unicrawler-start-chrome` is not found, ensure your environment's `Scripts/` (Windows) or `bin/` (macOS/Linux) directory is on PATH, or call it through the venv Python (`python -m unicrawler.browser.cdp_launcher`).
## Support & Contact
For issues, feature requests, or questions about UniCrawler MCP:
- Email: inficonn@proton.me
- Project page: https://pypi.org/project/UniCrawler/