
Crawlab MCP Server

Created by crawlab-team, 9 months ago

This is a Model Context Protocol (MCP) server for Crawlab, allowing AI applications to interact with Crawlab's functionality.

Overview

The MCP server provides a standardized way for AI applications to access Crawlab's features, including:

  • Spider management (create, read, update, delete)
  • Task management (run, cancel, restart)
  • File management (read, write)
  • Resource access (spiders, tasks)

Architecture

The MCP Server/Client architecture facilitates communication between AI applications and Crawlab:

graph TB
    User[User] --> Client[MCP Client]
    Client --> LLM[LLM Provider]
    Client <--> Server[MCP Server]
    Server <--> Crawlab[Crawlab API]

    subgraph "MCP System"
        Client
        Server
    end

    subgraph "Crawlab System"
        Crawlab
        DB[(Database)]
        Crawlab <--> DB
    end

    class User,LLM,Crawlab,DB external;
    class Client,Server internal;

    %% Flow annotations
    LLM -.-> |Tool calls| Client
    Client -.-> |Executes tool calls| Server
    Server -.-> |API requests| Crawlab
    Crawlab -.-> |API responses| Server
    Server -.-> |Tool results| Client
    Client -.-> |Human-readable response| User

    classDef external fill:#f9f9f9,stroke:#333,stroke-width:1px;
    classDef internal fill:#d9edf7,stroke:#31708f,stroke-width:1px;

Communication Flow

  1. User Query: The user sends a natural language query to the MCP Client
  2. LLM Processing: The Client forwards the query to an LLM provider (e.g., Claude, OpenAI)
  3. Tool Selection: The LLM identifies necessary tools and generates tool calls
  4. Tool Execution: The Client sends tool calls to the MCP Server
  5. API Interaction: The Server executes the corresponding Crawlab API requests
  6. Response Generation: Results flow back through the Server to the Client to the LLM
  7. User Response: The Client delivers the final human-readable response to the user
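Steps 3 through 5 can be sketched in a few lines of Python. This is an illustration only: the tool names come from the tool list later in this README, but the HTTP method and endpoint mapping is an assumption (the real server derives its tools from Crawlab's API spec).

```python
# Sketch of steps 3-5: turning a tool call emitted by the LLM into a
# Crawlab API request. The endpoint paths below are hypothetical,
# chosen only to illustrate the dispatch step.
TOOL_ROUTES = {
    "get_spider":  ("GET",  "/spiders/{id}"),
    "run_spider":  ("POST", "/spiders/{id}/run"),
    "cancel_task": ("POST", "/tasks/{id}/cancel"),
}

def dispatch(tool_call: dict) -> tuple[str, str]:
    """Translate an LLM tool call into an (HTTP method, API path) pair."""
    method, path = TOOL_ROUTES[tool_call["name"]]
    return method, path.format(**tool_call["arguments"])

# A tool call as the LLM might emit it (step 3 above):
call = {"name": "run_spider", "arguments": {"id": "64f0c2"}}
print(dispatch(call))  # ('POST', '/spiders/64f0c2/run')
```

The key point is that the LLM never talks to Crawlab directly: it only names a tool and its arguments, and the MCP Server owns the mapping to actual API requests.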

Installation and Usage

Option 1: Install as a Python package

You can install the MCP server as a Python package, which provides a convenient CLI:

# Install from source
pip install -e .

# Or install from GitHub (when available)
# pip install git+https://github.com/crawlab-team/crawlab-mcp-server.git

After installation, you can use the CLI:

# Start the MCP server
crawlab_mcp-mcp server [--spec PATH_TO_SPEC] [--host HOST] [--port PORT]

# Start the MCP client
crawlab_mcp-mcp client SERVER_URL

Option 2: Running Locally

Prerequisites

  • Python 3.8+
  • Crawlab instance running and accessible
  • API token from Crawlab

Configuration

  1. Copy the .env.example file to .env:

    cp .env.example .env
    
  2. Edit the .env file with your Crawlab API details:

    CRAWLAB_API_BASE_URL=http://your-crawlab-instance:8080/api
    CRAWLAB_API_TOKEN=your_api_token_here
    
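For reference, this is roughly how the server consumes those two values. The `Authorization` header format shown here is an assumption; check your Crawlab version's API documentation for the exact scheme it expects.

```python
import os

# Read the two values configured in .env. The defaults mirror the
# examples above and are placeholders, not guaranteed server defaults.
BASE_URL = os.getenv("CRAWLAB_API_BASE_URL", "http://localhost:8080/api")
TOKEN = os.getenv("CRAWLAB_API_TOKEN", "")

def api_headers() -> dict:
    # Assumed header layout for authenticated Crawlab API calls.
    return {"Authorization": TOKEN, "Content-Type": "application/json"}
```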

Running Locally

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Run the server:

    python server.py
    
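A quick way to confirm the server came up is a plain TCP probe. Port 8000 is the default used elsewhere in this README; adjust it if you changed the port.

```python
import socket

def server_reachable(host: str = "localhost", port: int = 8000,
                     timeout: float = 1.0) -> bool:
    """Return True if something is accepting connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("MCP server reachable:", server_reachable())
```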

Running with Docker

  1. Build the Docker image:

    docker build -t crawlab-mcp-server .
    
  2. Run the container:

    docker run -p 8000:8000 --env-file .env crawlab-mcp-server
    

Integration with Docker Compose

To add the MCP server to your existing Crawlab Docker Compose setup, add the following service to your docker-compose.yml:

services:
  # ... existing Crawlab services
  
  mcp-server:
    build: ./backend/mcp-server
    ports:
      - "8000:8000"
    environment:
      - CRAWLAB_API_BASE_URL=http://backend:8000/api
      - CRAWLAB_API_TOKEN=your_api_token_here
    depends_on:
      - backend

Using with AI Applications

The MCP server enables AI applications to interact with Crawlab through natural language. Following the architecture diagram above, here's how to use the MCP system:

Setting Up the Connection

  1. Start the MCP Server: Make sure your MCP server is running and accessible
  2. Configure the AI Client: Connect your AI application to the MCP server

Example: Using with Claude Desktop

  1. Open Claude Desktop
  2. Go to Settings > MCP Servers
  3. Add a new server with the URL of your MCP server (e.g., http://localhost:8000)
  4. In a conversation with Claude, you can now use Crawlab functionality by describing what you want to do in natural language

Example Interactions

Based on our architecture, here are example interactions with the system:

Create a Spider:

User: "Create a new spider named 'Product Scraper' for the e-commerce project"
LLM identifies intent and calls the create_spider tool
MCP Server executes the API call to Crawlab
Spider is created and details are returned to the user

Run a Task:

User: "Run the 'Product Scraper' spider on all available nodes"
LLM calls the run_spider tool with appropriate parameters
MCP Server sends the command to Crawlab API
Task is started and confirmation is returned to the user
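The two interactions above correspond to tool calls like the following. The argument names here are illustrative; the actual schemas come from the server's tool definitions, not from this sketch.

```python
import json

# Hypothetical tool calls the LLM would emit for the two examples above.
create_call = {
    "name": "create_spider",
    "arguments": {"name": "Product Scraper", "project": "e-commerce"},
}
run_call = {
    "name": "run_spider",
    "arguments": {"spider_name": "Product Scraper", "mode": "all-nodes"},
}

print(json.dumps(create_call, indent=2))
```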

Available Commands

You can interact with the system using natural language commands like:

  • "List all my spiders"
  • "Create a new spider with these specifications..."
  • "Show me the code for the spider named X"
  • "Update the file main.py in spider X with this code..."
  • "Run spider X and notify me when it's complete"
  • "Show me the results of the last run of spider X"

Available Resources and Tools

These are the underlying tools that power the natural language interactions:

Resources

  • spiders: List all spiders
  • tasks: List all tasks

Tools

Spider Management

  • get_spider: Get details of a specific spider
  • create_spider: Create a new spider
  • update_spider: Update an existing spider
  • delete_spider: Delete a spider

Task Management

  • get_task: Get details of a specific task
  • run_spider: Run a spider
  • cancel_task: Cancel a running task
  • restart_task: Restart a task
  • get_task_logs: Get logs for a task

File Management

  • get_spider_files: List files for a spider
  • get_spider_file: Get content of a specific file
  • save_spider_file: Save content to a file
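The tool list above can be summarized as a small registry, which is handy for sanity-checking what a connected client should see. The grouping mirrors this README's headings, not any API contract.

```python
# The tools listed above, grouped by category as in this README.
TOOLS = {
    "spider": ["get_spider", "create_spider", "update_spider",
               "delete_spider"],
    "task": ["get_task", "run_spider", "cancel_task", "restart_task",
             "get_task_logs"],
    "file": ["get_spider_files", "get_spider_file", "save_spider_file"],
}

def all_tools() -> list[str]:
    """Flatten the registry into a single list of tool names."""
    return [name for group in TOOLS.values() for name in group]

print(len(all_tools()))  # 12
```

A client's tool discovery (the MCP `tools/list` exchange) should surface the same twelve names when connected to this server.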