
Fetch MCP Server with CSS selectors function

Created by burnworks, 9 months ago

A Model Context Protocol server that provides web content fetching capabilities. This server enables LLMs to retrieve and process content from web pages, converting HTML to markdown for easier consumption.

The fetch tool truncates long responses to max_length characters. By passing the start_index argument, you can choose where content extraction begins, so a model can read a webpage in chunks until it finds the information it needs.
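The chunked reading pattern can be sketched in a few lines of Python. This is only an illustration of how a client might advance start_index between calls. It slices a local string instead of calling the server, and the helper names read_in_chunks and find_chunk are hypothetical.

```python
def read_in_chunks(content: str, max_length: int = 5000):
    """Yield (start_index, chunk) pairs, mimicking repeated fetch calls."""
    start_index = 0
    while start_index < len(content):
        # Each "call" returns at most max_length characters.
        chunk = content[start_index:start_index + max_length]
        yield start_index, chunk
        # The next call resumes where this chunk ended.
        start_index += len(chunk)


def find_chunk(content: str, needle: str, max_length: int = 5000):
    """Return the start_index of the first chunk containing needle, or None."""
    for start_index, chunk in read_in_chunks(content, max_length):
        if needle in chunk:
            return start_index
    return None


if __name__ == "__main__":
    page = "a" * 12 + "target" + "b" * 10
    print(find_chunk(page, "target", max_length=10))
```

Note that text straddling a chunk boundary is not detected by this simple check; a real client might overlap consecutive chunks slightly.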

Available Tools

  • custom-fetch - Fetches a URL from the internet and extracts its contents as markdown.
    • url (string, required): URL to fetch
    • max_length (integer, optional): Maximum number of characters to return (default: 5000)
    • start_index (integer, optional): Start content from this character index (default: 0)
    • raw (boolean, optional): Get raw content without markdown conversion (default: false)
    • selector (string, optional): CSS selector, ID, or element name to extract specific content
    • selector_type (string, optional): Type of selector: 'css', 'id', or 'element'
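For orientation, here is a rough sketch of the JSON-RPC message an MCP client would send to invoke this tool. MCP clients normally build this message for you, so the helper build_tool_call is purely hypothetical; the argument names come from the list above and the values are examples.

```python
import json


def build_tool_call(request_id: int, url: str, **options) -> dict:
    """Build an MCP tools/call request for the custom-fetch tool."""
    # Only url is required; optional arguments are passed through as given.
    arguments = {"url": url, **options}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "custom-fetch", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        1,
        "https://example.com",
        max_length=2000,
        start_index=0,
        selector="main",
        selector_type="element",
    )
    print(json.dumps(request, indent=2))
```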

Prompts

  • custom-fetch
    • Fetch a URL and extract its contents as markdown
    • Arguments:
      • url (string, required): URL to fetch
      • selector (string, optional): CSS selector, ID, or element name to extract specific content
      • selector_type (string, optional): Type of selector: 'css', 'id', or 'element'

Selector Feature

This enhanced version includes a powerful selector feature that allows you to extract specific content from web pages:

Types of Selectors

  • ID Selector: Extract a specific element by its ID attribute

    {
      "url": "https://example.com",
      "selector": "main-content",
      "selector_type": "id"
    }
    
  • Element Selector: Extract the first element of a specific type

    {
      "url": "https://example.com",
      "selector": "main",
      "selector_type": "element"
    }
    
  • CSS Selector: Extract content using CSS selector syntax

    {
      "url": "https://example.com",
      "selector": ".article-content > p",
      "selector_type": "css"
    }
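One way to think about the three selector types is that each reduces to a CSS selector. The sketch below shows that mapping under this assumption. It is not taken from the server's source, and the function name to_css_selector is hypothetical.

```python
def to_css_selector(selector: str, selector_type: str = "css") -> str:
    """Translate a (selector, selector_type) pair into a plain CSS selector."""
    if selector_type == "id":
        # An ID such as "main-content" corresponds to "#main-content".
        return "#" + selector.lstrip("#")
    if selector_type == "element":
        # An element name such as "main" is already a valid type selector.
        return selector.strip().lower()
    if selector_type == "css":
        # CSS selectors pass through unchanged.
        return selector
    raise ValueError(f"unknown selector_type: {selector_type!r}")


if __name__ == "__main__":
    print(to_css_selector("main-content", "id"))
    print(to_css_selector("main", "element"))
    print(to_css_selector(".article-content > p", "css"))
```

IDs containing characters with special meaning in CSS would need escaping, which this sketch leaves out.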
    

Use Cases

  • Extract just the main article content from news sites
  • Focus on specific sections of documentation pages
  • Target precisely the content you need from large web pages

Installation

Optionally, install Node.js. When it is available, the fetch server uses a different, more robust HTML simplifier.

When using uv, no specific installation is needed. We will use uvx to run burnworks-mcp-server-fetch directly.

Using PIP

Alternatively, you can install burnworks-mcp-server-fetch via pip:

pip install burnworks-mcp-server-fetch

After installation, you can run it as a script using:

python -m burnworks_mcp_server_fetch

Configuration

Configure for Claude.app

Add to your Claude settings:

Using uvx

"mcpServers": {
  "custom-fetch": {
    "command": "uvx",
    "args": ["burnworks-mcp-server-fetch"]
  }
}

Using pip installation

"mcpServers": {
  "custom-fetch": {
    "command": "python",
    "args": ["-m", "burnworks_mcp_server_fetch"]
  }
}

Customization - robots.txt

By default, the server obeys a website's robots.txt file if the request came from the model (via a tool), but not if the request was user-initiated (via a prompt). To disable this check, add the argument --ignore-robots-txt to the args list in the configuration.
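To make concrete what obeying robots.txt means, the following sketch checks a URL against a robots.txt body using Python's standard urllib.robotparser. It is an illustration only and not necessarily how the server performs the check. The robots.txt content and the is_allowed helper are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt body (hypothetical site policy).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# User-agent the server sends for model-initiated requests, per this README.
AUTONOMOUS_UA = (
    "ModelContextProtocol/1.0 "
    "(Autonomous; +https://github.com/modelcontextprotocol/servers)"
)


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/public/page", "https://example.com/private/page"):
        print(url, is_allowed(ROBOTS_TXT, AUTONOMOUS_UA, url))
```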

Customization - User-agent

By default, the server uses one of two user-agent strings, depending on whether the request came from the model (via a tool) or was user-initiated (via a prompt):

ModelContextProtocol/1.0 (Autonomous; +https://github.com/modelcontextprotocol/servers)

or

ModelContextProtocol/1.0 (User-Specified; +https://github.com/modelcontextprotocol/servers)

This can be customized by adding the argument --user-agent=YourUserAgent to the args list in the configuration.

Customization - Proxy

The server can be configured to use a proxy by using the --proxy-url argument.
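The customization flags above can be combined in a single args list. The sketch below generates the uvx configuration with optional flags. The function name build_server_config and the example values are hypothetical, and passing the proxy as --proxy-url=VALUE is an assumption that mirrors the --user-agent form shown earlier.

```python
import json


def build_server_config(ignore_robots_txt=False, user_agent=None, proxy_url=None):
    """Return an mcpServers configuration dict for the uvx setup."""
    args = ["burnworks-mcp-server-fetch"]
    if ignore_robots_txt:
        args.append("--ignore-robots-txt")
    if user_agent:
        args.append(f"--user-agent={user_agent}")
    if proxy_url:
        # Assumed form; check the server's --help output for the exact syntax.
        args.append(f"--proxy-url={proxy_url}")
    return {"mcpServers": {"custom-fetch": {"command": "uvx", "args": args}}}


if __name__ == "__main__":
    config = build_server_config(
        ignore_robots_txt=True,
        user_agent="MyAgent/1.0",             # example value
        proxy_url="http://proxy.example:8080",  # example value
    )
    print(json.dumps(config, indent=2))
```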

Debugging

You can use the MCP inspector to debug the server. For uvx installations:

npx @modelcontextprotocol/inspector uvx burnworks-mcp-server-fetch

Or if you've installed the package in a specific directory or are developing on it:

cd path/to/servers/src/fetch
npx @modelcontextprotocol/inspector uv run burnworks-mcp-server-fetch

Example Selector Usage

Extract Just the Main Content Area

custom-fetch
  url: https://example.com/article
  selector: main
  selector_type: element

Extract Content by ID

custom-fetch
  url: https://example.com/blog
  selector: article-body
  selector_type: id

Extract with Complex CSS Selector

custom-fetch
  url: https://example.com/documentation
  selector: .content-wrapper article > section:first-child
  selector_type: css

Contributing

This project, burnworks_mcp_server_fetch, was developed as a fork of the original mcp-server-fetch with added CSS selector functionality. The original project can be found at:

https://github.com/modelcontextprotocol/servers

If you'd like to contribute to this enhanced version, feel free to submit issues or pull requests to our repository. For information about the base MCP servers architecture and implementation patterns, please refer to the original project link above.

License

This project is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.
