
Lionscraper Mcp

Created By
dowant, a month ago
Overview

LionScraper MCP + CLI + HTTP API bridge


What is this?

LionScraper is a browser extension that can collect lists, articles, links, images, and more from web pages. This repository provides the companion bridge between your tools and that extension in three ways:

  • MCP (lionscraper-mcp): connect an AI app (e.g. Cursor) so the model can call scraping tools over stdio.
  • CLI (lionscraper): run daemon, scrape, ping, and more from a terminal on the same local HTTP/WebSocket port as the extension.
  • HTTP API: when the daemon is running, call the same capabilities over loopback JSON HTTP (e.g. /v1/...) from scripts or any HTTP client—no MCP or CLI front-end required.

The real scraping logic runs in the extension; these packages connect and forward.

Before you start

  1. Browser: Chrome or Edge (follow what the extension supports).
  2. LionScraper extension: install and enable from the store.
  3. Runtime (pick one or both implementations):
    • Node.js 18+ for the npm package
    • Python 3.10+ for the PyPI package
  4. For MCP: an AI app that supports MCP (e.g. Cursor, Trae).
  5. For the HTTP API: same browser, extension, and daemon as the CLI; see the package READMEs for paths and examples.

HTTP fallback without Chrome/Edge: If neither browser is detected under standard install paths and the extension is not connected, MCP still starts. In that case ping succeeds in http_fetch mode and the scrape* tools fall back to a minimal server-side HTTP GET with no JavaScript execution. If a browser is installed but the extension is not connected, you still get the extension connection flow. The Node auto-spawn path also handles Unix installs where lionscraper.js was resolved without a leading / (e.g. Glama/Docker). The Python package uses aiohttp for outbound HTTP/WebSocket to the daemon.
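To make the "no JS execution" limitation concrete, here is a minimal sketch of what a plain HTTP GET can and cannot see. It is not the package's fallback code; `LinkCollector`, `extract_links`, and `SAMPLE` are hypothetical names, and the HTML is an inline string so nothing is fetched from the network.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    """Return links found in static markup; scripts are never executed."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Stand-in for the body of an HTTP GET response (hypothetical page).
SAMPLE = """
<html><body>
  <a href="/static-link">In the markup</a>
  <script>
    document.body.innerHTML += '<a href="/js-link">Added by script</a>';
  </script>
</body></html>
"""

print(extract_links(SAMPLE))  # only the link that exists in the raw markup
```

The link that the script would add at runtime never shows up, which is why pages that render their content with JavaScript generally need the browser extension path.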

Two implementations

| | Node.js (npm) | Python (pip) |
| --- | --- | --- |
| Registry | io.github.dowant/lionscraper-node | io.github.dowant/lionscraper-python |
| Docs (EN) | packages/node/README.md | packages/python/README.md |
| Docs (ZH) | packages/node/README_cn.md | packages/python/README_cn.md |

Install one or both; they are separate packages with the same CLI command names.

Install (npm)

Published as lionscraper on npm.

npm install -g lionscraper

Without a global install, MCP can use npx; see the npx JSON examples under Add MCP in your AI app.

Install (pip)

Published as lionscraper on PyPI.

pip install -U lionscraper

A virtual environment is recommended, or pip install -U --user lionscraper if you prefer not to install into the system interpreter.

Commands (both packages)

| Command | Role |
| --- | --- |
| lionscraper-mcp | Thin MCP server (stdio) for AI apps |
| lionscraper | CLI: daemon, stop, scrape, ping, … (also serves the HTTP API on the same port) |

After pip install -U lionscraper, if lionscraper-mcp is not on your PATH, use python -m lionscraper with no extra arguments for MCP stdio (see packages/python/README.md).

PORT (default 13808) must match the extension bridge port in all modes.
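When the port is in doubt, a quick check from outside the tools can help. The sketch below only tests whether something accepts TCP connections on the loopback port; it does not verify that the listener is the LionScraper daemon. `port_is_open` is a hypothetical helper, not part of either package.

```python
import socket


def port_is_open(port: int = 13808, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("bridge port open:", port_is_open())
```

If this reports False while you expect the daemon to be running, check that PORT and the extension bridge port are the same value.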

CLI quick start

lionscraper daemon
lionscraper ping
lionscraper scrape -u https://www.example.com

Full flags, multiple URLs, pagination, and HTTP API details: packages/node/README.md / packages/python/README.md.
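If you want to drive the CLI from a script, one option is a thin subprocess wrapper. This is a sketch that uses only the `scrape -u` form shown above; `scrape_argv` and `run_scrape` are hypothetical helpers, and the format of the CLI's output is not documented here, so the text is returned unparsed.

```python
import shutil
import subprocess


def scrape_argv(url: str) -> list[str]:
    """Build the argv for `lionscraper scrape -u <url>`."""
    return ["lionscraper", "scrape", "-u", url]


def run_scrape(url: str) -> str:
    """Run the CLI and return whatever it writes to stdout, unparsed."""
    if shutil.which("lionscraper") is None:
        raise FileNotFoundError("lionscraper is not on PATH")
    result = subprocess.run(
        scrape_argv(url), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain characters such as & or ?.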

Add MCP in your AI app

Examples assume lionscraper-mcp is on your PATH (from npm or pip). In MCP JSON, every env value is a string.

Minimal config (PORT defaults to 13808; must match the extension bridge port):

{
  "mcpServers": {
    "lionscraper": {
      "command": "lionscraper-mcp"
    }
  }
}

Full env example (omit keys you do not need):

{
  "mcpServers": {
    "lionscraper": {
      "command": "lionscraper-mcp",
      "env": {
        "PORT": "13808",
        "TIMEOUT": "120000",
        "LANG": "en-US",
        "TOKEN": "",
        "DAEMON": ""
      }
    }
  }
}

npx (no global install): requires Node.js; the first run may download the package. The npm package name is lionscraper and the executable is lionscraper-mcp, so set command to npx and pass -y, lionscraper, and lionscraper-mcp in args, in that order.

Minimal config (npx):

{
  "mcpServers": {
    "lionscraper": {
      "command": "npx",
      "args": ["-y", "lionscraper", "lionscraper-mcp"]
    }
  }
}

Full env example (npx):

{
  "mcpServers": {
    "lionscraper": {
      "command": "npx",
      "args": ["-y", "lionscraper", "lionscraper-mcp"],
      "env": {
        "PORT": "13808",
        "TIMEOUT": "120000",
        "LANG": "en-US",
        "TOKEN": "",
        "DAEMON": ""
      }
    }
  }
}

To pin a version, use e.g. "lionscraper@1.0.1" in place of "lionscraper" inside args.
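The configs above can also be generated rather than hand-edited. The following sketch builds the same JSON shapes, converts every env value to a string, and supports the npx form with an optional pinned version. `mcp_config` is a hypothetical helper, not something the packages ship.

```python
import json


def mcp_config(env=None, use_npx=False, version=None):
    """Return an mcpServers config dict for lionscraper-mcp."""
    if use_npx:
        # Pin the npm package when a version is given, e.g. lionscraper@1.0.1.
        package = f"lionscraper@{version}" if version else "lionscraper"
        server = {"command": "npx", "args": ["-y", package, "lionscraper-mcp"]}
    else:
        server = {"command": "lionscraper-mcp"}
    if env:
        # MCP JSON expects every env value to be a string.
        server["env"] = {key: str(value) for key, value in env.items()}
    return {"mcpServers": {"lionscraper": server}}


config = mcp_config({"PORT": 13808, "TIMEOUT": 120000}, use_npx=True, version="1.0.1")
print(json.dumps(config, indent=2))
```

Paste the printed JSON into your AI app's MCP settings; where that file lives depends on the app.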

  • PORT: HTTP + WebSocket listen port; default 13808; must match the extension bridge port.
  • TIMEOUT: ms to wait for a previous instance to release the port; default 120000; 0 forces takeover quickly.
  • LANG: tool descriptions and stderr language (en-US, zh-CN, or POSIX forms).
  • TOKEN: Bearer token shared with the daemon; empty means no auth.
  • DAEMON: set to 0 to stop the thin MCP server from auto-starting lionscraper daemon; any other value, including empty, leaves auto-start enabled.

Restart MCP or the host app after changing config.
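For scripts that talk to the daemon's HTTP API directly, PORT and TOKEN apply in the same way. The sketch below builds a request without sending it. Several parts are assumptions: the path is a placeholder supplied by the caller (the real routes are in the package READMEs), GET is assumed as the method, and the token is assumed to travel in a standard Authorization: Bearer header. `daemon_request` is a hypothetical helper.

```python
import urllib.request


def daemon_request(path: str, port: int = 13808, token: str = "") -> urllib.request.Request:
    """Build (but do not send) a GET request to the local daemon."""
    url = f"http://127.0.0.1:{port}{path}"
    request = urllib.request.Request(url, method="GET")
    request.add_header("Accept", "application/json")
    if token:
        # Assumed header format; an empty token means no auth header at all.
        request.add_header("Authorization", f"Bearer {token}")
    return request


# "/v1/placeholder" is NOT a real route; substitute one from the package README.
example = daemon_request("/v1/placeholder", token="secret")
print(example.full_url)
```

To send it, pass the request to urllib.request.urlopen while the daemon is running.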

Python: MCP via python -m

{
  "mcpServers": {
    "lionscraper": {
      "command": "python",
      "args": ["-m", "lionscraper"]
    }
  }
}

Use the same python you used to install the package (or python3 on some systems).
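One way to avoid an interpreter mismatch is to generate the config from the interpreter that has the package installed, using its absolute path as the command. A minimal sketch, where `python_mcp_config` is a hypothetical helper:

```python
import json
import sys


def python_mcp_config() -> dict:
    """Config that launches MCP with the interpreter running this script."""
    return {
        "mcpServers": {
            "lionscraper": {
                # Absolute path of the current interpreter (e.g. a venv's python).
                "command": sys.executable,
                "args": ["-m", "lionscraper"],
            }
        }
    }


print(json.dumps(python_mcp_config(), indent=2))
```

Run it inside the virtual environment where lionscraper is installed, so the printed command points at that environment's Python.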

Match the port in the browser extension

  1. Open LionScraper settings / options.
  2. Set bridge port to the same value as PORT (e.g. 13808).
  3. If needed, use Reconnect, reload the extension, or restart the browser.

Day-to-day use

  1. Keep the extension enabled and target pages open as required.
  2. Ask in natural language (e.g. check connection, scrape lists / article / emails / phones / links / images).
  3. If you see “not connected” or timeouts, retry a connection check and confirm PORT matches.

FAQ

Extension not connected or scrape fails?

  • Is the extension enabled?
  • Does PORT in the AI app match the extension bridge port exactly?
  • One bridge per machine is usually enough; duplicate MCP configs can conflict.

Does seeing MCP tools in the client mean everything works?

Not necessarily. Visible tools only prove that the AI app can reach the bridge; the extension must also be connected to the bridge on the same port.

MCP Registry and directories

Official MCP Registry entries (both use server.json):

| Path | Registry name | Package |
| --- | --- | --- |
| packages/node/server.json | io.github.dowant/lionscraper-node | npm: lionscraper (mcpName in package.json) |
| packages/python/server.json | io.github.dowant/lionscraper-python | PyPI: lionscraper (mcp-name comment in English README.md) |

Publish outline (install the official mcp-publisher CLI first; see the MCP Registry Quickstart):

  1. Publish npm / PyPI at the version in each server.json.
  2. In packages/node: mcp-publisher login github, then mcp-publisher publish.
  3. In packages/python: mcp-publisher publish (login reused).

Third-party listings (e.g. Glama) have their own rules; Smithery targets public HTTPS/streaming setups rather than local stdio + npm/pip by default.

Third-party directory (Glama)

This project is listed on Glama. If the page shows "cannot be installed" or "license not found", typical fixes are:

  • Add a root LICENSE (this repo includes one).
  • Add glama.json with the maintainers' GitHub usernames, which is needed for org-owned repos; edit maintainers if the claim fails.
  • Claim the server on Glama.
  • Optionally complete Glama's Docker / release flow if you need their install and security/quality checks.

The official install remains npm install -g lionscraper and pip install -U lionscraper. See also Glama's score / checklist page.

License

MIT (same as the npm and PyPI packages).
