Chrome API MCP Server

Created by needsupport, 9 months ago
A Chrome browser control MCP server for AI assistants to directly interact with and control Chrome via Chrome DevTools Protocol

A Chrome API MCP (Model Context Protocol) server that provides semantic understanding of web pages for AI assistants like Claude, enabling DOM-based browsing without relying on screenshots.

Features

  • Semantic DOM Analysis: Build structured representations of web pages
  • Efficient Browsing: Provides content extraction without relying on screenshots
  • Interactive Navigation: Identify and interact with elements based on semantics
  • Reliable Element Selection: Multiple strategies for finding and interacting with page elements
  • Cache Optimization: TTL-based cache with a size limit (defaults: 10 seconds, 200 entries)
  • Error Handling: Centralized error management for reliable operation
  • Detailed Logging: Leveled logs (ERROR through TRACE) with timestamps and request IDs for debugging

Requirements

  • Node.js 16+
  • Chrome browser (must be installed)
  • npm or yarn

Quick Start

  1. Make the startup script executable:

    chmod +x start-custom-mcp.sh
    
  2. Start the server:

    ./start-custom-mcp.sh
    

    This script will:

    • Check if Chrome is running with remote debugging enabled
    • Start Chrome with the correct flags if needed
    • Build the TypeScript code
    • Start the MCP server on port 3001
  3. Run the example to test functionality:

    npx ts-node examples/analyze-page.ts https://example.com
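
The first thing the startup script does is check whether Chrome is running with remote debugging enabled. How the script performs that check is not shown here; the sketch below is one way to do it from TypeScript, using the /json/version endpoint that Chrome serves on its remote debugging port. The function names and the injected getJson parameter are illustrative, not part of this project.

```typescript
// Extracts the WebSocket debugger URL from the body of GET /json/version.
// Returns null when the body does not look like a DevTools response.
function debuggerUrlFromVersionInfo(body: unknown): string | null {
  if (typeof body !== 'object' || body === null) return null;
  const url = (body as { webSocketDebuggerUrl?: unknown }).webSocketDebuggerUrl;
  return typeof url === 'string' && url.length > 0 ? url : null;
}

// Minimal fetch-like signature, injected so the probe can run without a browser.
type GetJson = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Resolves to the debugger URL, or null if nothing usable answers on the port.
async function probeChromeDebugger(getJson: GetJson, port: number = 9222): Promise<string | null> {
  try {
    const res = await getJson(`http://localhost:${port}/json/version`);
    return res.ok ? debuggerUrlFromVersionInfo(await res.json()) : null;
  } catch {
    // Connection refused: Chrome is not running, or was started without the flag.
    return null;
  }
}

// Usage on Node 18+, where fetch is a global:
//   const url = await probeChromeDebugger(fetch);
//   if (url === null) console.error('Start Chrome with --remote-debugging-port=9222');
```

A null result usually means Chrome needs to be restarted with the debugging flag, which is what the script's second step takes care of.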
    

API Methods

Basic Methods

  • initialize: Initialize the connection
  • navigate: Open a URL in a new tab
  • getContent: Get the raw HTML content of a page
  • executeScript: Execute JavaScript code in a tab
  • clickElement: Click on an element matching a CSS selector
  • takeScreenshot: Capture a screenshot (optional)
  • closeTab: Close a tab

Semantic Understanding Methods

  • getStructuredContent: Get a structured representation of the page content
  • analyzePageSemantics: Analyze the page and build a semantic DOM model
  • findElementsByText: Find elements containing specific text
  • findClickableElements: Find all interactive elements on the page
  • clickSemanticElement: Click an element by its semantic ID
  • fillFormField: Fill a form field by its semantic ID
  • performSearch: Use the page's search functionality

Example Usage

// Navigate to a page; this opens a new tab and returns its tabId
const response1 = await fetch('http://localhost:3001', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'navigate',
    params: { url: 'https://example.com' },
    id: 1
  })
});
const { result: { tabId } } = await response1.json();

// Get structured content
const response2 = await fetch('http://localhost:3001', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'getStructuredContent',
    params: { tabId },
    id: 2
  })
});
const { result: { content } } = await response2.json();
console.log(content);

// Find and click a button
const response3 = await fetch('http://localhost:3001', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'findElementsByText',
    params: { tabId, text: 'Login' },
    id: 3
  })
});
const { result: { elements } } = await response3.json();

if (elements.length > 0) {
  await fetch('http://localhost:3001', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      jsonrpc: '2.0',
      method: 'clickSemanticElement',
      params: { tabId, semanticId: elements[0].semanticId },
      id: 4
    })
  });
}
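
The calls above repeat the same JSON-RPC envelope each time. A small helper can remove that repetition; the following is a minimal sketch, not code from this repository. The names buildRpcRequest, unwrapRpcResponse, and callMethod are hypothetical, and the error shape is the standard JSON-RPC 2.0 one, which this server is assumed (not confirmed) to follow.

```typescript
interface RpcRequest {
  jsonrpc: '2.0';
  method: string;
  params: Record<string, unknown>;
  id: number;
}

interface RpcResponseBody<T> {
  result?: T;
  error?: { code: number; message: string };
}

let nextRequestId = 1;

// Builds the envelope with an auto-incrementing id.
function buildRpcRequest(method: string, params: Record<string, unknown> = {}): RpcRequest {
  return { jsonrpc: '2.0', method, params, id: nextRequestId++ };
}

// Returns `result`, or throws if the response carries an `error` member.
function unwrapRpcResponse<T>(body: RpcResponseBody<T>): T {
  if (body.error) {
    throw new Error(`RPC error ${body.error.code}: ${body.error.message}`);
  }
  if (body.result === undefined) {
    throw new Error('RPC response has neither result nor error');
  }
  return body.result;
}

// fetch-compatible signature, injected so the helper is testable offline.
type PostJson = (
  url: string,
  init: { method: 'POST'; headers: Record<string, string>; body: string },
) => Promise<{ json(): Promise<unknown> }>;

async function callMethod<T>(
  post: PostJson,
  method: string,
  params: Record<string, unknown> = {},
  endpoint: string = 'http://localhost:3001',
): Promise<T> {
  const response = await post(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildRpcRequest(method, params)),
  });
  return unwrapRpcResponse<T>((await response.json()) as RpcResponseBody<T>);
}

// Usage on Node 18+:
//   const { tabId } = await callMethod<{ tabId: string }>(fetch, 'navigate', { url: 'https://example.com' });
```

Throwing on an error member also avoids the destructuring failure that the longer example would hit if a call returned no result.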

Configuration

  • Chrome debugging port: 9222 (default)
  • MCP server port: 3001 (configurable via PORT environment variable)
  • Debug mode: Set environment variable DEBUG=true to enable verbose logging
  • Cache settings: Configured in config.ts
    • Default TTL: 10 seconds
    • Max cache size: 200 entries
  • Connection timeouts: Configurable in config.ts
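
To make the two cache settings concrete, here is a minimal sketch of a cache with a TTL and an entry limit. It is not the implementation in this repository: the class name TtlCache, the oldest-first eviction policy, and the injectable clock are assumptions chosen for illustration. Only the defaults (10 seconds, 200 entries) come from the settings listed above.

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private maxEntries: number;
  private now: () => number;

  constructor(ttlMs: number = 10000, maxEntries: number = 200, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop it lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Deleting first moves a re-set key to the end of the Map's insertion order.
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With a short TTL like 10 seconds, a page analysis can be reused across a burst of related calls without serving a model of a page that has since changed for long.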

Debugging

Enable debug mode by setting the DEBUG environment variable:

DEBUG=true ./start-custom-mcp.sh

For more granular debugging of specific modules, use:

DEBUG=chrome-page-analyzer:* ./start-custom-mcp.sh
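
The two DEBUG forms above suggest a value that is either a global switch or a namespace pattern. How the server actually parses it is not documented here; the sketch below shows one plausible interpretation, modeled on the common convention of comma-separated namespaces with a trailing * wildcard. The function names and the chrome-page-analyzer:dom namespace are hypothetical.

```typescript
interface DebugSetting {
  enabled: boolean;
  patterns: string[];
}

// Interprets a DEBUG value: unset or "false" disables, "true" enables
// everything, and anything else is a comma-separated list of patterns.
function parseDebug(value: string | undefined): DebugSetting {
  if (!value || value === 'false') return { enabled: false, patterns: [] };
  if (value === 'true') return { enabled: true, patterns: ['*'] };
  const patterns = value
    .split(',')
    .map((pattern) => pattern.trim())
    .filter((pattern) => pattern.length > 0);
  return { enabled: patterns.length > 0, patterns };
}

// A pattern ending in "*" matches by prefix; any other pattern must match exactly.
function isNamespaceEnabled(setting: DebugSetting, namespace: string): boolean {
  return setting.patterns.some((pattern) =>
    pattern.endsWith('*') ? namespace.startsWith(pattern.slice(0, -1)) : namespace === pattern,
  );
}

// Example: with DEBUG=chrome-page-analyzer:*, a module logging under the
// hypothetical namespace "chrome-page-analyzer:dom" would be enabled.
```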

Log Files

The server creates log files in the logs directory with detailed information about all operations:

  • Log location: ./logs/chrome-mcp.log
  • Log rotation: Automatically rotates logs when they reach 10 MB (configurable)
  • Log levels: ERROR, WARN, INFO, DEBUG, TRACE
  • Log content: Timestamps, request IDs, method calls, parameters, and results
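
As an illustration of how the five levels relate, the sketch below treats them as ordered from least to most verbose and filters messages against a threshold. The line layout in formatLogLine is an assumption made for the example; the actual format written to chrome-mcp.log may differ.

```typescript
// Ordered from least to most verbose, matching the list above.
const LOG_LEVELS = ['ERROR', 'WARN', 'INFO', 'DEBUG', 'TRACE'] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// A message is written when its level is at or above the threshold's severity.
function shouldLog(messageLevel: LogLevel, threshold: LogLevel): boolean {
  return LOG_LEVELS.indexOf(messageLevel) <= LOG_LEVELS.indexOf(threshold);
}

// Hypothetical line layout: timestamp, level, request ID, method, JSON detail.
function formatLogLine(
  level: LogLevel,
  requestId: string,
  method: string,
  detail: unknown,
  timestamp: Date = new Date(),
): string {
  return `${timestamp.toISOString()} [${level}] [${requestId}] ${method} ${JSON.stringify(detail)}`;
}
```

With a threshold of INFO, for example, ERROR, WARN, and INFO messages pass and DEBUG and TRACE messages are dropped.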

You can follow the log in real time with:

tail -f logs/chrome-mcp.log

How It Works

  1. When a page is loaded, the server builds a semantic model of the page
  2. The model includes:
    • Semantic element types (navigation, button, link, form, content, etc.)
    • Text content and structure
    • Interactive elements
    • Hierarchical relationships
  3. AI systems can query this model to understand the page content and structure
  4. Actions can be performed through semantic references rather than coordinates

Because the model is structured text rather than pixels, the assistant does not need to capture and interpret screenshots, and it gets the page's structure and element roles directly.
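
To make the idea of a semantic model concrete, here is a minimal sketch of what such a tree and two queries over it could look like. Only the semanticId field name appears in the usage example; the SemanticNode shape, the other field names, and both functions are assumptions for illustration and may not match what analyzePageSemantics returns.

```typescript
type SemanticType = 'navigation' | 'button' | 'link' | 'form' | 'content';

interface SemanticNode {
  semanticId: string;
  type: SemanticType;
  text: string; // the node's own text, not including descendants
  interactive: boolean;
  children: SemanticNode[];
}

// Depth-first, case-insensitive search over each node's own text.
function findByText(node: SemanticNode, query: string, matches: SemanticNode[] = []): SemanticNode[] {
  if (node.text.toLowerCase().includes(query.toLowerCase())) matches.push(node);
  for (const child of node.children) findByText(child, query, matches);
  return matches;
}

// Collects every interactive node in document order.
function findClickable(node: SemanticNode, matches: SemanticNode[] = []): SemanticNode[] {
  if (node.interactive) matches.push(node);
  for (const child of node.children) findClickable(child, matches);
  return matches;
}

// A tiny hand-written model of a page with a nav link and a login form.
const examplePage: SemanticNode = {
  semanticId: 'root', type: 'content', text: '', interactive: false,
  children: [
    {
      semanticId: 'nav-main', type: 'navigation', text: '', interactive: false,
      children: [
        { semanticId: 'link-home', type: 'link', text: 'Home', interactive: true, children: [] },
      ],
    },
    {
      semanticId: 'form-login', type: 'form', text: '', interactive: false,
      children: [
        { semanticId: 'btn-login', type: 'button', text: 'Login', interactive: true, children: [] },
      ],
    },
  ],
};
```

A client would then act on the returned semanticId values, which is the role clickSemanticElement and fillFormField play in the API.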

Architecture

The server is built with a modular architecture:

  • ChromeAPI: Main API class exposing methods to clients
  • DOMInteractionLayer: Core DOM interaction functionality
  • SemanticAnalyzer: Semantic understanding of page structure
  • ContentExtractor: Extract structured content from pages
  • Error Handler: Centralized error management
  • DOM Helpers: Utility functions for DOM manipulation
  • Cache: TTL-based cache with a size limit, configured in config.ts
  • Logger: Leveled file logging with rotation, used for debugging
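
One way to picture how these modules meet at the server boundary is a dispatch table that maps JSON-RPC method names to handlers. The sketch below is an assumption about the general shape, not the repository's code: the Handler type, the dispatch function, and the use of a Map are all illustrative. The error codes are the standard JSON-RPC 2.0 ones for "method not found" and "internal error".

```typescript
type Handler = (params: Record<string, unknown>) => Promise<unknown>;

interface IncomingRequest {
  jsonrpc: '2.0';
  method: string;
  params?: Record<string, unknown>;
  id: number;
}

type OutgoingResponse =
  | { jsonrpc: '2.0'; id: number; result: unknown }
  | { jsonrpc: '2.0'; id: number; error: { code: number; message: string } };

// Looks up the handler for the requested method and wraps its outcome
// in a JSON-RPC 2.0 response. Handler failures become error responses.
async function dispatch(handlers: Map<string, Handler>, request: IncomingRequest): Promise<OutgoingResponse> {
  const handler = handlers.get(request.method);
  if (!handler) {
    return {
      jsonrpc: '2.0',
      id: request.id,
      error: { code: -32601, message: `Method not found: ${request.method}` },
    };
  }
  try {
    const result = await handler(request.params ?? {});
    return { jsonrpc: '2.0', id: request.id, result };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { jsonrpc: '2.0', id: request.id, error: { code: -32603, message } };
  }
}
```

In this picture, a class like ChromeAPI would register one handler per method (navigate, getStructuredContent, and so on), and centralized error handling lives in the single catch block.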

Development

  1. Install dependencies:

    npm install
    
  2. Build TypeScript code:

    npm run build
    
  3. Run the server with debugging:

    DEBUG=true node custom-mcp-server.js
    
  4. Run tests:

    npm test
    

Troubleshooting

  • Chrome connection issues: Make sure Chrome is running with the remote debugging port open. You can start it manually with google-chrome --remote-debugging-port=9222.
  • Port conflicts: If port 3001 is already in use, set a different port with PORT=3002 ./start-custom-mcp.sh.
  • TypeScript build errors: Run npm run build and fix the type errors it reports before starting the server.
  • Element interaction failures: If clicking elements fails, the server attempts multiple strategies (mouse events, JavaScript). Check the debug logs for details.
  • Memory issues: If you encounter memory problems, adjust the cache settings in config.ts.
  • Log file access: If you encounter permission issues with log files, make sure the user running the server has write access to the logs directory.
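
The "multiple strategies" behavior mentioned for element interaction can be pictured as a fallback chain: try each approach in order and stop at the first that succeeds. The sketch below illustrates that pattern only. The ClickStrategy type, the clickWithFallback function, and the strategy names are hypothetical, and the real server's strategies and ordering may differ.

```typescript
interface ClickStrategy {
  name: string;
  // Resolves to true if the element was clicked, false if this strategy could not click it.
  attempt: (selector: string) => Promise<boolean>;
}

interface ClickOutcome {
  clickedBy: string | null; // name of the strategy that succeeded, or null if all failed
  failures: string[];       // one entry per failed strategy, suitable for debug logs
}

// Tries each strategy in order and returns as soon as one succeeds.
async function clickWithFallback(selector: string, strategies: ClickStrategy[]): Promise<ClickOutcome> {
  const failures: string[] = [];
  for (const strategy of strategies) {
    try {
      if (await strategy.attempt(selector)) {
        return { clickedBy: strategy.name, failures };
      }
      failures.push(`${strategy.name}: element not clicked`);
    } catch (err) {
      failures.push(`${strategy.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return { clickedBy: null, failures };
}
```

Recording why each strategy failed is what makes the debug logs useful when a click does not go through.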

License

MIT
