MCP Browser Automation

Created by algonius, 6 months ago
MCP browser automation server. Exposes browser control tools to external AI systems via Model Context Protocol. Open-source & secure.

🌐 Overview

Algonius Browser is an open-source MCP (Model Context Protocol) server that provides browser automation capabilities to external AI systems. It exposes browser control tools over MCP, so AI assistants and other clients can navigate websites, interact with DOM elements, and extract web content programmatically.

🎯 Key Features

  • MCP Protocol Integration: Standard interface for AI systems to control browser automation
  • Chrome Extension: Background service worker that handles browser interactions
  • Native Messaging: Go-based MCP host that bridges Chrome extension with external tools
  • Comprehensive Tool Set: 6 browser automation tools + 2 MCP resources
  • Type Safety: Full TypeScript implementation with structured error handling
  • Testing Suite: Comprehensive integration tests for all functionality

🛠️ Available MCP Tools

Navigation and Tabs

  • navigate_to: Navigate to URLs with configurable timeout handling
  • manage_tabs: Create, close, and switch between browser tabs

DOM Interaction

  • get_dom_extra_elements: Advanced DOM element extraction with pagination and filtering
  • click_element: Click DOM elements using CSS selectors or text matching
  • set_value: Set values in input fields, textareas, and form elements
  • scroll_page: Scroll pages up or down with customizable distances

📋 Available MCP Resources

Browser State Resources

  • browser://current/state: Complete current browser state in AI-friendly Markdown format

    • Active tab information
    • All browser tabs with URLs, titles, and status
    • Real-time state updates via resource notifications
  • browser://dom/state: Current DOM state overview in Markdown format

    • Page metadata (URL, title, scroll position)
    • First 20 interactive elements
    • Total element count with "more available" indicators
    • Simplified DOM structure
    • Auto-updates when page changes
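
The "resource notifications" mentioned for these resources correspond to MCP's subscription flow: a client sends `resources/subscribe` for a URI and later receives `notifications/resources/updated` messages naming that URI. The following is a minimal sketch, assuming the server accepts subscriptions for the two `browser://` URIs. The helper names `subscribeRequest` and `updatedUri` are illustrative and not part of Algonius Browser.

```typescript
// JSON-RPC 2.0 request asking the server to notify us when a resource changes.
interface SubscribeRequest {
  jsonrpc: "2.0";
  id: number;
  method: "resources/subscribe";
  params: { uri: string };
}

// Build a subscribe request for one resource URI (hypothetical helper name).
function subscribeRequest(id: number, uri: string): SubscribeRequest {
  return { jsonrpc: "2.0", id, method: "resources/subscribe", params: { uri } };
}

// Return the URI named by a "resource updated" notification,
// or null if the message is anything else.
function updatedUri(message: unknown): string | null {
  if (typeof message !== "object" || message === null) return null;
  const m = message as { method?: unknown; params?: { uri?: unknown } | null };
  if (m.method !== "notifications/resources/updated") return null;
  const uri = m.params?.uri;
  return typeof uri === "string" ? uri : null;
}
```

When `updatedUri` returns a URI, a client would typically re-read that resource with `resources/read` to get the new state.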

🚀 Quick Start

1. Install MCP Host

One-Click Installation (Recommended):

curl -fsSL https://raw.githubusercontent.com/algonius/algonius-browser/master/install-mcp-host.sh | bash

Manual Installation:

# Download latest release
wget https://github.com/algonius/algonius-browser/releases/latest/download/mcp-host-linux-x86_64.tar.gz

# Extract and install
tar -xzf mcp-host-linux-x86_64.tar.gz
cd mcp-host-linux-x86_64
./install.sh

2. Install Chrome Extension

From Source:

# Clone and build
git clone https://github.com/algonius/algonius-browser.git
cd algonius-browser
pnpm install
pnpm build

# Load in Chrome
# 1. Open chrome://extensions/
# 2. Enable "Developer mode"
# 3. Click "Load unpacked"
# 4. Select the 'dist' folder

3. Start MCP Host

# Test the installation
mcp-host-go --version

# The MCP host will be automatically started when needed by the Chrome extension

🔧 Integration Examples

Using with AI Assistants

Once installed, AI systems can use the browser automation tools and resources through the MCP protocol:

Tool Usage:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "navigate_to",
    "arguments": {
      "url": "https://example.com",
      "timeout": 30000
    }
  }
}
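
On the wire, an MCP tool call is a JSON-RPC 2.0 request, so it carries a `jsonrpc` version and an `id` used to match the response. A small typed builder can keep that envelope consistent across calls. This is a sketch only: `toolCall` and `nextId` are hypothetical names, and the `url` and `timeout` arguments are the ones shown in the `navigate_to` example.

```typescript
// Shape of an MCP tools/call request as a JSON-RPC 2.0 message.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Monotonic id source so each request can be matched to its response.
let nextId = 1;

// Wrap a tool name and its arguments in the JSON-RPC envelope.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The navigate_to call from the example, serialized for sending.
const request = toolCall("navigate_to", { url: "https://example.com", timeout: 30000 });
const wire = JSON.stringify(request);
```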

Resource Access:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": {
    "uri": "browser://current/state"
  }
}
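
A `resources/read` response returns a `contents` array in which text resources carry a `text` field. The sketch below builds the request and pulls the Markdown text out of a response. `readResource` and `resourceText` are illustrative names, and the `sample` response is hand-written for the example rather than captured from a running server.

```typescript
// One entry of a resources/read result: text resources carry `text`,
// binary resources carry a base64 `blob`.
interface ResourceContent {
  uri: string;
  mimeType?: string;
  text?: string;
  blob?: string;
}

interface ReadResourceResponse {
  jsonrpc: "2.0";
  id: number;
  result: { contents: ResourceContent[] };
}

// Build the read request for a resource URI.
function readResource(id: number, uri: string) {
  return { jsonrpc: "2.0" as const, id, method: "resources/read" as const, params: { uri } };
}

// Join the non-empty text parts of a response into one string.
function resourceText(response: ReadResourceResponse): string {
  return response.result.contents
    .map((c) => c.text ?? "")
    .filter((t) => t.length > 0)
    .join("\n");
}

// Hand-written sample response, for illustration only.
const sample: ReadResourceResponse = {
  jsonrpc: "2.0",
  id: 2,
  result: {
    contents: [{ uri: "browser://current/state", mimeType: "text/markdown", text: "# Browser State" }],
  },
};
```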

Common Workflows

Web Scraping:

  1. navigate_to → Navigate to target site
  2. Read browser://dom/state → Get page overview
  3. get_dom_extra_elements → Get specific elements with pagination
  4. click_element → Interact with elements
  5. Read browser://dom/state → Extract updated content

Form Automation:

  1. navigate_to → Go to form page
  2. Read browser://dom/state → Identify form elements
  3. set_value → Fill form fields
  4. click_element → Submit form
  5. Read browser://current/state → Verify completion

Multi-Tab Management:

  1. Read browser://current/state → Check current tabs
  2. manage_tabs → Create/switch tabs
  3. navigate_to → Load content in each tab
  4. Read browser://current/state → Monitor all tab states

Page Navigation with Scrolling:

  1. navigate_to → Go to target page
  2. Read browser://dom/state → Get initial page state
  3. scroll_page → Scroll to load more content
  4. get_dom_extra_elements → Extract newly loaded elements
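
Each workflow above is an ordered list of tool calls and resource reads, so it can be written down as data and turned into requests. The sketch below does that for the two opening steps that the scraping, form, and scrolling workflows share. `Step` and `toRequests` are hypothetical names. Only the `navigate_to` arguments (`url`, `timeout`) come from this README, and arguments for the other tools are left to the caller because their schemas are not shown here.

```typescript
// A workflow step is either a tool call or a resource read.
type Step =
  | { kind: "tool"; name: string; args: Record<string, unknown> }
  | { kind: "resource"; uri: string };

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Turn steps into JSON-RPC requests, preserving order and numbering ids sequentially.
function toRequests(steps: Step[], firstId: number = 1): JsonRpcRequest[] {
  return steps.map((step, i): JsonRpcRequest => {
    if (step.kind === "tool") {
      return {
        jsonrpc: "2.0",
        id: firstId + i,
        method: "tools/call",
        params: { name: step.name, arguments: step.args },
      };
    }
    return { jsonrpc: "2.0", id: firstId + i, method: "resources/read", params: { uri: step.uri } };
  });
}

// Opening steps shared by several workflows: load the page, then read the DOM overview.
const openPage: Step[] = [
  { kind: "tool", name: "navigate_to", args: { url: "https://example.com", timeout: 30000 } },
  { kind: "resource", uri: "browser://dom/state" },
];
const plan = toRequests(openPage);
```

A client would send these requests one at a time and wait for each response before sending the next, since every step depends on the page state left by the previous one.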

🏗️ Architecture

External AI System
       ↓ (MCP Protocol)
   MCP Host (Go)
       ↓ (Native Messaging)
Chrome Extension
       ↓ (Chrome APIs)
    Browser Tabs
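
The Native Messaging hop in this diagram uses Chrome's standard framing: each message is UTF-8 JSON preceded by a 32-bit length in native byte order. The sketch below shows that framing only. It assumes little-endian byte order (true on x86_64 and arm64), the function names are illustrative, and the payload format exchanged between the Algonius extension and host is not documented here, so any message shape passed in is hypothetical.

```typescript
// Encode one message: 4-byte length prefix followed by the UTF-8 JSON body.
// Little-endian is assumed here; Chrome specifies the platform's native byte order.
function encodeFrame(message: unknown): Uint8Array {
  const body = new TextEncoder().encode(JSON.stringify(message));
  const frame = new Uint8Array(4 + body.length);
  new DataView(frame.buffer).setUint32(0, body.length, true);
  frame.set(body, 4);
  return frame;
}

// Decode one frame. Returns undefined if the buffer does not yet hold the
// whole message, which lets a reader keep accumulating bytes from stdin.
function decodeFrame(frame: Uint8Array): unknown {
  if (frame.length < 4) return undefined;
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const length = view.getUint32(0, true);
  if (frame.length < 4 + length) return undefined;
  return JSON.parse(new TextDecoder().decode(frame.subarray(4, 4 + length)));
}
```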

Components

  • MCP Host: Go-based native messaging host that implements the MCP protocol
  • Chrome Extension: Background service worker with tool handlers
  • Content Scripts: DOM interaction and data extraction utilities
  • Integration Tests: Comprehensive test suite for all tools

🧪 Development

Build from Source

Prerequisites:

  • Node.js 22.12.0+
  • pnpm 9.15.1+
  • Go 1.21+ (for MCP host)

Build Extension:

pnpm install
pnpm build

Build MCP Host:

cd mcp-host-go
make build

Run Tests:

# Extension tests
pnpm test

# MCP host tests
cd mcp-host-go
make test

Development Mode

# Extension development
pnpm dev

# MCP host development
cd mcp-host-go
make dev

📊 Supported Platforms

MCP Host:

  • Linux x86_64
  • macOS Intel (x86_64) and Apple Silicon (arm64)
  • Windows x86_64

Chrome Extension:

  • Chrome/Chromium 88+
  • Microsoft Edge 88+

📚 Documentation

Detailed documentation is available in the docs/ directory.

🤝 Contributing

We welcome contributions! Check out our CONTRIBUTING.md for guidelines.

Ways to contribute:

  • Report bugs and feature requests
  • Submit pull requests for improvements
  • Add integration tests
  • Improve documentation
  • Share usage examples

🔒 Security

For security vulnerabilities, please create a GitHub Security Advisory rather than opening a public issue.

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Made with ❤️ by the Algonius Browser Team

Give us a star 🌟 if this project helps you build better browser automation!
