
Selenium Mcp

Created By
amandeep-sg21 days ago

Introduction

This server is implemented in Python to bridge the gap between AI assistants (or custom MCP clients) and Selenium WebDriver. It exposes Selenium WebDriver functionality as MCP tools, allowing AI assistants and MCP clients to use them for web automation, web testing, or web scraping.

Release Notes

Version 2.0.1 - Release 11 April 2026

This version makes minor enhancements to error codes, return types, and the README file.

  1. Return type changed from Union[None, dict], int, bool to str in
    1. cookies.py
    2. find.py
    3. get.py
    4. input.py
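
A minimal sketch of what returning str everywhere could look like, assuming a small conversion helper. The name `to_tool_result` and the "OK" placeholder for None are illustrative assumptions, not taken from the project's code:

```python
import json
from typing import Any


def to_tool_result(value: Any) -> str:
    """Convert a tool's raw result to a string (hypothetical helper)."""
    if value is None:
        # Assumption: a short acknowledgement is more useful to an LLM than "null".
        return "OK"
    if isinstance(value, str):
        return value
    # json.dumps renders bool as true/false, int as digits, dict as an object.
    # default=str keeps values that are not JSON-serializable from raising.
    return json.dumps(value, default=str)


print(to_tool_result({"name": "session", "value": "abc"}))
print(to_tool_result(True))
print(to_tool_result(3))
print(to_tool_result(None))
```

A single string return type gives the MCP client one shape to handle, whatever the underlying Selenium call produced.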

Version 2.0.0 - Release 4 April 2026

In this version, we made some structural changes, such as separating functions into save and get. Save now focuses only on saving files to disk, while get covers the cases where the LLM wants to retrieve data from the browser.

The enhancements are listed below:

  1. Get: To get a webpage as markdown, HTML, a screenshot, an element's screenshot, or a list of URLs
  2. JS Executor: To execute JavaScript code for interacting with the webpage
  3. Files: To upload and download files
  4. Alerts: To handle alerts
  5. Click: Added drag and drop of elements by XPath
  6. File: To save a webpage as a PDF on disk
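
The get/save split described in the 2.0.0 notes might look roughly like the sketch below. It assumes a driver object exposing `page_source` (a real Selenium WebDriver attribute). `StubDriver`, `get_html`, and `save_html` are hypothetical names used only for illustration:

```python
import tempfile
from pathlib import Path


class StubDriver:
    """Stand-in for a Selenium WebDriver; only page_source is modelled."""
    page_source = "<html><body><h1>Hello</h1></body></html>"


def get_html(driver) -> str:
    """'get' style: return the page content to the caller (the LLM)."""
    return driver.page_source


def save_html(driver, directory: str, filename: str = "page.html") -> str:
    """'save' style: write the page to disk and return only a short status."""
    target = Path(directory) / filename
    target.write_text(driver.page_source, encoding="utf-8")
    return f"Saved {target.stat().st_size} bytes to {target}"


driver = StubDriver()
print(get_html(driver))
with tempfile.TemporaryDirectory() as tmp:
    print(save_html(driver, tmp))
```

Keeping the two apart means a save call never pushes a large page into the model's context, while a get call never touches the disk.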

Version 1.0.0 - Release 31 March 2026

  1. Web Driver: Create new or quit existing WebDriver sessions
  2. Cookies: To manage cookies (add, delete, get, clear)
  3. Clicks: To perform clicks on elements (left click, right click, double click)
  4. Browser: To navigate to URLs and manage browser capabilities like resize, maximize, minimize, fullscreen, etc.
  5. Scroll: To scroll the entire webpage
  6. Input: Input text into elements and select/unselect checkboxes, radio buttons, dropdown options, etc.
  7. Find: To find elements by XPath

Key Features

  1. Humanised error handling: Enables the LLM to interpret errors and reconfigure its tool usage accordingly
  2. Comprehensive element interaction: Clicks, input, and select are performed after checking that the element is visible, enabled, clickable, etc.
  3. Full navigation control: Open a new URL, go forward, go backward, refresh, etc.

The tool leverages the following technologies:
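
The first two features can be sketched together as below, under stated assumptions: `readable_errors`, `safe_click`, `ElementNotReady`, and `StubElement` are hypothetical names. Only `is_displayed()`, `is_enabled()`, and `click()` correspond to real Selenium WebElement methods. The project's actual checks and messages may differ.

```python
from functools import wraps


class ElementNotReady(Exception):
    """Hypothetical error raised when a pre-click check fails."""


def readable_errors(tool):
    """Turn exceptions into a plain-language string the LLM can act on."""
    @wraps(tool)
    def wrapper(*args, **kwargs):
        try:
            return tool(*args, **kwargs)
        except ElementNotReady as exc:
            return f"Error: {exc}. Suggestion: scroll to the element or wait, then retry."
        except Exception as exc:  # anything unexpected still comes back as text
            return f"Error: {type(exc).__name__}: {exc}"
    return wrapper


@readable_errors
def safe_click(element) -> str:
    """Click only if the element is visible and enabled."""
    if not element.is_displayed():
        raise ElementNotReady("element is not visible")
    if not element.is_enabled():
        raise ElementNotReady("element is disabled")
    element.click()
    return "Clicked the element."


class StubElement:
    """Test double that mimics the three WebElement methods used above."""
    def __init__(self, displayed=True, enabled=True):
        self.displayed, self.enabled, self.clicked = displayed, enabled, False

    def is_displayed(self):
        return self.displayed

    def is_enabled(self):
        return self.enabled

    def click(self):
        self.clicked = True


print(safe_click(StubElement()))
print(safe_click(StubElement(displayed=False)))
```

Returning the failure as text, with a suggested next step, lets the model adjust its next tool call instead of stopping on a raw stack trace.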

  1. FastMCP: For MCP server implementation
  2. Selenium: For web automation
  3. Google GenAI: For the AI assistant

Upcoming

The following features will be added in the future:

  1. Tools to support Chrome Dev Tools & BiDi
  2. Enhance save functionality to save files in different formats
  3. Enhanced error handling
  4. Multi-browser support (Firefox, Edge, Safari, etc.)

Example

Prompt: "Open https://rfpnotification.com and join the waiting list by entering the email address: [test_user@example.com]"

[Images: browser before and after running the prompt]

After running the script, the browser took a screenshot to check whether the email was entered successfully.

Test In Action

[Animation: test in action]

Dev Setup

Clone the repository

git clone {url}

Create virtual environment

python3 -m venv venv
source venv/bin/activate

Install dependencies

pip install -r requirements.txt

Run the server

python server.py

The package comes with a lightweight MCP client that uses the Google GenAI SDK to test the server. It is implemented in the server.py file. To use it, you need a Google GenAI API key; set it in the .env file as GEMINI_API_KEY={your_api_key}.
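
As a rough illustration of how a client could pick up that key, here is a stdlib-only sketch. The helpers `load_env_file` and `get_gemini_key` are hypothetical; the project itself may rely on a library such as python-dotenv instead.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse KEY=VALUE lines from a .env file (minimal, stdlib-only sketch)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to the .env file.")
    return key


# Demonstration against a throwaway .env file with a placeholder value.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("GEMINI_API_KEY=placeholder-key\n", encoding="utf-8")
    print(load_env_file(str(env_file)))
```

Failing early with a clear message when the key is missing is usually easier to debug than an authentication error from the API later on.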

Run the client

python server.py

Architecture

The MCP server has a very simple architecture, as shown below. If it is hosted locally, it communicates via stdio; if it is hosted in the cloud, it communicates via HTTP.

[Figure: architecture diagram]
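
One way the stdio/HTTP choice could be made at startup is sketched below. The `MCP_TRANSPORT` environment variable and the `choose_transport` function are assumptions for illustration. The project may select its transport differently.

```python
import os


def choose_transport(env=None) -> str:
    """Pick a transport name from a hypothetical MCP_TRANSPORT variable."""
    env = os.environ if env is None else env
    requested = env.get("MCP_TRANSPORT", "stdio").strip().lower()
    if requested in ("http", "streamable-http"):
        # Cloud hosting: serve over HTTP.
        return "streamable-http"
    if requested == "stdio":
        # Local hosting: the client launches the server and talks over stdin/stdout.
        return "stdio"
    raise ValueError(f"Unsupported transport: {requested!r}")


# Assumption: with FastMCP, the result could be passed as mcp.run(transport=...).
print(choose_transport({}))
print(choose_transport({"MCP_TRANSPORT": "http"}))
```

Defaulting to stdio keeps the local case configuration-free, and a cloud deployment only needs to set one variable.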
