BrowserControl
Vision-first browser control for AI agents.
BrowserControl is a production-ready MCP server that gives AI agents real browser capabilities. Not HTML scraping. Not DOM guessing. Full Chromium control with screenshots, numbered interactive elements, persistent sessions, and built-in developer tooling.
Inspired by Google’s AntiGravity browser control model.
What it does
BrowserControl lets AI agents:
- See fully rendered web pages
- Interact using numbered elements instead of selectors
- Click, type, scroll, and navigate like a human
- Stay logged in across sessions
- Debug real web apps using console and network tools
- Record browser sessions for replay and inspection
All locally. No vision APIs. No external AI calls.
Core features
Set of Marks (SoM)
Every page screenshot is annotated with numbered boxes on interactive elements.
Agents act by number: click(5) clicks the element in box 5, and type_text(3, "hello") types "hello" into the element in box 3.
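The numbering idea can be illustrated with a small, self-contained sketch. This is not BrowserControl's actual implementation: the `Mark` type, the `assign_marks` and `resolve_click` helpers, and the sample elements are all hypothetical, and it assumes each interactive element is described by a pixel bounding box. It only shows how a mark number could be translated into a click position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    """One numbered box drawn over an interactive element (hypothetical type)."""
    number: int
    label: str
    x: int       # left edge of the bounding box, in pixels
    y: int       # top edge of the bounding box, in pixels
    width: int
    height: int

    def center(self) -> tuple[int, int]:
        """Return the pixel at the middle of the box, where a click would land."""
        return (self.x + self.width // 2, self.y + self.height // 2)


def assign_marks(elements: list[dict]) -> dict[int, Mark]:
    """Number elements 1..N in the order given and index them by number."""
    return {
        i: Mark(i, el["label"], el["x"], el["y"], el["width"], el["height"])
        for i, el in enumerate(elements, start=1)
    }


def resolve_click(marks: dict[int, Mark], number: int) -> tuple[int, int]:
    """Translate a mark number into click coordinates, rejecting unknown numbers."""
    if number not in marks:
        raise ValueError(f"no element is marked {number}")
    return marks[number].center()


# Made-up elements standing in for what a page scan might find.
elements = [
    {"label": "Search box", "x": 100, "y": 40, "width": 300, "height": 30},
    {"label": "Submit button", "x": 420, "y": 40, "width": 80, "height": 30},
]
marks = assign_marks(elements)
print(resolve_click(marks, 2))  # (460, 55)
```

Because the agent only ever sends a small integer, it never has to produce or reason about a CSS selector or XPath expression.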
Vision-first interaction
Agents operate on what the page looks like, not fragile DOM trees.
Persistent browser sessions
Cookies, localStorage, login state, and history persist automatically.
Developer tooling built in
- Console logs
- Network requests
- JavaScript errors
- Runtime JS execution
- Element inspection
- Page performance metrics
Session recording
- Start and stop recordings
- Save Playwright traces
- Replay sessions for debugging or documentation
Zero extra AI cost
- No vision models
- No selector inference
- No secondary LLM calls
Why BrowserControl
- 50–100× fewer tokens per action than DOM-based tools
- Faster actions, lower latency
- Runs fully locally, with no external API calls
- No brittle selectors
- MCP-native and production-ready
Available tools
Navigation
navigate_to, go_back, refresh_page, scroll
Interaction
click, type_text, hover, press_key, scroll_to_element
Content
get_page_content, get_text, screenshot, run_javascript
Developer tools
get_console_logs, get_network_requests, get_page_errors, inspect_element, get_page_performance
Recording
start_recording, stop_recording, take_snapshot, list_recordings
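As a rough guide to how these tools are invoked, MCP clients send JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds such request bodies for three of the tools listed above. The tool names come from the list; the argument keys (`url`, `element_id`, `text`) and the `tool_call_request` helper are assumptions made for illustration, so check the schemas the server reports via `tools/list` for the real parameter names.

```python
import json


def tool_call_request(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument keys below are hypothetical; the server's tool schemas are authoritative.
requests = [
    tool_call_request(1, "navigate_to", {"url": "https://example.com"}),
    tool_call_request(2, "type_text", {"element_id": 3, "text": "hello"}),
    tool_call_request(3, "click", {"element_id": 5}),
]

for request in requests:
    print(json.dumps(request))
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows what travels over the wire.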
Use cases
- Autonomous web research
- Browser-based agent workflows
- Automated form filling
- End-to-end testing
- Web app debugging
- Login-required automation
- Agent-driven QA
Tech stack
- Python 3.11+
- FastMCP
- Playwright (Chromium auto-installed)
- Fully local execution
Status
Stable. Actively developed. MIT licensed.
Server Config
{
  "mcpServers": {
    "browsercontrol": {
      "command": "python",
      "args": [
        "-m",
        "browsercontrol"
      ]
    }
  }
}
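If your MCP client already has a configuration with other servers, the entry above needs to be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory dictionary. The `add_browsercontrol` helper and the `other-server` entry are hypothetical, and where your client stores its configuration file is not covered here.

```python
import copy
import json

# The server entry from the config above.
BROWSERCONTROL_ENTRY = {"command": "python", "args": ["-m", "browsercontrol"]}


def add_browsercontrol(config: dict) -> dict:
    """Return a copy of `config` with the browsercontrol server registered."""
    merged = copy.deepcopy(config)  # leave the caller's dictionary untouched
    servers = merged.setdefault("mcpServers", {})
    servers["browsercontrol"] = copy.deepcopy(BROWSERCONTROL_ENTRY)
    return merged


# A made-up existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_browsercontrol(existing)
print(json.dumps(updated, indent=2))
```

Working on a copy keeps the original configuration intact, so you can compare the two before writing anything to disk.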