AgentKit Browser Automation
A browser automation framework built with AgentKit. It uses a multi-agent system to navigate the web and execute tasks.
Overview
This project implements a multi-agent system for browser automation, where different agents work together to:
- Plan and break down tasks
- Navigate web pages
- Execute browser actions
- Validate results
Architecture (TODO)
The system consists of four specialized agents:
- Planning Agent
  - Breaks down tasks into actionable steps
  - Creates detailed execution plans
  - Determines task completion criteria
- Navigator Agent
  - Determines the next actions to take
  - Manages state transitions
  - Handles action execution
  - Provides detailed logging and feedback
- Browser Agent
  - Executes browser automation actions
  - Interacts with web elements
  - Handles page navigation
  - Manages browser state
- Validation Agent
  - Validates task completion
  - Verifies results
  - Handles error cases
  - Provides feedback on success/failure
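The hand-off between the four agents can be sketched as a simple loop. This is a minimal illustration in plain TypeScript and not the repository's implementation: every type and function name below (`PlanningAgent`, `runTask`, `stubAgents`, and so on) is hypothetical, and it does not use the AgentKit API. The stub agents stand in for model-backed ones so the control flow can be run on its own.

```typescript
// Hypothetical types for illustration only; none of these names come from the repository.
type Step = { description: string };
type ActionResult = { step: Step; ok: boolean; detail: string };

interface PlanningAgent {
  plan(task: string): Promise<Step[]>;
}
interface NavigatorAgent {
  nextAction(step: Step, history: ActionResult[]): Promise<string>;
}
interface BrowserAgent {
  execute(action: string): Promise<{ ok: boolean; detail: string }>;
}
interface ValidationAgent {
  validate(task: string, history: ActionResult[]): Promise<boolean>;
}

interface Agents {
  planner: PlanningAgent;
  navigator: NavigatorAgent;
  browser: BrowserAgent;
  validator: ValidationAgent;
}

async function runTask(
  task: string,
  agents: Agents,
): Promise<{ success: boolean; history: ActionResult[] }> {
  const history: ActionResult[] = [];

  // 1. The planning agent breaks the task into steps.
  const steps = await agents.planner.plan(task);

  for (const step of steps) {
    // 2. The navigator agent decides the next action from the step and the history so far.
    const action = await agents.navigator.nextAction(step, history);

    // 3. The browser agent carries out the action.
    const outcome = await agents.browser.execute(action);
    history.push({ step, ok: outcome.ok, detail: outcome.detail });

    // Stop at the first failed action and let the validation agent report on it.
    if (!outcome.ok) break;
  }

  // 4. The validation agent decides whether the task was completed.
  const success = await agents.validator.validate(task, history);
  return { success, history };
}

// Stub agents so the loop can run without a browser or a model.
const stubAgents: Agents = {
  planner: {
    plan: async (task) => [
      { description: `open a page for: ${task}` },
      { description: "read the page title" },
    ],
  },
  navigator: {
    nextAction: async (step) => `do: ${step.description}`,
  },
  browser: {
    execute: async (action) => ({ ok: true, detail: `executed "${action}"` }),
  },
  validator: {
    validate: async (_task, history) =>
      history.length > 0 && history.every((result) => result.ok),
  },
};

runTask("find the documentation page", stubAgents).then((result) => {
  console.log(result.success, result.history.length);
});
```

In a real setup each stub would presumably be replaced by a model-backed agent, with the browser agent delegating to the Playwright MCP server.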
Features
- Intelligent Task Planning: Breaks down complex tasks into manageable steps
- State Management: Tracks browser state and action results
- Error Handling: Robust error handling and recovery mechanisms
- Event System: Comprehensive event logging and monitoring
- Flexible Action System: Extensible action registry for custom behaviors
- Validation Framework: Built-in validation for task completion
- Memory Management: Maintains context and history of actions
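As an illustration of how an extensible action registry, error handling, and action history could fit together, here is a minimal sketch. The `ActionRegistry` class, its methods, and the `click` action are assumptions made for this example. They are not taken from the repository or from AgentKit.

```typescript
// Hypothetical sketch: an action registry that records every call in a history list.
type ActionHandler = (params: Record<string, string>) => string;
type HistoryEntry = { name: string; ok: boolean; detail: string };

class ActionRegistry {
  private handlers = new Map<string, ActionHandler>();
  readonly history: HistoryEntry[] = [];

  // Register a custom action under a unique name.
  register(name: string, handler: ActionHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`Action already registered: ${name}`);
    }
    this.handlers.set(name, handler);
  }

  // Run an action by name; failures are recorded in the history instead of thrown.
  run(name: string, params: Record<string, string> = {}): boolean {
    const handler = this.handlers.get(name);
    if (!handler) {
      this.history.push({ name, ok: false, detail: "unknown action" });
      return false;
    }
    try {
      const detail = handler(params);
      this.history.push({ name, ok: true, detail });
      return true;
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      this.history.push({ name, ok: false, detail });
      return false;
    }
  }
}

const registry = new ActionRegistry();
registry.register("click", (params) => `clicked ${params.selector}`);

registry.run("click", { selector: "#submit" }); // recorded as a success
registry.run("scroll"); // not registered, recorded as a failure

console.log(registry.history);
```

Keeping failures in the history rather than throwing means a validation step can inspect the full sequence of actions, including the ones that did not succeed.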
Getting Started
Prerequisites
- Node.js (v14 or higher)
- npm or yarn
- OpenAI API key (for GPT models)
Installation
- Clone the repository:
git clone https://github.com/tmahesh/playwright-agent.git
cd playwright-agent
- Install dependencies:
npm install
- Set up environment variables:
cp .env.sample .env
# Edit .env with your OpenAI API key and other configurations
- Run each of the following commands in its own terminal. They start the Playwright MCP server, the application entry point (index.ts), and the Inngest dev server:
npx @playwright/mcp@latest --port 8931
npx tsx index.ts
npx inngest-cli@latest dev --no-discovery -u http://localhost:3000/api/inngest -v
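Before starting the three processes, it can help to confirm that `.env` was actually filled in. The helper below is a small sketch of such a check. The variable name `OPENAI_API_KEY` is an assumption (the contents of `.env.sample` are not shown here), and `missingEnv` is a hypothetical function, not part of the project.

```typescript
// Hypothetical pre-flight check: report required environment variables that are unset or blank.
function missingEnv(
  required: string[],
  env: Record<string, string | undefined> = process.env,
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// OPENAI_API_KEY is an assumed name; adjust the list to match .env.sample.
const missing = missingEnv(["OPENAI_API_KEY"]);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```

Note that the check reads `process.env`, so the `.env` file must already be loaded into the environment by whatever mechanism the project uses.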
Contributing
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
Acknowledgments