Mcp Window Screenshooter
Window Screenshooter MCP Server 🖼️
A cross-platform Model Context Protocol (MCP) server that enables AI agents to capture screenshots of specific application windows! (≧◡≦)
Overview
Window Screenshooter is an MCP server built in Python that lets AI agents take targeted screenshots of specific application windows on Windows and Linux. Unlike full-screen capture tools, it captures a single window at a time, which suits AI verification workflows, automated testing, and application monitoring.
✨ New Feature: Smart Window State Restoration & Focus Management!
The latest version now includes automatic window state restoration and intelligent focus handling! 🎉
When capturing windows, the server will:
- 📸 Save the original window state (minimized, maximized, position, etc.)
- 🔄 Temporarily modify the window if needed for capture
- ✨ Restore the window to its exact original state after capture
- 🎯 Auto-detect your editor (Cursor, Trae, Windsurf, VS Code, etc.) and restore focus to it
- 📉 Minimize captured windows if the calling application can't be found
- 💫 Work seamlessly across all supported platforms
This means your workflow stays smooth - focus returns to your editor and windows don't get left in unexpected states!
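The save, modify, and restore sequence described above can be sketched as a context manager. This is only an illustration of the pattern, not the server's actual code: `FakeWindow` and `preserved_state` are hypothetical names, and a real implementation would operate on platform window handles instead of a dataclass.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class FakeWindow:
    """Hypothetical stand-in for a real window handle (assumption, not the server's class)."""
    title: str
    minimized: bool = False
    maximized: bool = False
    position: tuple = (0, 0)

    def restore(self) -> None:
        # Bring the window out of the minimized state so it can be captured.
        self.minimized = False


@contextmanager
def preserved_state(window: FakeWindow):
    """Snapshot the window state, yield for capture, then put everything back."""
    saved = (window.minimized, window.maximized, window.position)
    try:
        if window.minimized:
            window.restore()  # temporary change needed for capture
        yield window
    finally:
        # Runs even if the capture raises, so the window is not left modified.
        window.minimized, window.maximized, window.position = saved


editor = FakeWindow("notes.txt - Editor", minimized=True)
with preserved_state(editor) as w:
    state_during_capture = w.minimized  # the window is un-minimized here
```

Putting the restore step in `finally` is the design choice that matters: a failed capture still returns the window to its original state.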
Key Features
- 🖼️ Window-Specific Capture: Target individual application windows by name or title
- 🌐 Cross-Platform Support: Works on Windows and Linux with platform-optimized backends
- 🔧 MCP Integration: Seamless integration with AI agents through Model Context Protocol
- 📡 STDIO Transport: Uses standard input/output for reliable communication
- ⚡ Performance Optimized: Platform-specific implementations for maximum efficiency
Installation & Setup
Prerequisites
- Python 3.12+
- Windows or Linux
Quick Start
- Clone or download this repository:
git clone <your-repo-url>
cd window-screenshooter
- Install dependencies:
pip install pywinctl pillow pywin32
# Or use the project file
pip install -e .
- Test the server:
python test-mcp.py
- Run the MCP server:
# STDIO mode (for MCP clients)
python server.py
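In STDIO mode an MCP client exchanges newline-delimited JSON-RPC 2.0 messages with the server over its standard input and output. As a rough sketch of what a tool invocation looks like on the wire, the helper below builds a `tools/call` request. `make_tool_call` is a hypothetical name, and a real client must complete the MCP `initialize` handshake before sending such a request; in practice an MCP client library handles all of this.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the payload stays on one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


line = make_tool_call(1, "capture_window", {"windowTitle": "Notepad"})
```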
MCP Tools
The server exposes three main MCP tools:
1. capture_window
Captures a screenshot of a specific window by title or identifier.
Parameters:
- windowTitle (string): Exact or partial window title to match
- outputPath (string, optional): Save location for screenshot
- format (string, optional): Image format (PNG, JPEG) - default: PNG
- quality (int, optional): JPEG quality (1-100) - default: 85
Returns: Base64-encoded image data or file path confirmation
Example:
# Save to file
await capture_window("Notepad", "screenshot.png", "PNG")
# Get base64 data
await capture_window("Calculator")
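To make the parameter rules and the two return modes concrete, here is a minimal sketch of how they might be handled. It is not the server's implementation: `normalize_options` and `package_result` are hypothetical helpers, the result keys are invented for illustration, and the image bytes are assumed to be already encoded.

```python
import base64
from pathlib import Path
from typing import Optional

ALLOWED_FORMATS = {"PNG", "JPEG"}


def normalize_options(format: str = "PNG", quality: int = 85) -> tuple:
    """Validate format and quality against the documented ranges."""
    fmt = format.upper()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}, got {format!r}")
    if not 1 <= quality <= 100:
        raise ValueError(f"quality must be between 1 and 100, got {quality}")
    return fmt, quality


def package_result(image_bytes: bytes, output_path: Optional[str] = None) -> dict:
    """Write to disk when a path is given, otherwise return base64 text."""
    if output_path:
        path = Path(output_path)
        path.write_bytes(image_bytes)
        return {"saved_to": str(path)}  # hypothetical key name
    return {"image_base64": base64.b64encode(image_bytes).decode("ascii")}
```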
2. list_windows
Enumerates all available windows on the system.
Returns: Array of window objects with ID, title, position, and size information
Example:
await list_windows()
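Because capture_window accepts an exact or partial title, it can help to resolve a title against the list_windows output first. The sketch below shows one plausible matching rule (exact matches win, otherwise a case-insensitive substring search). The window dictionaries and their field names are assumptions; the server's real matching logic and output shape may differ.

```python
def match_windows(windows: list, query: str) -> list:
    """Return exact title matches if any, otherwise case-insensitive partial matches."""
    exact = [w for w in windows if w["title"] == query]
    if exact:
        return exact
    needle = query.casefold()
    return [w for w in windows if needle in w["title"].casefold()]


# Hypothetical list_windows output; the real field names may differ.
windows = [
    {"id": 101, "title": "Untitled - Notepad"},
    {"id": 102, "title": "Calculator"},
    {"id": 103, "title": "notes.txt - Notepad"},
]

notepad_ids = [w["id"] for w in match_windows(windows, "notepad")]
```

When a partial title matches several windows, as "notepad" does here, passing the full title to capture_window avoids ambiguity.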
3. get_window_info
Retrieves detailed information about a specific window.
Parameters:
- windowIdentifier (string): Window title or ID
Returns: Window metadata including position, size, visibility state, and process information
Example:
await get_window_info("Visual Studio Code")
Platform-Specific Features
Windows Implementation
- Utilizes win32gui with the BitBlt API for robust window capture
- Can capture minimized, hidden, or overlapped windows
- High-performance Graphics Capture API integration
- Provides Windows handle (HWND) and process information
Linux Implementation
- X11-based window capture using native protocols
- Direct window buffer access for efficient capture
- Support for common Linux desktop environments
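One common way to route between platform backends like these is to dispatch on `sys.platform`. The sketch below assumes that approach; the module names come from the project structure, but `select_backend` is a hypothetical helper and server.py may choose its backend differently.

```python
import sys

# Maps a sys.platform prefix to the capture module listed in the project structure.
BACKENDS = {"win32": "windows_capture", "linux": "linux_capture"}


def select_backend(platform: str = sys.platform) -> str:
    """Return the name of the capture module for the given platform string."""
    for prefix, module_name in BACKENDS.items():
        if platform.startswith(prefix):
            return module_name
    raise RuntimeError(f"Unsupported platform: {platform!r}")
```

The returned name could then be passed to `importlib.import_module`, so the Windows-only dependencies are never imported on Linux.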
MCP Client Configuration
For Cursor IDE
Add to your MCP configuration file:
{
"mcpServers": {
"window-screenshooter": {
"command": "python",
"args": ["server.py"],
"cwd": "/path/to/window-screenshooter",
"transport": "stdio"
}
}
}
For Claude Desktop
Add to your claude_desktop_config.json:
{
"mcpServers": {
"window-screenshooter": {
"command": "python",
"args": ["server.py"],
"cwd": "/path/to/window-screenshooter"
}
}
}
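Both configurations rely on a `python` command being on the client's PATH and on a hand-edited path. If that proves fragile, a small script can generate the entry with the current interpreter and an absolute project directory. This is a convenience sketch only; `build_config` is a hypothetical helper and the output still has to be merged into your client's configuration file by hand.

```python
import json
import sys
from pathlib import Path


def build_config(project_dir: str, python: str = sys.executable) -> dict:
    """Build an mcpServers entry that points at an absolute project directory."""
    return {
        "mcpServers": {
            "window-screenshooter": {
                "command": python,
                "args": ["server.py"],
                "cwd": str(Path(project_dir).resolve()),
            }
        }
    }


# Print the entry for the current directory so it can be pasted into the config.
print(json.dumps(build_config("."), indent=2))
```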
💡 IDE Configuration Tip
📝 Note: For optimal workflow integration, consider adding this rule to your IDE configuration:
"Before capturing windows or screens with the MCP screenshooter, ALWAYS list windows first to get correct names. If working on a Web project, the default browser is Brave. If working on a Unity project, the user wants the Unity game scene window. After you make a screenshot or capture a screen, ALWAYS use vision."
This helps ensure accurate window targeting and proper follow-up analysis of captured content! ✨
Usage Examples
AI Development Workflows
- Code Verification: AI takes Unity editor screenshots to verify game object placement
- UI Testing: Capture application states during automated testing sequences
- Documentation: Generate visual documentation of application interfaces
- Debugging: Visual confirmation of application behavior changes
Automation Scenarios
- Quality Assurance: Screenshot comparison for regression testing
- Process Monitoring: Capture application states for workflow verification
- Training Data: Generate labeled screenshots for computer vision training
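For the regression-testing scenario, the simplest possible comparison is an exact match between a baseline capture and a new one. The sketch below compares SHA-256 digests of the decoded base64 data; `captures_match` is a hypothetical helper. Note the limitation: this detects any byte-level difference, including changes in encoding or metadata, so tolerance-based pixel comparison would need an imaging library instead.

```python
import base64
import hashlib


def fingerprint(image_b64: str) -> str:
    """Return the SHA-256 hex digest of a base64-encoded image."""
    return hashlib.sha256(base64.b64decode(image_b64)).hexdigest()


def captures_match(baseline_b64: str, candidate_b64: str) -> bool:
    """True when both captures decode to byte-identical image data."""
    return fingerprint(baseline_b64) == fingerprint(candidate_b64)
```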
Error Handling
The server implements robust error handling for:
- Window not found scenarios
- Permission-denied capture attempts
- Cross-platform compatibility issues
- Invalid parameters
- Graceful degradation when platform-specific features are unavailable
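One way to surface the cases above to an AI agent is to convert exceptions into structured results rather than letting them propagate. The following is a sketch under that assumption; `WindowNotFoundError`, `safe_call`, and the error codes are hypothetical and are not taken from the server's source.

```python
class WindowNotFoundError(LookupError):
    """Hypothetical error raised when no window matches the requested title."""


def safe_call(tool, *args, **kwargs) -> dict:
    """Run a tool function and map known failures to structured error results."""
    try:
        return {"ok": True, "result": tool(*args, **kwargs)}
    except WindowNotFoundError as exc:
        return {"ok": False, "error": "window_not_found", "detail": str(exc)}
    except PermissionError as exc:
        return {"ok": False, "error": "permission_denied", "detail": str(exc)}
    except ValueError as exc:
        return {"ok": False, "error": "invalid_parameter", "detail": str(exc)}
    except NotImplementedError as exc:
        # Platform-specific feature unavailable: report it instead of crashing.
        return {"ok": False, "error": "unsupported_on_platform", "detail": str(exc)}
```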
Development & Contributing
Project Structure
window-screenshooter/
├── server.py # Main MCP server implementation
├── windows_capture.py # Windows-specific capture logic
├── linux_capture.py # Linux-specific capture logic
├── test-mcp.py # Test script for functionality
├── mcp-config-example.json # Example MCP configuration
├── pyproject.toml # Project dependencies
└── README.md # This file
Testing
Run the test script to verify functionality:
python test-mcp.py
Common Issues
- "Window not found" errors:
  - Check the exact window title with list_windows
  - Try partial title matching
  - Ensure the window is visible and not minimized
- Permission errors on Windows:
  - Run as administrator if needed
  - Check Windows security settings
- Import errors:
  - Ensure all dependencies are installed: pip install pywinctl pillow pywin32
  - Check Python version (requires 3.12+)
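The import and version checks can be automated with a short diagnostic. This sketch assumes the usual import names for the listed packages (`PIL` for Pillow, `win32gui` for pywin32); `python_ok` and `missing_modules` are hypothetical helpers, not part of the project.

```python
import importlib.util
import sys

# Import names for the documented dependencies (PIL is Pillow's import name).
REQUIRED_MODULES = ["pywinctl", "PIL"]
if sys.platform == "win32":
    REQUIRED_MODULES.append("win32gui")  # provided by pywin32


def python_ok(version=sys.version_info, minimum=(3, 12)) -> bool:
    """True when the interpreter meets the documented minimum version."""
    return tuple(version[:2]) >= minimum


def missing_modules(names) -> list:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version OK:", python_ok())
print("Missing modules:", missing_modules(REQUIRED_MODULES))
```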
Platform Compatibility
- Windows: Full support with native Win32 API
- Linux: Basic support with X11 integration
License
This project is licensed under the MIT License - see the LICENSE file for details.
Server Config
{
"mcpServers": {
"mcp-window-screenshooter": {
"command": "python",
"args": [
"server.py"
],
"cwd": "/path/to/window-screenshooter"
}
}
}