
mcp-server-macos-use

Created by mediar-ai, 8 months ago
An AI agent that controls the computer with OS-level tools. It is MCP-compatible and works with any model.

A Model Context Protocol (MCP) server written in Swift. It controls macOS applications through the accessibility APIs, primarily via the MacosUseSDK.

You can use it with Claude Desktop or any other compatible MCP client.

The server listens for MCP commands over standard input/output (stdio) and exposes several tools to interact with applications.
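
As a rough illustration of what "MCP commands over stdio" looks like on the wire: MCP messages are JSON-RPC 2.0 objects, and the stdio transport carries one message per line. The sketch below is client-side Python; the helper names `encode_message` and `decode_message` are hypothetical and not part of this server.

```python
import json

def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the payload itself
    # never contains a raw newline; the only one is the trailing delimiter.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")

def decode_message(line: bytes) -> dict:
    """Parse one line read from the server's stdout back into a dict."""
    return json.loads(line.decode("utf-8"))

# "tools/list" is the standard MCP method for asking a server which tools it exposes.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
wire = encode_message(request)
```

A client would write `wire` to the server's stdin and pass each line read from its stdout to `decode_message`.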

https://github.com/user-attachments/assets/b43622a3-3d20-4026-b02f-e9add06afe2b

Available Tools

The server exposes the following tools via the CallTool MCP method:

  1. macos-use_open_application_and_traverse

    • Description: Opens or activates a specified application and then traverses its accessibility tree.
    • Parameters:
      • identifier (String, Required): The application's name, bundle ID, or file path.
  2. macos-use_click_and_traverse

    • Description: Simulates a mouse click at specific coordinates within the window of the target application (identified by PID) and then traverses its accessibility tree.
    • Parameters:
      • pid (Number, Required): The Process ID (PID) of the target application.
      • x (Number, Required): The X-coordinate for the click (relative to the window/screen, depending on SDK behavior).
      • y (Number, Required): The Y-coordinate for the click.
  3. macos-use_type_and_traverse

    • Description: Simulates typing text into the target application (identified by PID) and then traverses its accessibility tree.
    • Parameters:
      • pid (Number, Required): The Process ID (PID) of the target application.
      • text (String, Required): The text to be typed.
  4. macos-use_press_key_and_traverse

    • Description: Simulates pressing a specific keyboard key (e.g., 'Enter', 'Tab', 'a', 'B') with optional modifier keys held down, targeting the application specified by PID, and then traverses its accessibility tree.
    • Parameters:
      • pid (Number, Required): The Process ID (PID) of the target application.
      • keyName (String, Required): The name of the key (e.g., Return, Escape, ArrowUp, Delete, a, B). Case-sensitive for letters if no modifiers are active.
      • modifierFlags (Array, Optional): An array of modifier keys to hold during the press. Valid values: CapsLock (or Caps), Shift, Control (or Ctrl), Option (or Opt, Alt), Command (or Cmd), Function (or Fn), NumericPad (or Numpad), Help.
  5. macos-use_refresh_traversal

    • Description: Only performs the accessibility tree traversal for the specified application (identified by PID). Useful for getting the current UI state without performing an action.
    • Parameters:
      • pid (Number, Required): The Process ID (PID) of the application to traverse.
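
The tool list above can be turned into a small client-side request builder. This is a minimal sketch, not the server's own code: the tool names and required parameters are copied from the list, `tools/call` is the JSON-RPC method that MCP uses for a CallTool request, and `build_tool_call`, `REQUIRED_ARGS`, and the PID `1234` are hypothetical.

```python
from itertools import count

# Required arguments per tool, copied from the tool list above.
REQUIRED_ARGS = {
    "macos-use_open_application_and_traverse": {"identifier": str},
    "macos-use_click_and_traverse": {"pid": int, "x": (int, float), "y": (int, float)},
    "macos-use_type_and_traverse": {"pid": int, "text": str},
    "macos-use_press_key_and_traverse": {"pid": int, "keyName": str},
    "macos-use_refresh_traversal": {"pid": int},
}

_ids = count(1)  # JSON-RPC request ids only need to be unique per session

def build_tool_call(name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request after checking required arguments."""
    if name not in REQUIRED_ARGS:
        raise ValueError(f"unknown tool: {name}")
    for key, expected in REQUIRED_ARGS[name].items():
        if key not in arguments:
            raise ValueError(f"{name} requires '{key}'")
        value = arguments[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"'{key}' has the wrong type for {name}")
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Example: Command+Return in the application with (hypothetical) PID 1234.
press = build_tool_call(
    "macos-use_press_key_and_traverse",
    pid=1234,
    keyName="Return",
    modifierFlags=["Command"],
)
```

Optional arguments such as `modifierFlags` are passed through unchecked here; the server remains the authority on which values it accepts.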

Common Optional Parameters (for CallTool)

These may be passed in the arguments object of any tool call to override the default MacosUseSDK behavior (see ActionOptions in the code):

  • traverseBefore (Boolean, Optional): Traverse accessibility tree before the primary action.
  • traverseAfter (Boolean, Optional): Traverse accessibility tree after the primary action (usually defaults to true).
  • showDiff (Boolean, Optional): Include a diff between traversals (if applicable).
  • onlyVisibleElements (Boolean, Optional): Limit traversal to visible elements.
  • showAnimation (Boolean, Optional): Show visual feedback animation for actions.
  • animationDuration (Number, Optional): Duration of the feedback animation.
  • delayAfterAction (Number, Optional): Add a delay after performing the action.
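
One way a client might attach these options to a tool call's arguments is sketched below. The option names and types come from the list above; the helper `with_options` and the example values are hypothetical, and since the list does not state units or defaults for the numeric options, none are assumed here.

```python
# Option names and types, copied from the list above.
ACTION_OPTIONS = {
    "traverseBefore": bool,
    "traverseAfter": bool,
    "showDiff": bool,
    "onlyVisibleElements": bool,
    "showAnimation": bool,
    "animationDuration": (int, float),
    "delayAfterAction": (int, float),
}

def with_options(arguments: dict, **options) -> dict:
    """Return a copy of `arguments` with validated optional parameters merged in."""
    merged = dict(arguments)
    for key, value in options.items():
        if key not in ACTION_OPTIONS:
            raise ValueError(f"unknown option: {key}")
        if ACTION_OPTIONS[key] is bool:
            ok = isinstance(value, bool)
        else:
            # Numbers only; bool is a subclass of int, so exclude it.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not ok:
            raise TypeError(f"option '{key}' has the wrong type")
        merged[key] = value
    return merged

# Example: type text into a (hypothetical) PID and ask for a diff of the UI tree.
args = with_options({"pid": 1234, "text": "hello"}, showDiff=True, delayAfterAction=0.5)
```

The merged dictionary is what would go in the `arguments` field of the tool call.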

Dependencies

  • MacosUseSDK (Assumed local or external Swift package providing macOS control functionality)

Building and Running

# Example build command (adjust as needed, use 'debug' for development)
swift build -c debug # Or 'release' for production

# Running the server (it communicates via stdin/stdout); after a release build the binary is under .build/release/ instead
./.build/debug/mcp-server-macos-use
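
To try the built binary without a full MCP client, a script can start it as a subprocess and perform the MCP handshake by hand. The following is a minimal sketch under several assumptions: the protocol version string `2024-11-05` and the client name `example-client` are illustrative values that this server may or may not accept, and `launch_and_initialize` is a hypothetical helper.

```python
import json
import subprocess

def initialize_request(request_id: int = 1) -> dict:
    """Build the MCP 'initialize' request that opens every session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; the server may negotiate another
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }

# Notification (no id) the client sends once it has the initialize result.
INITIALIZED = {"jsonrpc": "2.0", "method": "notifications/initialized"}

def launch_and_initialize(server_path: str):
    """Start the server, run the handshake, and return (process, initialize reply)."""
    proc = subprocess.Popen(
        [server_path],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered: one JSON message per line
    )
    proc.stdin.write(json.dumps(initialize_request()) + "\n")
    proc.stdin.flush()
    line = proc.stdout.readline()
    if not line:
        raise RuntimeError("server exited before answering 'initialize'")
    reply = json.loads(line)
    proc.stdin.write(json.dumps(INITIALIZED) + "\n")
    proc.stdin.flush()
    return proc, reply
```

For example, `proc, reply = launch_and_initialize("./.build/debug/mcp-server-macos-use")` would leave `proc` ready for tool calls; call `proc.terminate()` when finished.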

Integrating with Clients (Example: Claude Desktop)

Once built, you need to tell your client application where to find the server executable. For example, to configure Claude Desktop, you might add the following to its configuration:

{
    "mcpServers": {
        "mcp-server-macos-use": {
            "command": "/path/to/your/project/mcp-server-macos-use/.build/debug/mcp-server-macos-use"
        }
    }
}

Replace /path/to/your/project/ with the actual absolute path to your mcp-server-macos-use directory.
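
If you would rather generate that entry than edit it by hand, a short script can derive the command path from the project directory. This is a sketch only: `claude_desktop_entry` is a hypothetical helper, and the path below is the same placeholder used above, not a real location.

```python
import json
from pathlib import PurePosixPath

def claude_desktop_entry(project_dir: str, configuration: str = "debug") -> dict:
    """Build the mcpServers entry for a given project checkout and build configuration."""
    if configuration not in ("debug", "release"):
        raise ValueError("configuration must be 'debug' or 'release'")
    project = PurePosixPath(project_dir)
    if not project.is_absolute():
        # The client launches the command itself, so a relative path is unreliable.
        raise ValueError("project_dir must be an absolute path")
    binary = project / ".build" / configuration / "mcp-server-macos-use"
    return {"mcpServers": {"mcp-server-macos-use": {"command": str(binary)}}}

entry = claude_desktop_entry("/path/to/your/project/mcp-server-macos-use")
print(json.dumps(entry, indent=4))
```

The printed JSON can then be merged into the client's existing configuration file.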

Help

Reach out to matt@mediar.ai or on Discord: m13v_

Plans

Happy to tailor the server to your needs. Feel free to open an issue or reach out.
