
AgentDesk MCP

Created By
Rih0z, a month ago
Adversarial AI review API: an independent AI reviews another AI's output, so LLMs stop grading their own homework. It provides automated quality assurance for AI-generated code, content, and other outputs through independent review pipelines.

AgentDesk MCP — Adversarial AI Review

License: MIT

Quality control for AI pipelines — one MCP tool. Works with Claude Code, Claude Desktop, and any MCP client.

29.5% of teams do NO evaluation of AI outputs. (LangChain Survey)

Knowledge workers spend 4.3 hours/week fact-checking AI outputs. (Microsoft 2025)

AgentDesk MCP fixes this. Add independent adversarial review to any AI pipeline in 30 seconds.

Quick Start

npx agentdesk-mcp

Claude Code

claude mcp add agentdesk-mcp -- npx agentdesk-mcp

Claude Desktop

{
  "mcpServers": {
    "agentdesk-mcp": {
      "command": "npx",
      "args": ["-y", "agentdesk-mcp"],
      "env": { "ANTHROPIC_API_KEY": "sk-ant-..." }
    }
  }
}

Install from GitHub (alternative)

npm install github:Rih0z/agentdesk-mcp

Requirements

  • ANTHROPIC_API_KEY environment variable (BYOK: bring your own key)

Tools

review_output

Adversarial quality review of any AI-generated output. An independent reviewer assumes the author made mistakes and actively looks for problems.

Input:

| Parameter | Required | Description |
|---|---|---|
| output | Yes | The AI-generated output to review |
| criteria | No | Custom review criteria |
| review_type | No | Category: code, content, factual, translation, etc. |
| model | No | Reviewer model (default: claude-sonnet-4-6) |
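
For clients that talk to the server directly, the parameters above map onto the arguments of a standard MCP `tools/call` request. The sketch below is illustrative only: `ReviewArgs` and `buildReviewRequest` are hypothetical names, the string type for `criteria` is an assumption, and rejecting an empty `output` is a local choice rather than documented server behavior.

```typescript
// Arguments accepted by review_output, per the parameter table.
interface ReviewArgs {
  output: string;        // required: the AI-generated output to review
  criteria?: string;     // optional: custom review criteria (string type is assumed)
  review_type?: string;  // optional: e.g. "code", "content", "factual", "translation"
  model?: string;        // optional: reviewer model
}

// Hypothetical helper: wraps the arguments in an MCP `tools/call` JSON-RPC request.
function buildReviewRequest(id: number, args: ReviewArgs) {
  // Local sanity check (an assumption, not documented server behavior).
  if (args.output.trim() === "") {
    throw new Error("output is required and must not be empty");
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "review_output", arguments: args },
  };
}

const request = buildReviewRequest(1, {
  output: "function add(a, b) { return a - b; }",
  review_type: "code",
});
console.log(JSON.stringify(request, null, 2));
```

In Claude Code or Claude Desktop the client builds this request for you; the sketch only shows what crosses the wire.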

Output:

{
  "verdict": "PASS | FAIL | CONDITIONAL_PASS",
  "score": 82,
  "issues": [
    {
      "severity": "high",
      "category": "accuracy",
      "description": "Claim about X is unsupported",
      "suggestion": "Add citation or remove claim"
    }
  ],
  "checklist": [
    {
      "item": "Factual accuracy",
      "status": "pass",
      "evidence": "All statistics match cited sources"
    }
  ],
  "summary": "Overall assessment...",
  "reviewer_model": "claude-sonnet-4-6"
}
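
A pipeline will usually want to turn this response into a ship/block decision. The types below are inferred from the example response, and `shouldShip` with its threshold of 80 is a hypothetical policy, not part of AgentDesk; treat it as a starting point.

```typescript
// Shape of a review result, inferred from the example response.
type Verdict = "PASS" | "FAIL" | "CONDITIONAL_PASS";

interface Issue {
  severity: string;   // the example shows "high"; other levels are not listed
  category: string;
  description: string;
  suggestion: string;
}

interface ChecklistItem {
  item: string;
  status: string;     // the example shows "pass"
  evidence: string;
}

interface ReviewResult {
  verdict: Verdict;
  score: number;
  issues: Issue[];
  checklist: ChecklistItem[];
  summary: string;
  reviewer_model: string;
}

// Hypothetical gate: block on FAIL, on any high-severity issue, or on a low score.
function shouldShip(review: ReviewResult, minScore: number = 80): boolean {
  if (review.verdict === "FAIL") return false;
  if (review.issues.some((issue) => issue.severity === "high")) return false;
  return review.score >= minScore;
}

const example: ReviewResult = {
  verdict: "CONDITIONAL_PASS",
  score: 82,
  issues: [
    {
      severity: "high",
      category: "accuracy",
      description: "Claim about X is unsupported",
      suggestion: "Add citation or remove claim",
    },
  ],
  checklist: [
    { item: "Factual accuracy", status: "pass", evidence: "All statistics match cited sources" },
  ],
  summary: "Overall assessment...",
  reviewer_model: "claude-sonnet-4-6",
};

console.log(shouldShip(example)); // false: a high-severity issue is present
```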

review_dual

Dual adversarial review — two independent reviewers assess the output from different angles, then a merge agent combines findings.

  • If either reviewer finds a critical issue → merged verdict is FAIL
  • Takes the lower score
  • Combines and deduplicates all issues

Use for high-stakes outputs where quality is critical.

Same parameters as review_output.
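
The three merge rules can be sketched deterministically as follows. This is a simplification under stated assumptions: the real merge step is an LLM agent, the severity value `"critical"` and the duplicate key (category plus normalized description) are guesses, and the handling of non-critical verdict disagreements is not specified in this README.

```typescript
type Verdict = "PASS" | "FAIL" | "CONDITIONAL_PASS";

interface Issue {
  severity: string;
  category: string;
  description: string;
  suggestion: string;
}

interface Review {
  verdict: Verdict;
  score: number;
  issues: Issue[];
}

// Assumption: two issues are duplicates when category and normalized description match.
function issueKey(issue: Issue): string {
  return `${issue.category}|${issue.description.trim().toLowerCase()}`;
}

function mergeReviews(a: Review, b: Review): Review {
  // Combine and deduplicate all issues, keeping the first occurrence.
  const seen = new Map<string, Issue>();
  for (const issue of [...a.issues, ...b.issues]) {
    const key = issueKey(issue);
    if (!seen.has(key)) seen.set(key, issue);
  }
  const issues = Array.from(seen.values());

  // A critical issue from either reviewer forces FAIL.
  const hasCritical = issues.some((issue) => issue.severity === "critical");

  let verdict: Verdict;
  if (hasCritical || a.verdict === "FAIL" || b.verdict === "FAIL") {
    verdict = "FAIL";
  } else if (a.verdict === "PASS" && b.verdict === "PASS") {
    verdict = "PASS";
  } else {
    verdict = "CONDITIONAL_PASS"; // assumed tie-break when reviewers disagree
  }

  // Take the lower of the two scores.
  return { verdict, score: Math.min(a.score, b.score), issues };
}

const merged = mergeReviews(
  { verdict: "PASS", score: 90, issues: [] },
  {
    verdict: "CONDITIONAL_PASS",
    score: 74,
    issues: [
      { severity: "critical", category: "security", description: "SQL injection", suggestion: "Use parameterized queries" },
    ],
  },
);
console.log(merged.verdict, merged.score); // FAIL 74
```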

How It Works

  1. Adversarial prompting: The reviewer is instructed to assume mistakes were made. No benefit of the doubt.
  2. Evidence-based checklist: Every PASS item requires specific evidence. Items without evidence are automatically downgraded to FAIL.
  3. Anti-gaming validation: If >30% of checklist items lack evidence, the entire review is forced to FAIL with a capped score of 50.
  4. Structured output: Verdict + numeric score + categorized issues + checklist (not just "looks good").
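
Steps 2 and 3 amount to a post-processing pass over the reviewer's checklist. The following is a minimal sketch of that idea, not the package's actual code: `enforceEvidence` is a hypothetical name, "lacks evidence" is assumed to mean an empty or whitespace-only `evidence` string, and the lowercase `"fail"` status is assumed as the counterpart of the `"pass"` value shown in the example response.

```typescript
type Verdict = "PASS" | "FAIL" | "CONDITIONAL_PASS";

interface ChecklistItem {
  item: string;
  status: "pass" | "fail";
  evidence: string;
}

interface Review {
  verdict: Verdict;
  score: number;
  checklist: ChecklistItem[];
}

const MAX_MISSING_RATIO = 0.3; // step 3: more than 30% without evidence
const CAPPED_SCORE = 50;       // step 3: score cap on a forced FAIL

// Assumption: evidence is "missing" when the string is empty or whitespace.
function hasEvidence(entry: ChecklistItem): boolean {
  return entry.evidence.trim().length > 0;
}

function enforceEvidence(review: Review): Review {
  // Step 2: a pass without evidence is downgraded to fail.
  const checklist = review.checklist.map((entry): ChecklistItem =>
    entry.status === "pass" && !hasEvidence(entry) ? { ...entry, status: "fail" } : entry,
  );

  // Step 3: too many unsupported items invalidates the whole review.
  const missing = review.checklist.filter((entry) => !hasEvidence(entry)).length;
  const ratio = review.checklist.length === 0 ? 0 : missing / review.checklist.length;
  if (ratio > MAX_MISSING_RATIO) {
    return { verdict: "FAIL", score: Math.min(review.score, CAPPED_SCORE), checklist };
  }
  return { ...review, checklist };
}

const checked = enforceEvidence({
  verdict: "PASS",
  score: 88,
  checklist: [
    { item: "Factual accuracy", status: "pass", evidence: "All statistics match cited sources" },
    { item: "Readability", status: "pass", evidence: "" },
    { item: "Audience fit", status: "pass", evidence: "Tone matches the stated audience" },
  ],
});
console.log(checked.verdict, checked.score); // FAIL 50 (1 of 3 items lacks evidence)
```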

Use Cases

  • Code review: Check for bugs, security issues, performance problems
  • Content review: Verify accuracy, readability, SEO, audience fit
  • Factual verification: Validate claims in AI-generated text
  • Translation quality: Check accuracy and naturalness
  • Data extraction: Verify completeness and correctness
  • Any AI output: Summaries, reports, proposals, emails, etc.

Why Not Just Ask the Same AI to Review?

Self-review has systematic leniency bias. An LLM reviewing its own output shares the same blind spots that created the errors. Research shows models are 34% more likely to use confident language when hallucinating.

AgentDesk uses a separate reviewer invocation with adversarial prompting — fundamentally different from self-review.

Comparison

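| Feature | AgentDesk MCP | Manual prompt | Braintrust | DeepEval |
|---|---|---|---|---|
| One-tool setup | Yes | No | No | No |
| Adversarial review | Yes | DIY | No | No |
| Dual reviewer | Yes | DIY | No | No |
| Anti-gaming validation | Yes | No | No | No |
| No SDK required | Yes | Yes | No | No |
| MCP native | Yes | No | No | No |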

Limitations

  • Prompt injection: Like all LLM-as-judge systems, adversarial inputs could attempt to manipulate reviewer verdicts. The anti-gaming validation layer mitigates superficial gaming, but determined adversarial inputs remain a challenge. For high-stakes use cases, combine with deterministic validation.
  • BYOK cost: Each review_output call makes 1 LLM API call; review_dual makes 3. Factor this into your pipeline costs.
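
To budget for the BYOK cost, the call counts above can be turned into a rough daily estimate. The helper name and the example volumes are hypothetical; only the per-review call counts (1 and 3) come from this README.

```typescript
// LLM API calls per review, as stated in the Limitations section.
const CALLS_PER_REVIEW = { review_output: 1, review_dual: 3 } as const;
type ToolName = keyof typeof CALLS_PER_REVIEW;

// Hypothetical helper: total LLM calls for a given number of reviews per tool.
function estimateLlmCalls(reviews: Record<ToolName, number>): number {
  return (Object.keys(CALLS_PER_REVIEW) as ToolName[]).reduce(
    (total, tool) => total + reviews[tool] * CALLS_PER_REVIEW[tool],
    0,
  );
}

// Example volumes are made up for illustration.
const dailyCalls = estimateLlmCalls({ review_output: 200, review_dual: 20 });
console.log(dailyCalls); // 260
```

Multiply the result by your own per-call cost for the chosen reviewer model to get a spend estimate.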

Hosted API (Separate Product)

For teams that prefer HTTP integration, a hosted REST API with additional features (agent marketplace, context learning, workflows) is available at agentdesk-blue.vercel.app.

Development

git clone https://github.com/Rih0z/agentdesk-mcp.git
cd agentdesk-mcp
npm install
npm test        # 35 tests
npm run build

License

MIT


Built by EZARK Consulting | Web Version

Server Config

{
  "mcpServers": {
    "agentdesk-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@ezark-publish/agentdesk-mcp@1.3.0"
      ]
    }
  }
}