Iris

Created by iris-eval, 3 months ago

The first MCP-native eval and observability tool for AI agents. Any MCP-compatible agent discovers and uses Iris automatically — no SDK, no code changes. Log traces with hierarchical span trees, evaluate output quality with 12 built-in rules (PII detection, prompt injection, cost thresholds), and track what your agents are actually doing and costing you. Real-time dark-mode dashboard, OpenTelemetry-compatible span structure, self-hosted with SQLite. MIT licensed.

Tags: observability, evaluation


Iris — MCP-Native Agent Eval & Observability

See what your AI agents are actually doing. Iris is an open-source MCP server that logs every trace, evaluates output quality, and tracks costs across all your agents. Any MCP-compatible agent discovers and uses it automatically — no SDK, no code changes.

[Screenshot: Iris dashboard]

The Problem

Your agents are running in production. Traditional monitoring sees 200 OK and moves on. It has no idea the agent just:

Leaked a social security number in its response
Hallucinated an answer with zero factual grounding
Burned $0.47 on a single query — 4.7x your budget threshold
Made 6 tool calls when 2 would have sufficed

Iris sees all of it.

What You Get


Feature	Description
Trace Logging	Hierarchical span trees with per-tool-call latency, token usage, and cost in USD. Stored in SQLite, queryable instantly.
Output Evaluation	12 built-in rules across 4 categories: completeness, relevance, safety, and cost. Checks include PII detection, prompt injection patterns, and hallucination markers. Add custom rules with Zod schemas.
Cost Visibility	Aggregate cost across all agents over any time window. Set budget thresholds and get flagged when agents overspend.
Web Dashboard	Real-time dark-mode UI with trace visualization, eval results, and cost breakdowns.
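To make the Output Evaluation row concrete, here is a minimal sketch of what a safety rule and a cost rule could look like. It is not Iris's rule API: the `EvalInput`, `RuleResult`, and `Rule` shapes, the rule names, and the `evaluate` helper are all hypothetical, and the sketch uses plain TypeScript rather than Zod so it runs without dependencies.

```typescript
// Hypothetical rule shapes for illustration; Iris's real rule interface may differ.
interface EvalInput {
  output: string;  // text the agent returned
  costUsd: number; // total cost of the run in USD
}

interface RuleResult {
  rule: string;
  passed: boolean;
  detail: string;
}

type Rule = (input: EvalInput) => RuleResult;

// Safety rule: flag anything shaped like a US social security number (123-45-6789).
// The pattern has no global flag, so test() keeps no state between calls.
const ssnPattern = /\b\d{3}-\d{2}-\d{4}\b/;

const noSsn: Rule = ({ output }) => {
  const leaked = ssnPattern.test(output);
  return {
    rule: "no_ssn",
    passed: !leaked,
    detail: leaked ? "output contains an SSN-like pattern" : "no SSN-like pattern found",
  };
};

// Cost rule factory: fail when a run costs more than the budget threshold.
const maxCost = (thresholdUsd: number): Rule => ({ costUsd }) => ({
  rule: "max_cost",
  passed: costUsd <= thresholdUsd,
  detail: `cost $${costUsd.toFixed(2)} vs threshold $${thresholdUsd.toFixed(2)}`,
});

// Run every rule against one agent output and collect the results.
function evaluate(input: EvalInput, rules: Rule[]): RuleResult[] {
  return rules.map((rule) => rule(input));
}

const results = evaluate(
  { output: "Your SSN is 123-45-6789.", costUsd: 0.47 },
  [noSsn, maxCost(0.1)],
);
console.log(results);
```

Both rules fail for this input: the output contains an SSN-like string, and $0.47 exceeds the assumed $0.10 threshold.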

Quickstart

Add Iris to your Claude Desktop (or Cursor, Claude Code, Windsurf) MCP config:

{
  "mcpServers": {
    "iris-eval": {
      "command": "npx",
      "args": ["@iris-eval/mcp-server"]
    }
  }
}

That's it. Your agent discovers Iris and starts logging traces automatically.

Want the dashboard?

npx @iris-eval/mcp-server --dashboard
# Open http://localhost:6920

Other Install Methods

# Global install
npm install -g @iris-eval/mcp-server
iris-mcp --dashboard

# Docker
docker run -p 3000:3000 -v iris-data:/data ghcr.io/iris-eval/mcp-server

MCP Tools

Iris registers three tools that any MCP-compatible agent can invoke:

log_trace — Log an agent execution with spans, tool calls, token usage, and cost
evaluate_output — Score output quality against completeness, relevance, safety, and cost rules
get_traces — Query stored traces with filtering, pagination, and time-range support

Full tool schemas and configuration: iris-eval.com
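Agents invoke these tools through MCP's standard JSON-RPC 2.0 `tools/call` method. The sketch below builds such a request for `log_trace`. The envelope follows the MCP specification, but everything inside `arguments` (the `agent`, `costUsd`, and `spans` fields and the `Span` shape) is an assumption for illustration; check the published tool schemas for the real field names.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical span shape; these field names are assumptions, not Iris's schema.
interface Span {
  name: string;
  durationMs: number;
  children: Span[];
}

let nextId = 1;

// Wrap a tool name and its arguments in a tools/call request with a fresh id.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// A small hierarchical trace: one root span with two tool-call children.
const rootSpan: Span = {
  name: "answer_question",
  durationMs: 1840,
  children: [
    { name: "search_docs", durationMs: 620, children: [] },
    { name: "summarize", durationMs: 1100, children: [] },
  ],
};

const request = buildToolCall("log_trace", {
  agent: "support-bot", // hypothetical argument
  costUsd: 0.47,        // hypothetical argument
  spans: [rootSpan],    // hypothetical argument
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library builds this envelope for you; the sketch only shows what goes over the wire.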

Cloud Tier (Coming Soon)

Self-hosted Iris runs on your machine with SQLite. As your team grows, the cloud tier adds PostgreSQL, team dashboards, alerting, and managed infrastructure.

Join the waitlist to get early access.

Examples

Claude Desktop setup — MCP config for stdio and HTTP modes
TypeScript — MCP SDK client usage
LangChain — Agent instrumentation
CrewAI — Crew observability

Community

GitHub Issues — Bug reports and feature requests
GitHub Discussions — Questions and ideas
Contributing Guide — How to contribute
Roadmap — What's coming next

Configuration & Security

CLI Arguments

Flag	Default	Description
`--transport`	`stdio`	Transport type: `stdio` or `http`
`--port`	`3000`	HTTP transport port
`--db-path`	`~/.iris/iris.db`	SQLite database path
`--config`	`~/.iris/config.json`	Config file path
`--api-key`	—	API key for HTTP authentication
`--dashboard`	`false`	Enable web dashboard
`--dashboard-port`	`6920`	Dashboard port

Environment Variables

Variable	Description
`IRIS_TRANSPORT`	Transport type
`IRIS_PORT`	HTTP port
`IRIS_DB_PATH`	Database path
`IRIS_LOG_LEVEL`	Log level: debug, info, warn, error
`IRIS_DASHBOARD`	Enable dashboard (true/false)
`IRIS_API_KEY`	API key for HTTP authentication
`IRIS_ALLOWED_ORIGINS`	Comma-separated allowed CORS origins
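As a rough illustration of how these variables could map onto a typed configuration, the sketch below parses an environment object and falls back to the defaults from the CLI table. The `resolveConfig` function, the `IrisConfig` shape, and the parsing rules (for example, treating only the literal string `true` as enabling the dashboard) are assumptions; how Iris actually resolves precedence between flags, environment variables, and the config file is not documented here.

```typescript
type Transport = "stdio" | "http";

// Hypothetical config shape; not taken from Iris's source.
interface IrisConfig {
  transport: Transport;
  port: number;
  dbPath: string;
  dashboard: boolean;
  allowedOrigins: string[];
}

type Env = Record<string, string | undefined>;

// Accept only the two transports listed in the CLI table.
function parseTransport(value: string): Transport {
  if (value === "stdio" || value === "http") return value;
  throw new Error(`unsupported transport: ${value}`);
}

// Parse a TCP port and reject anything outside 1-65535.
function parsePort(value: string): number {
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${value}`);
  }
  return port;
}

// Build a config from environment variables, using the documented defaults.
function resolveConfig(env: Env): IrisConfig {
  return {
    transport: parseTransport(env.IRIS_TRANSPORT ?? "stdio"),
    port: parsePort(env.IRIS_PORT ?? "3000"),
    dbPath: env.IRIS_DB_PATH ?? "~/.iris/iris.db",
    dashboard: env.IRIS_DASHBOARD === "true",
    // Split the comma-separated origin list and drop empty entries.
    allowedOrigins: (env.IRIS_ALLOWED_ORIGINS ?? "")
      .split(",")
      .map((origin) => origin.trim())
      .filter((origin) => origin.length > 0),
  };
}

const config = resolveConfig({
  IRIS_TRANSPORT: "http",
  IRIS_PORT: "8080",
  IRIS_DASHBOARD: "true",
  IRIS_ALLOWED_ORIGINS: "http://localhost:6920, https://example.com",
});
console.log(config);
```

In a real process you would pass `process.env` instead of the literal object used here.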

Security

When using HTTP transport, Iris includes:

API key authentication with timing-safe comparison
CORS restricted to localhost by default
Rate limiting (100 req/min API, 20 req/min MCP)
Helmet security headers
Zod input validation on all routes
ReDoS-safe regex for custom eval rules
1MB request body limits
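For readers unfamiliar with the first item, this is one common way to do a timing-safe API key check in Node.js. It is a sketch of the general technique using `crypto.timingSafeEqual`, not Iris's actual implementation; the `apiKeyMatches` helper is hypothetical.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash a key to a fixed-length SHA-256 digest.
function sha256(value: string) {
  return createHash("sha256").update(value).digest();
}

// Compare digests rather than raw strings: timingSafeEqual throws when its
// inputs differ in byte length, and equal-length digests avoid that case
// while also hiding the length of the expected key.
function apiKeyMatches(presented: string, expected: string): boolean {
  return timingSafeEqual(sha256(presented), sha256(expected));
}

const expectedKey = "example-key-for-illustration-only";
console.log(apiKeyMatches("example-key-for-illustration-only", expectedKey)); // true
console.log(apiKeyMatches("wrong-key", expectedKey)); // false
```

A plain `===` comparison can return as soon as it finds the first mismatched character, which can leak how much of a guessed key was correct; a constant-time comparison removes that signal.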

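# Production deployment
# Generate the key once and keep it: clients must present the same value.
IRIS_API_KEY="$(openssl rand -hex 32)"
iris-mcp --transport http --port 3000 --api-key "$IRIS_API_KEY" --dashboard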

If Iris is useful to you, consider starring the repo — it helps others find it.

MIT Licensed.

Server Config

{
  "mcpServers": {
    "iris": {
      "command": "npx",
      "args": [
        "-y",
        "@iris-eval/mcp-server"
      ]
    }
  }
}
