Joern MCP

Created By

Lekssays, 3 months ago

A Model Context Protocol (MCP) server that provides AI assistants with static code analysis capabilities using Joern's Code Property Graph (CPG) technology.

🕷️ joern-mcp

Features

Multi-Language Support: Java, C/C++, JavaScript, Python, Go, Kotlin, C#, Ghidra, Jimple, PHP, Ruby, Swift
Docker Isolation: Each analysis session runs in a secure container
GitHub Integration: Analyze repositories directly from GitHub URLs
Session-Based: Persistent CPG sessions with automatic cleanup
Redis-Backed: Fast caching and session management
Async Queries: Non-blocking CPG generation and query execution

Quick Start

Prerequisites

Python 3.8+
Docker
Redis
Git

Installation

Clone and install dependencies:

git clone https://github.com/Lekssays/joern-mcp.git
cd joern-mcp
pip install -r requirements.txt

Setup (builds Joern image and starts Redis):

./setup.sh

Configure (optional):

cp config.example.yaml config.yaml
# Edit config.yaml as needed

Run the server:

python main.py
# Server will be available at http://localhost:4242

Integration with GitHub Copilot

The server uses Streamable HTTP transport for network accessibility and supports multiple concurrent clients.

Add to your VS Code settings.json:

{
  "github.copilot.advanced": {
    "mcp": {
      "servers": {
        "joern-mcp": {
          "url": "http://localhost:4242/mcp"
        }
      }
    }
  }
}

Make sure the server is running before using it with Copilot:

python main.py

Available Tools

Core Tools

create_cpg_session: Initialize analysis session from local path or GitHub URL
run_cpgql_query: Execute synchronous CPGQL queries with JSON output
run_cpgql_query_async: Execute asynchronous queries with status tracking
get_query_status: Check status of asynchronously running queries
get_query_result: Retrieve results from completed queries
cleanup_queries: Clean up old completed query results
get_session_status: Check session state and metadata
list_sessions: View active sessions with filtering
close_session: Clean up session resources
cleanup_all_sessions: Clean up multiple sessions and containers
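
The three async tools above are meant to be used together: submit with run_cpgql_query_async, poll get_query_status, then fetch with get_query_result. A minimal Python sketch of that loop follows. The `call_tool` callable is a hypothetical stand-in for whatever MCP client you use, and the `query_id` and `status` field names (and the `completed`/`failed` values) are assumptions, not taken from the server's documented responses.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that sends (tool_name, arguments)
# to the server and returns the decoded response as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_query_and_wait(
    call_tool: ToolCaller,
    session_id: str,
    query: str,
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a query asynchronously, poll its status, then fetch the result."""
    submitted = call_tool(
        "run_cpgql_query_async", {"session_id": session_id, "query": query}
    )
    query_id = submitted["query_id"]  # assumed response field

    waited = 0.0
    while True:
        status = call_tool("get_query_status", {"query_id": query_id})
        state = status.get("status")  # assumed field name and values
        if state == "completed":
            return call_tool("get_query_result", {"query_id": query_id})
        if state == "failed":
            raise RuntimeError(f"query {query_id} failed: {status}")
        if waited >= timeout:
            raise TimeoutError(f"query {query_id} still {state!r} after {timeout}s")
        sleep(poll_interval)
        waited += poll_interval
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays; check the actual response shapes before relying on the field names used here.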

Code Browsing Tools

get_codebase_summary: Get high-level overview of codebase (file count, method count, language)
list_files: List all source files with optional regex filtering
list_methods: Discover all methods/functions with filtering by name, file, or external status
get_method_source: Retrieve actual source code for specific methods
list_calls: Find function call relationships and dependencies
get_call_graph: Build call graphs (outgoing callees or incoming callers) with configurable depth
list_parameters: Get detailed parameter information for methods
find_literals: Search for hardcoded values (strings, numbers, API keys, etc)
get_code_snippet: Retrieve code snippets from files with line range

Security Analysis Tools

find_taint_sources: Locate likely external input points (taint sources)
find_taint_sinks: Locate dangerous sinks where tainted data could cause vulnerabilities
find_taint_flows: Find dataflow paths from sources to sinks using Joern dataflow primitives
find_argument_flows: Find flows where the exact same expression is passed to both source and sink calls
check_method_reachability: Check if one method can reach another through the call graph
list_taint_paths: List detailed taint flow paths from sources to sinks
get_program_slice: Build a program slice from a specific line or call

Example Usage

# Create session from GitHub
{
  "tool": "create_cpg_session",
  "arguments": {
    "source_type": "github",
    "source_path": "https://github.com/user/repo",
    "language": "java"
  }
}

# Get codebase overview
{
  "tool": "get_codebase_summary",
  "arguments": {
    "session_id": "abc-123-def"
  }
}

# List all methods in the codebase
{
  "tool": "list_methods",
  "arguments": {
    "session_id": "abc-123-def",
    "include_external": false,
    "limit": 50
  }
}

# Get source code for a specific method
{
  "tool": "get_method_source",
  "arguments": {
    "session_id": "abc-123-def",
    "method_name": "authenticate"
  }
}

# Find what methods call a specific function
{
  "tool": "get_call_graph",
  "arguments": {
    "session_id": "abc-123-def",
    "method_name": "execute_query",
    "depth": 2,
    "direction": "incoming"
  }
}

# Search for hardcoded secrets
{
  "tool": "find_literals",
  "arguments": {
    "session_id": "abc-123-def",
    "pattern": "(?i).*(password|secret|api_key).*",
    "limit": 20
  }
}

# Get code snippet from a file
{
  "tool": "get_code_snippet",
  "arguments": {
    "session_id": "abc-123-def",
    "filename": "src/main.c",
    "start_line": 10,
    "end_line": 25
  }
}

# Run custom CPGQL query
{
  "tool": "run_cpgql_query",
  "arguments": {
    "session_id": "abc-123-def",
    "query": "cpg.method.name.l"
  }
}

# Find potential security vulnerabilities
{
  "tool": "find_taint_sources",
  "arguments": {
    "session_id": "abc-123-def",
    "language": "c"
  }
}

# Check for data flows from sources to sinks
{
  "tool": "find_taint_flows",
  "arguments": {
    "session_id": "abc-123-def",
    "source_patterns": ["getenv", "fgets"],
    "sink_patterns": ["system", "sprintf"]
  }
}

# Find argument flows between function calls
{
  "tool": "find_argument_flows",
  "arguments": {
    "session_id": "abc-123-def",
    "source_name": "validate_input",
    "sink_name": "process_data",
    "arg_index": 0
  }
}

# Get detailed taint paths
{
  "tool": "list_taint_paths",
  "arguments": {
    "session_id": "abc-123-def",
    "source_pattern": "getenv",
    "sink_pattern": "system",
    "max_paths": 5
  }
}

# Build program slice for security analysis
{
  "tool": "get_program_slice",
  "arguments": {
    "session_id": "abc-123-def",
    "filename": "main.c",
    "line_number": 42,
    "call_name": "memcpy"
  }
}
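
The examples above use a `{"tool": ..., "arguments": ...}` shorthand. MCP clients normally build the wire format for you, but it can help to see the mapping: in the MCP specification a tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The sketch below only constructs that request body; the helper name `to_tools_call` is made up for illustration, and a real session also needs the protocol's initialization handshake, which is not shown.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # simple increasing JSON-RPC request ids


def to_tools_call(example: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a {"tool": ..., "arguments": ...} example in a JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": example["tool"],
            "arguments": example.get("arguments", {}),
        },
    }


example = {
    "tool": "get_codebase_summary",
    "arguments": {"session_id": "abc-123-def"},
}
print(json.dumps(to_tools_call(example), indent=2))
```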

Security Analysis Capabilities

The security analysis tools support three kinds of analysis:

Taint Analysis:

Source identification: find_taint_sources locates external input points
Sink identification: find_taint_sinks finds dangerous operations
Flow analysis: find_taint_flows traces data from sources to sinks
Argument flow analysis: find_argument_flows finds exact expression reuse between calls
Path enumeration: list_taint_paths provides detailed propagation chains

Program Slicing:

Backward slicing: get_program_slice shows all code affecting a specific operation
Data dependencies: Variable assignments and data flow tracking
Control dependencies: Conditional statements affecting execution

Reachability Analysis:

Method connectivity: check_method_reachability verifies call graph connections
Impact analysis: Understand potential execution paths
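
To make the reachability idea concrete, here is a small Python sketch of a breadth-first search over a call graph. It illustrates the general technique only; it is not the server's implementation (which runs inside Joern), and the graph and method names are made up.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional


def find_call_path(
    call_graph: Dict[str, Iterable[str]], source: str, target: str
) -> Optional[List[str]]:
    """Return one shortest caller-to-callee path, or None if unreachable."""
    if source == target:
        return [source]
    parents: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for callee in call_graph.get(current, ()):
            if callee in parents:
                continue  # already visited
            parents[callee] = current
            if callee == target:
                # Walk parent links back to the source, then reverse.
                path = [callee]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(callee)
    return None


# Toy call graph: caller -> callees.
graph = {
    "main": ["parse_args", "handle_request"],
    "handle_request": ["authenticate", "execute_query"],
    "authenticate": ["check_password"],
}
print(find_call_path(graph, "main", "execute_query"))
print(find_call_path(graph, "authenticate", "execute_query"))
```

The first call finds the path through `handle_request`; the second returns `None` because `authenticate` never leads to `execute_query` in this toy graph.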

Configuration

Key settings in config.yaml:

server:
  host: 0.0.0.0
  port: 4242
  log_level: INFO

redis:
  host: localhost
  port: 6379

sessions:
  ttl: 3600                # Session timeout (seconds)
  max_concurrent: 50       # Max concurrent sessions

cpg:
  generation_timeout: 600  # CPG generation timeout (seconds)
  supported_languages: [java, c, cpp, javascript, python, go, kotlin, csharp, ghidra, jimple, php, ruby, swift]

Environment variables override config file settings (e.g., MCP_HOST, REDIS_HOST, SESSION_TTL).
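
As a rough illustration of that precedence, the Python sketch below layers environment variables over values loaded from the config file. Only the variable names come from this README; which config key each one maps to, and the function name `apply_env_overrides`, are assumptions made for the example.

```python
import copy
from typing import Any, Callable, Dict, Mapping, Tuple

# Assumed mapping from environment variable to (section, key, parser).
# The variable names are from the README; the targets are guesses.
ENV_OVERRIDES: Dict[str, Tuple[str, str, Callable[[str], Any]]] = {
    "MCP_HOST": ("server", "host", str),
    "REDIS_HOST": ("redis", "host", str),
    "SESSION_TTL": ("sessions", "ttl", int),
}


def apply_env_overrides(
    config: Dict[str, Dict[str, Any]], env: Mapping[str, str]
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of config where any set environment variable wins."""
    merged = copy.deepcopy(config)
    for var, (section, key, parse) in ENV_OVERRIDES.items():
        if var in env:
            merged.setdefault(section, {})[key] = parse(env[var])
    return merged


file_config = {
    "server": {"host": "0.0.0.0", "port": 4242},
    "redis": {"host": "localhost", "port": 6379},
    "sessions": {"ttl": 3600},
}
overrides = {"REDIS_HOST": "redis.internal", "SESSION_TTL": "900"}
print(apply_env_overrides(file_config, overrides))
```

In practice you would pass `os.environ` as `env`; a plain dict is used here so the example does not depend on your shell.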

Example CPGQL Queries

Find all methods:

cpg.method.name.l

Find hardcoded secrets:

cpg.literal.code("(?i).*(password|secret|api_key).*").l

Find SQL injection risks:

cpg.call.name(".*execute.*").where(_.argument.isLiteral.code(".*SELECT.*")).l

Find complex methods:

cpg.method.filter(_.cyclomaticComplexity > 10).l
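
A note on the patterns in these queries: as far as we can tell, Joern's string steps such as name(...) and code(...) match the regular expression against the whole value, which is why the examples wrap their keywords in `.*`. The Python sketch below mimics that with `re.fullmatch` so you can try a pattern before sending it. The literal values are made up, and Python's regex dialect differs from Java's in some details, so treat this as an approximation.

```python
import re
from typing import Iterable, List

# Same pattern as the "Find hardcoded secrets" query above.
SECRET_PATTERN = r"(?i).*(password|secret|api_key).*"


def matching_literals(pattern: str, literals: Iterable[str]) -> List[str]:
    """Keep only the literals that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [lit for lit in literals if compiled.fullmatch(lit)]


# Made-up literal values; string literals keep their surrounding quotes here.
literals = ['"db_PASSWORD=hunter2"', '"hello world"', '"API_KEY"', "42"]

print(matching_literals(SECRET_PATTERN, literals))
# Without the surrounding .* nothing matches, because of the quotes.
print(matching_literals(r"(?i)(password|secret|api_key)", literals))
```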

Architecture

FastMCP Server: Built on the FastMCP 2.12.4 framework with Streamable HTTP transport
HTTP Transport: Network-accessible API supporting multiple concurrent clients
Docker Containers: One isolated Joern container per session
Redis: Session state and query result caching
Async Processing: Non-blocking CPG generation
CPG Caching: Reuse CPGs for identical source/language combinations
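
One way to picture the CPG caching item: derive a deterministic key from the source and language, and reuse a stored CPG whenever the key repeats. The Python sketch below shows that idea only. The key layout, the normalization rules, and the function name `cpg_cache_key` are assumptions; the server's actual scheme may differ.

```python
import hashlib


def cpg_cache_key(source_type: str, source_path: str, language: str) -> str:
    """Build a short, stable key for a (source, language) combination."""
    normalized = "\n".join(
        [
            source_type.strip().lower(),
            source_path.strip().rstrip("/"),  # ignore a trailing slash
            language.strip().lower(),
        ]
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


first = cpg_cache_key("github", "https://github.com/user/repo", "java")
again = cpg_cache_key("github", "https://github.com/user/repo/", "Java")
other = cpg_cache_key("github", "https://github.com/user/repo", "kotlin")
print(first == again)  # same source and language share a key
print(first == other)  # a different language gets its own key
```

A key like this says nothing about whether the repository changed since the CPG was built, so a real cache would also need some invalidation rule, such as including a commit hash.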

Development

Project Structure

joern-mcp/
├── src/
│   ├── services/       # Session, Docker, Git, CPG, Query services
│   ├── tools/          # MCP tool definitions
│   ├── utils/          # Redis, logging, validators
│   └── models.py       # Data models
├── playground/         # Test codebases and CPGs
├── main.py            # Server entry point
├── config.yaml        # Configuration
└── requirements.txt   # Dependencies

Running Tests

# Install dev dependencies
pip install -r requirements.txt

# Run tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

Code Quality

# Format
black src/ tests/
isort src/ tests/

# Lint
flake8 src/ tests/
mypy src/

Troubleshooting

Setup issues:

# Re-run setup to rebuild and restart services
./setup.sh

Docker issues:

# Verify Docker is running
docker ps

# Check Joern image
docker images | grep joern

# Check Redis container
docker ps | grep joern-redis

Redis connection issues:

# Test Redis connection
docker exec joern-redis redis-cli ping

# Check Redis logs
docker logs joern-redis

# Restart Redis
docker restart joern-redis

Server connectivity:

# Test server is running
curl http://localhost:4242/health

# Run the server in the foreground and watch its log output for errors
python main.py

Loading large projects (adjust Joern's memory settings):

joern:
  binary_path: ${JOERN_BINARY_PATH:joern}
  memory_limit: ${JOERN_MEMORY_LIMIT:16g}
  java_opts: ${JOERN_JAVA_OPTS:-Xmx16G -Xms8G -XX:+UseG1GC -Dfile.encoding=UTF-8}
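
The settings above use a `${NAME:default}` placeholder form. The Python sketch below shows one plausible way such placeholders resolve: the environment variable if it is set, otherwise the text after the first colon. This reading is an assumption about the config loader, not something the README states; under it, the java_opts default begins with `-Xmx16G`, with the colon acting as the separator.

```python
import re
from typing import Mapping

# ${NAME:default} where NAME is a typical environment variable name.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):([^}]*)\}")


def expand(value: str, env: Mapping[str, str]) -> str:
    """Replace each ${NAME:default} with env[NAME], or the default if unset."""
    return PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(2)), value)


print(expand("${JOERN_MEMORY_LIMIT:16g}", {}))
print(expand("${JOERN_MEMORY_LIMIT:16g}", {"JOERN_MEMORY_LIMIT": "32g"}))
print(expand("${JOERN_JAVA_OPTS:-Xmx16G -Xms8G -XX:+UseG1GC -Dfile.encoding=UTF-8}", {}))
```

So, under this assumption, exporting JOERN_MEMORY_LIMIT and JOERN_JAVA_OPTS before starting the server would be enough to raise the limits without editing the file.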

Debug logging:

export MCP_LOG_LEVEL=DEBUG
python main.py

Contributing

We welcome contributions! Please see CONTRIBUTING.md for:

Getting started with development setup
Code style and quality guidelines
Testing requirements and best practices
Submitting changes through pull requests
Reporting issues and feature requests
Documentation standards

Quick start for contributors:

git clone https://github.com/YOUR_USERNAME/joern-mcp.git
cd joern-mcp
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
./setup.sh

# Create feature branch
git checkout -b feature/your-feature

# Make changes and run tests
pytest && black . && flake8

# Submit pull request

See CONTRIBUTING.md for detailed guidelines.

Acknowledgments

Joern - Static analysis platform
FastMCP - MCP framework
Model Context Protocol - MCP specification

Built with ❤️ in Doha 🇶🇦

Server Config

{
  "mcpServers": {
    "joern-mcp": {
      "url": "http://localhost:4242/mcp"
    }
  }
}
