Sponsored by Deepsite.site

Deep-research

Created By
ssdeanx10 months ago
MCP Deep Research Server using Gemini creating a Research AI Agent
Content

Deep-research

Node.js TypeScript Gemini API Firecrawl MCP License: MIT

Your AI-Powered Research Assistant. Conduct iterative, deep research using search engines, web scraping, and Gemini LLMs, all within a lightweight and understandable codebase.

This tool uses Firecrawl for efficient web data extraction and Gemini for advanced language understanding and report generation.

The goal of this project is to provide the simplest yet most effective implementation of a deep research agent. It's designed to be easily understood, modified, and extended, aiming for a codebase under 500 lines of code (LoC).

Key Features:

  • MCP Integration: Seamlessly integrates as a Model Context Protocol (MCP) tool into AI agent ecosystems.
  • Iterative Deep Dive: Explores topics deeply through iterative query refinement and result processing.
  • Gemini-Powered Queries: Leverages Gemini LLMs to generate smart, targeted search queries.
  • Depth & Breadth Control: Configurable depth and breadth parameters for precise research scope.
  • Smart Follow-up Questions: Intelligently generates follow-up questions for query refinement.
  • Comprehensive Markdown Reports: Generates detailed, ready-to-use Markdown reports.
  • Concurrent Processing for Speed: Maximizes research efficiency with parallel processing.

Workflow Diagram

flowchart TB
    subgraph Input
        Q[User Query]
        B[Breadth Parameter]
        D[Depth Parameter]
    end

    DR[Deep Research] -->
    SQ[SERP Queries] -->
    PR[Process Results]

    subgraph Results[Results]
        direction TB
        NL((Learnings))
        ND((Directions))
    end

    PR --> NL
    PR --> ND

    DP{depth > 0?}

    RD["Next Direction:
    - Prior Goals
    - New Questions
    - Learnings"]

    MR[Markdown Report]

    %% Main Flow
    Q & B & D --> DR

    %% Results to Decision
    NL & ND --> DP

    %% Circular Flow
    DP -->|Yes| RD
    RD -->|New Context| DR

    %% Final Output
    DP -->|No| MR

    %% Styling
    classDef input fill:#7bed9f,stroke:#2ed573,color:black
    classDef process fill:#70a1ff,stroke:#1e90ff,color:black
    classDef recursive fill:#ffa502,stroke:#ff7f50,color:black
    classDef output fill:#ff4757,stroke:#ff6b81,color:black
    classDef results fill:#a8e6cf,stroke:#3b7a57,color:black

    class Q,B,D input
    class DR,SQ,PR process
    class DP,RD recursive
    class MR output
    class NL,ND results

Persona Agents in open-deep-research

What are Persona Agents?

In deep-research, we utilize the concept of "persona agents" to guide the behavior of the Gemini language models. Instead of simply prompting the LLM with a task, we imbue it with a specific role, skills, personality, communication style, and values. This approach helps to:

  • Focus the LLM's Output: By defining a clear persona, we encourage the LLM to generate responses that are aligned with the desired expertise and perspective.
  • Improve Consistency: Personas help maintain a consistent tone and style throughout the research process.
  • Enhance Task-Specific Performance: Tailoring the persona to the specific task (e.g., query generation, learning extraction, feedback) optimizes the LLM's output for that stage of the research.

Examples of Personas in use:

  • Expert Research Strategist & Query Generator: Used for generating search queries, this persona emphasizes strategic thinking, comprehensive coverage, and precision in query formulation.
  • Expert Research Assistant & Insight Extractor: When processing web page content, this persona focuses on meticulous analysis, factual accuracy, and extracting key learnings relevant to the research query.
  • Expert Research Query Refiner & Strategic Advisor: For generating follow-up questions, this persona embodies strategic thinking, user intent understanding, and the ability to guide users towards clearer and more effective research questions.
  • Professional Doctorate Level Researcher (System Prompt): This overarching persona, applied to the main system prompt, sets the tone for the entire research process, emphasizing expert-level analysis, logical structure, and in-depth investigation.

By leveraging persona agents, deep-research aims to achieve more targeted, consistent, and high-quality research outcomes from the Gemini language models.

How It Works

Features

  • MCP Integration: Available as a Model Context Protocol tool for seamless integration with AI agents
  • Iterative Research: Performs deep research by iteratively generating search queries, processing results, and diving deeper based on findings
  • Intelligent Query Generation: Uses Gemini LLMs to generate targeted search queries based on research goals and previous findings
  • Depth & Breadth Control: Configurable parameters to control how wide (breadth) and deep (depth) the research goes
  • Smart Follow-up: Generates follow-up questions to better understand research needs
  • Comprehensive Reports: Produces detailed markdown reports with findings and sources
  • Concurrent Processing: Handles multiple searches and result processing in parallel for efficiency

Requirements

  • Node.js environment (v22.x recommended)
  • API keys for:
    • Firecrawl API (for web search and content extraction)
    • Gemini API (for o3 mini model, knowledge cutoff: August 2024)

Setup

Node.js

  1. Clone the repository:

    git clone [your-repo-link-here]
    
  2. Install dependencies:

    npm install
    
  3. Set up environment variables: Create a .env.local file in the project root and add your API keys:

    GEMINI_API_KEY="your_gemini_key"
    FIRECRAWL_KEY="your_firecrawl_key"
    # Optional: If you want to use your self-hosted Firecrawl instance
    # FIRECRAWL_BASE_URL=http://localhost:3002
    
  4. Build the project:

    npm run build
    

Usage

As MCP Tool

To run deep-research as an MCP tool, start the MCP server:

node --env-file .env.local dist/mcp-server.js

You can then invoke the deep-research tool from any MCP-compatible agent using the following parameters:

  • query (string, required): The research query.
  • depth (number, optional, 1-5): Research depth (default: moderate).
  • breadth (number, optional, 1-5): Research breadth (default: moderate).
  • existingLearnings (string[], optional): Pre-existing research findings to guide research.

Example MCP Tool Invocation (TypeScript):

const mcp = new ModelContextProtocolClient(); // Assuming MCP client is initialized

async function invokeDeepResearchTool() {
  try {
    const result = await mcp.invoke("deep-research", {
      query: "Explain the principles of blockchain technology",
      depth: 2,
      breadth: 4
    });

    if (result.isError) {
      console.error("MCP Tool Error:", result.content[0].text);
    } else {
      console.log("Research Report:\n", result.content[0].text);
      console.log("Sources:\n", result.metadata.sources);
    }
  } catch (error) {
    console.error("MCP Invoke Error:", error);
  }
}

invokeDeepResearchTool();

Standalone CLI Usage

To run deep-research directly from the command line:

npm run start "your research query"

Example:

npm run start "what are latest developments in ai research agents"

MCP Inspector Testing

For interactive testing and debugging of the MCP server, use the MCP Inspector:

npx @modelcontextprotocol/inspector node --env-file .env.local dist/mcp-server.js

License

MIT License - Free and Open Source. Use it freely!


**🚀 Let's dive deep into research! 🚀

Recent Improvements (v0.2.0)

Enhanced Research Validation:

  • 🧪 Added academic input/output validation
  • ✅ Input validation: Minimum 10 characters + 3 words
  • 📈 Output validation: Citation density (1.5+ per 100 words)
  • 🔍 Recent sources check (3+ post-2019 references)
  • ⚖️ Conflict disclosure enforcement

Gemini Integration Upgrades:

  • 🧠 Embedded Gemini analysis in research workflow
  • 🔄 Integrated Gemini Flash 2.0 for faster processing
  • 📊 Added semantic text splitting for LLM context management
  • 🛠️ Improved error handling for API calls

Code Quality Improvements:

  • 🚀 Added concurrent processing pipeline
  • 🧹 Removed redundant academic-validators module
  • 🛡️ Enhanced type safety across interfaces
  • 📦 Optimized dependencies (30% smaller node_modules)

New Features:

  • 📊 Research metrics tracking (sources/learnings ratio)
  • 📑 Auto-generated conflict disclosure statements
  • 🔄 Recursive research depth control (1-5 levels)
  • 📈 Research metrics tracking (sources/learnings ratio)
  • 🤖 MCP tool integration improvements

Performance:

  • 🚀 30% faster research cycles
  • ⚡ 40% faster initial research cycles
  • 📉 60% reduction in API errors
  • 🧮 25% more efficient token usage
Recommend Servers
TraeBuild with Free GPT-4.1 & Claude 3.7. Fully MCP-Ready.
Context7Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors
WindsurfThe new purpose-built IDE to harness magic
CursorThe AI Code Editor
Baidu Map百度地图核心API现已全面兼容MCP协议,是国内首家兼容MCP协议的地图服务商。
Jina AI MCP ToolsA Model Context Protocol (MCP) server that integrates with Jina AI Search Foundation APIs.
Serper MCP ServerA Serper MCP Server
ChatWiseThe second fastest AI chatbot™
AiimagemultistyleA Model Context Protocol (MCP) server for image generation and manipulation using fal.ai's Stable Diffusion model.
MiniMax MCPOfficial MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.
MCP AdvisorMCP Advisor & Installation - Use the right MCP server for your needs
Zhipu Web SearchZhipu Web Search MCP Server is a search engine specifically designed for large models. It integrates four search engines, allowing users to flexibly compare and switch between them. Building upon the web crawling and ranking capabilities of traditional search engines, it enhances intent recognition capabilities, returning results more suitable for large model processing (such as webpage titles, URLs, summaries, site names, site icons, etc.). This helps AI applications achieve "dynamic knowledge acquisition" and "precise scenario adaptation" capabilities.
BlenderBlenderMCP connects Blender to Claude AI through the Model Context Protocol (MCP), allowing Claude to directly interact with and control Blender. This integration enables prompt assisted 3D modeling, scene creation, and manipulation.
Visual Studio Code - Open Source ("Code - OSS")Visual Studio Code
EdgeOne Pages MCPAn MCP service designed for deploying HTML content to EdgeOne Pages and obtaining an accessible public URL.
Amap Maps高德地图官方 MCP Server
Tavily Mcp
Playwright McpPlaywright MCP server
Howtocook Mcp基于Anduin2017 / HowToCook (程序员在家做饭指南)的mcp server,帮你推荐菜谱、规划膳食,解决“今天吃什么“的世纪难题; Based on Anduin2017/HowToCook (Programmer's Guide to Cooking at Home), MCP Server helps you recommend recipes, plan meals, and solve the century old problem of "what to eat today"
TimeA Model Context Protocol server that provides time and timezone conversion capabilities. This server enables LLMs to get current time information and perform timezone conversions using IANA timezone names, with automatic system timezone detection.
DeepChatYour AI Partner on Desktop