# Zentrix Agentic Workbench

Kimi Multi-Tool Dataset Generation Pipeline
This project contains a comprehensive pipeline for generating a high-quality, balanced dataset for training and evaluating conversational AI models with multi-tool and file-access capabilities. It combines synthetic data generation with the integration and normalization of external datasets.
The final output is a balanced training set ready for use, located at `final_datasets/training_set.jsonl`.
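Because the output is JSONL, each line is one self-contained JSON record, so you can inspect it without any special tooling. A minimal sketch that prints the first record; it makes no assumptions about the schema:

```python
import json

# Peek at the first record of the final training set.
# Whatever fields the pipeline emits will be printed as-is.
with open("final_datasets/training_set.jsonl", encoding="utf-8") as f:
    first_record = json.loads(next(f))

print(json.dumps(first_record, indent=2, ensure_ascii=False))
```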
## Project Structure
- `invocation_dataset.py`: Generates synthetic dialogues for multi-tool software invocation tasks.
- `file_access.py`: Generates synthetic dialogues for file-access tasks.
- `merge_external_datasets.py`: Downloads and normalizes three external tool-use datasets from Hugging Face.
- `merge_and_unify.py`: Merges the synthetic and external datasets into two main files: `multi_tool.jsonl` and `file_access.jsonl`.
- `validate_datasets.py`: Validates the schema of the merged datasets and performs final normalization.
- `create_training_set.py`: Combines all records from both datasets into the final training set, preserving 100% of the data.
- `main.py`: Main script with two modes: full pipeline execution (generates data from scratch) or quick training-set creation from existing datasets.
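The schema enforced by `validate_datasets.py` isn't documented in this README, but purely as an illustration, a minimal well-formedness pass over a JSONL file could look like this:

```python
import json

def check_jsonl(path):
    """Return line numbers that are not valid JSON objects.

    Illustrative only: the real validate_datasets.py also enforces
    the dataset schema, which is not documented in this README.
    """
    bad = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                bad.append(lineno)
                continue
            if not isinstance(record, dict):
                bad.append(lineno)
    return bad

print(check_jsonl("final_datasets/multi_tool.jsonl"))
```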
## How to Run

### Quick Training Set Creation

If you already have the processed datasets in `final_datasets/`, use:
```bash
python main.py --merge-only
```
This quickly creates the training set using existing data, skipping generation and downloads. Use this for iterating on the final dataset or when you want to ensure 100% data inclusion.
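Conceptually, this mode only has to combine and shuffle the two merged files. A minimal sketch of that step, assuming the file layout described under Final Output Files (the actual logic lives in `create_training_set.py` and may differ):

```python
import json
import random

# Combine 100% of the records from both merged datasets,
# shuffle them, and write the final training set.
records = []
for path in ("final_datasets/multi_tool.jsonl", "final_datasets/file_access.jsonl"):
    with open(path, encoding="utf-8") as f:
        records.extend(json.loads(line) for line in f if line.strip())

random.shuffle(records)

with open("final_datasets/training_set.jsonl", "w", encoding="utf-8") as out:
    for record in records:
        out.write(json.dumps(record, ensure_ascii=False) + "\n")

print(f"Wrote {len(records)} records")
```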
### Full Pipeline Execution

To generate everything from scratch, including synthetic data and external downloads:
```bash
python main.py
```
This executes the complete pipeline in order:
- Generate Synthetic Data: Runs `invocation_dataset.py` and `file_access.py`
- Download External Data: Runs `merge_external_datasets.py` to download from Hugging Face into `downloaded_datasets/`
- Merge & Unify: Runs `merge_and_unify.py` to combine sources into `final_datasets/multi_tool.jsonl` and `file_access.jsonl`
- Validate: Runs `validate_datasets.py` to clean and verify the merged files
- Create Training Set: Runs `create_training_set.py` to produce `final_datasets/training_set.jsonl`
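The orchestration itself is straightforward: each stage is a standalone script run in sequence. A minimal sketch, assuming `main.py` launches the stages as subprocesses (the real script also handles the flags below and may differ in detail):

```python
import subprocess
import sys

# Run each pipeline stage as a separate script, stopping on the
# first failure. The real main.py also honors --force,
# --skip-downloads, --merge-only, and --num-samples.
STAGES = [
    "invocation_dataset.py",
    "file_access.py",
    "merge_external_datasets.py",
    "merge_and_unify.py",
    "validate_datasets.py",
    "create_training_set.py",
]

for stage in STAGES:
    print(f"Running {stage} ...")
    result = subprocess.run([sys.executable, stage])
    if result.returncode != 0:
        sys.exit(f"{stage} failed with exit code {result.returncode}")
```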
### Additional Options

- `--force`: Skip file overwrite prompts
- `--skip-downloads`: Skip external dataset downloads (use only synthetic data)
- `--merge-only`: Quick mode; only create the training set from existing data
- `--num-samples`: Control the number of synthetic samples to generate
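If you need to extend `main.py`, these flags map naturally onto `argparse`. A minimal sketch, assuming standard `argparse` wiring (the real script's defaults and help text may differ):

```python
import argparse

# Assumed argparse wiring for the flags documented above;
# the real main.py may define them differently.
parser = argparse.ArgumentParser(description="Dataset generation pipeline")
parser.add_argument("--force", action="store_true",
                    help="Skip file overwrite prompts")
parser.add_argument("--skip-downloads", action="store_true",
                    help="Skip external downloads; use only synthetic data")
parser.add_argument("--merge-only", action="store_true",
                    help="Only create the training set from existing data")
parser.add_argument("--num-samples", type=int, default=None,
                    help="Number of synthetic samples to generate")
args = parser.parse_args()  # e.g. args.skip_downloads, args.num_samples
```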
## Final Output Files

- `final_datasets/training_set.jsonl`: The final dataset, ready for model training. Contains 100% of the records from both multi-tool and file-access tasks, combined and shuffled.
- `final_datasets/multi_tool.jsonl`: A large, unsampled collection of all multi-tool invocation dialogues.
- `final_datasets/file_access.jsonl`: A smaller, unsampled collection of all file-access dialogues.
## Dataset Evaluation

The project includes tooling for evaluating dataset quality using MCP.so:

### Quick Start
```bash
python evaluate_with_mcp.py --input final_datasets/training_set.jsonl \
    --mcp_url https://your-endpoint.mcp.so/serve \
    --api_key YOUR_API_KEY \
    --mode messages \
    --limit 100
```
### Evaluation Process

- Initial Testing: Start with a small sample (`--limit 100`) to verify everything works
- Analysis: Review accuracy metrics and failure patterns
- Full Evaluation: Run the complete dataset evaluation
- Iteration: Fine-tune the dataset based on the results
### Options

- `--mode`: Choose between `messages` (OpenAI-style) and `prompt` format
- `--limit`: Number of samples to evaluate (omit for the full dataset)
- `--api_key`: Your MCP.so API key
- `--mcp_url`: Your MCP endpoint URL
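The two `--mode` settings imply different record shapes. The examples below are hypothetical illustrations only: the field names are assumptions, not this dataset's documented schema, though `messages` follows the usual OpenAI chat layout:

```python
# Hypothetical record shapes for the two --mode settings.
# Field names are assumptions, not the pipeline's documented schema.
messages_record = {
    "messages": [
        {"role": "user", "content": "List the files in /tmp"},
        {"role": "assistant", "content": "Calling the file-access tool..."},
    ]
}

prompt_record = {
    "prompt": "List the files in /tmp",
    "completion": "Calling the file-access tool...",
}
```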
### Metrics Tracked
- Exact match accuracy
- Response structure consistency
- Tool invocation accuracy
- Common failure patterns
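As an illustration of the first metric, exact-match accuracy can be computed from a results file. The field names `prediction` and `reference` and the file name `eval_results.jsonl` are assumptions, since `evaluate_with_mcp.py`'s output format isn't shown here:

```python
import json

def exact_match_accuracy(path):
    """Fraction of records whose prediction equals the reference.

    Assumes each JSONL line has 'prediction' and 'reference' fields;
    adapt to the actual evaluation output format.
    """
    total = matches = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            total += 1
            if record["prediction"].strip() == record["reference"].strip():
                matches += 1
    return matches / total if total else 0.0

print(f"Exact match: {exact_match_accuracy('eval_results.jsonl'):.2%}")
```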
Author: royde
Date: September 2025
## Server Config

```json
{
  "mcpServers": {
    "zentrix": {
      "command": "node",
      "args": ["dist/server.js"],
      "env": {
        "PORT": "3000",
        "NODE_ENV": "production",
        "DATASET_PATH": "/app/datasets/training_set.jsonl",
        "KIMI_API_KEY": "your-kimi-api-key",
        "KIMI_MODEL": "kimi-v1",
        "LOCAL_API_URL": "http://localhost:3000",
        "PROD_API_URL": "https://your-service-name.mcp.so"
      },
      "mounts": {
        "../final_datasets": "/app/datasets"
      }
    }
  }
}
```
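Note that the `mounts` entry maps the local `../final_datasets` directory to `/app/datasets` inside the container, which is why `DATASET_PATH` points at `/app/datasets/training_set.jsonl`. Replace `KIMI_API_KEY`, `PROD_API_URL`, and the other placeholder values with your own before deploying.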