# Zentrix Agentic Workbench

Kimi Multi-Tool Dataset Generation Pipeline
This project contains a comprehensive pipeline for generating a high-quality, balanced dataset for training and evaluating conversational AI models with multi-tool and file-access capabilities. It combines synthetic data generation with the integration and normalization of external datasets.
The final output is a balanced training set ready for use, located at `final_datasets/training_set.jsonl`.
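Because the output is JSONL, each line is one self-contained JSON record, so you can inspect it without any special tooling. A minimal sketch that prints the first record; it makes no assumptions about the schema:

```python
import json

# Peek at the first record of the final training set.
# Whatever fields the pipeline emits will be printed as-is.
with open("final_datasets/training_set.jsonl", encoding="utf-8") as f:
    first_record = json.loads(next(f))

print(json.dumps(first_record, indent=2, ensure_ascii=False))
```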
## Project Structure
- `invocation_dataset.py`: Generates synthetic dialogues for multi-tool software invocation tasks.
- `file_access.py`: Generates synthetic dialogues for file-access tasks.
- `merge_external_datasets.py`: Downloads and normalizes three external tool-use datasets from Hugging Face.
- `merge_and_unify.py`: Merges the synthetic and external datasets into two main files: `multi_tool.jsonl` and `file_access.jsonl`.
- `validate_datasets.py`: Validates the schema of the merged datasets and performs final normalization.
- `create_training_set.py`: Combines all records from both datasets into the final training set, preserving 100% of the data.
- `main.py`: Main script with two modes: full pipeline execution (generates data from scratch) or quick training-set creation from existing datasets.
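The schema enforced by `validate_datasets.py` isn't documented in this README, but purely as an illustration, a minimal well-formedness pass over a JSONL file could look like this:

```python
import json

def check_jsonl(path):
    """Return line numbers that are not valid JSON objects.

    Illustrative only: the real validate_datasets.py also enforces
    the dataset schema, which is not documented in this README.
    """
    bad = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                bad.append(lineno)
                continue
            if not isinstance(record, dict):
                bad.append(lineno)
    return bad

print(check_jsonl("final_datasets/multi_tool.jsonl"))
```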
## How to Run

### Quick Training Set Creation

If you already have the processed datasets in `final_datasets/`, use:
```bash
python main.py --merge-only
```
This quickly creates the training set using existing data, skipping generation and downloads. Use this for iterating on the final dataset or when you want to ensure 100% data inclusion.
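Conceptually, this mode only has to combine and shuffle the two merged files. A minimal sketch of that step, assuming the file layout described under Final Output Files (the actual logic lives in `create_training_set.py` and may differ):

```python
import json
import random

# Combine 100% of the records from both merged datasets,
# shuffle them, and write the final training set.
records = []
for path in ("final_datasets/multi_tool.jsonl", "final_datasets/file_access.jsonl"):
    with open(path, encoding="utf-8") as f:
        records.extend(json.loads(line) for line in f if line.strip())

random.shuffle(records)

with open("final_datasets/training_set.jsonl", "w", encoding="utf-8") as out:
    for record in records:
        out.write(json.dumps(record, ensure_ascii=False) + "\n")

print(f"Wrote {len(records)} records")
```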
### Full Pipeline Execution

To generate everything from scratch, including synthetic data and external downloads:
```bash
python main.py
```
This executes the complete pipeline in order:
- Generate Synthetic Data: Runs `invocation_dataset.py` and `file_access.py`
- Download External Data: Runs `merge_external_datasets.py` to download from Hugging Face into `downloaded_datasets/`
- Merge & Unify: Runs `merge_and_unify.py` to combine sources into `final_datasets/multi_tool.jsonl` and `file_access.jsonl`
- Validate: Runs `validate_datasets.py` to clean and verify the merged files
- Create Training Set: Runs `create_training_set.py` to produce `final_datasets/training_set.jsonl`
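The orchestration itself is straightforward: each stage is a standalone script run in sequence. A minimal sketch, assuming `main.py` launches the stages as subprocesses (the real script also handles the flags below and may differ in detail):

```python
import subprocess
import sys

# Run each pipeline stage as a separate script, stopping on the
# first failure. The real main.py also honors --force,
# --skip-downloads, --merge-only, and --num-samples.
STAGES = [
    "invocation_dataset.py",
    "file_access.py",
    "merge_external_datasets.py",
    "merge_and_unify.py",
    "validate_datasets.py",
    "create_training_set.py",
]

for stage in STAGES:
    print(f"Running {stage} ...")
    result = subprocess.run([sys.executable, stage])
    if result.returncode != 0:
        sys.exit(f"{stage} failed with exit code {result.returncode}")
```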
### Additional Options

- `--force`: Skip file overwrite prompts
- `--skip-downloads`: Skip external dataset downloads (use only synthetic data)
- `--merge-only`: Quick mode; only create the training set from existing data
- `--num-samples`: Control the number of synthetic samples to generate
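If you need to extend `main.py`, these flags map naturally onto `argparse`. A minimal sketch, assuming standard `argparse` wiring (the real script's defaults and help text may differ):

```python
import argparse

# Assumed argparse wiring for the flags documented above;
# the real main.py may define them differently.
parser = argparse.ArgumentParser(description="Dataset generation pipeline")
parser.add_argument("--force", action="store_true",
                    help="Skip file overwrite prompts")
parser.add_argument("--skip-downloads", action="store_true",
                    help="Skip external downloads; use only synthetic data")
parser.add_argument("--merge-only", action="store_true",
                    help="Only create the training set from existing data")
parser.add_argument("--num-samples", type=int, default=None,
                    help="Number of synthetic samples to generate")
args = parser.parse_args()  # e.g. args.skip_downloads, args.num_samples
```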
## Final Output Files

- `final_datasets/training_set.jsonl`: The final dataset, ready for model training. Contains 100% of the records from both multi-tool and file-access tasks, combined and shuffled.
- `final_datasets/multi_tool.jsonl`: A large, unsampled collection of all multi-tool invocation dialogues.
- `final_datasets/file_access.jsonl`: A smaller, unsampled collection of all file-access dialogues.
## Dataset Evaluation

The project includes tooling for evaluating dataset quality using MCP.so:

### Quick Start
```bash
python evaluate_with_mcp.py --input final_datasets/training_set.jsonl \
    --mcp_url https://your-endpoint.mcp.so/serve \
    --api_key YOUR_API_KEY \
    --mode messages \
    --limit 100
```
### Evaluation Process

- Initial Testing: Start with a small sample (`--limit 100`) to verify everything works
- Analysis: Review accuracy metrics and failure patterns
- Full Evaluation: Run the complete dataset evaluation
- Iteration: Fine-tune the dataset based on the results
### Options

- `--mode`: Choose between `messages` (OpenAI-style) and `prompt` format
- `--limit`: Number of samples to evaluate (omit for the full dataset)
- `--api_key`: Your MCP.so API key
- `--mcp_url`: Your MCP endpoint URL
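The two `--mode` settings imply different record shapes. The examples below are hypothetical illustrations only: the field names are assumptions, not this dataset's documented schema, though `messages` follows the usual OpenAI chat layout:

```python
# Hypothetical record shapes for the two --mode settings.
# Field names are assumptions, not the pipeline's documented schema.
messages_record = {
    "messages": [
        {"role": "user", "content": "List the files in /tmp"},
        {"role": "assistant", "content": "Calling the file-access tool..."},
    ]
}

prompt_record = {
    "prompt": "List the files in /tmp",
    "completion": "Calling the file-access tool...",
}
```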
### Metrics Tracked
- Exact match accuracy
- Response structure consistency
- Tool invocation accuracy
- Common failure patterns
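As an illustration of the first metric, exact-match accuracy can be computed from a results file. The field names `prediction` and `reference` and the file name `eval_results.jsonl` are assumptions, since `evaluate_with_mcp.py`'s output format isn't shown here:

```python
import json

def exact_match_accuracy(path):
    """Fraction of records whose prediction equals the reference.

    Assumes each JSONL line has 'prediction' and 'reference' fields;
    adapt to the actual evaluation output format.
    """
    total = matches = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            total += 1
            if record["prediction"].strip() == record["reference"].strip():
                matches += 1
    return matches / total if total else 0.0

print(f"Exact match: {exact_match_accuracy('eval_results.jsonl'):.2%}")
```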
Author: royde
Date: September 2025
## Server Config

```json
{
  "mcpServers": {
    "zentrix": {
      "command": "node",
      "args": ["dist/server.js"],
      "env": {
        "PORT": "3000",
        "NODE_ENV": "production",
        "DATASET_PATH": "/app/datasets/training_set.jsonl",
        "KIMI_API_KEY": "your-kimi-api-key",
        "KIMI_MODEL": "kimi-v1",
        "LOCAL_API_URL": "http://localhost:3000",
        "PROD_API_URL": "https://your-service-name.mcp.so"
      },
      "mounts": {
        "../final_datasets": "/app/datasets"
      }
    }
  }
}
```
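Note that the `mounts` entry maps the local `../final_datasets` directory to `/app/datasets` inside the container, which is why `DATASET_PATH` points at `/app/datasets/training_set.jsonl`. Replace `KIMI_API_KEY`, `PROD_API_URL`, and the other placeholder values with your own before deploying.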