Toronto Open Data MCP Server
An MCP (Model Context Protocol) server that provides direct access to Toronto's Open Data through the CKAN API. This server allows LLM agents to efficiently discover, explore, and query Toronto's 500+ public datasets.
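To give a feel for what "access through the CKAN API" means, here is a minimal sketch of how a CKAN Action API URL can be assembled. The base URL and the `build_action_url` helper are assumptions for illustration and are not taken from this repository; `package_search` with `q` and `rows` is a standard CKAN action.

```python
from urllib.parse import urlencode

# Assumption: base URL of Toronto's CKAN instance; confirm against the portal docs.
CKAN_BASE_URL = "https://ckan0.cf.opendata.inter.prod-toronto.ca"


def build_action_url(action: str, **params) -> str:
    """Build a CKAN Action API URL such as .../api/3/action/package_search?q=..."""
    url = f"{CKAN_BASE_URL}/api/3/action/{action}"
    # Drop unset parameters so the query string stays minimal.
    query = {k: v for k, v in params.items() if v is not None}
    return f"{url}?{urlencode(query)}" if query else url


# Keyword search over dataset metadata, limited to five results.
search_url = build_action_url("package_search", q="restaurant inspection", rows=5)
```

The sketch only builds the URL; fetching it and decoding the JSON response is left out so the example stays offline.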
Features
- 🧠 Intelligent Query Engine: One-tool solution that automatically finds, processes, and returns relevant Toronto data
- 🎯 Relevance Scoring: Automatically ranks datasets by relevance to your question
- 🔍 Smart Filtering: Applies intelligent filters based on question context (recent data, failures, locations)
- 📊 Adaptive Data Processing: Handles both real-time API data and downloadable CSV files seamlessly
- 🚀 Agent-Optimized: Designed specifically for LLM agents with minimal decision complexity
- ✅ Robust Deployment: Ready for Railway/cloud deployment with health checks
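The relevance scoring feature can be pictured with a simple keyword-overlap sketch. This is not the server's actual scoring logic, which is not documented here; the function names and the double weight for title matches are illustrative assumptions. `title` and `notes` are the standard CKAN fields for a dataset's name and description.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevance_score(question: str, title: str, description: str = "") -> float:
    """Score a dataset by the share of question words it mentions.

    Title matches count double; this weighting is an illustrative assumption.
    """
    words = tokenize(question)
    if not words:
        return 0.0
    title_hits = len(words & tokenize(title))
    desc_hits = len(words & tokenize(description))
    return (2 * title_hits + desc_hits) / (2 * len(words))


def rank_datasets(question: str, datasets: list[dict]) -> list[dict]:
    """Return datasets sorted from most to least relevant."""
    return sorted(
        datasets,
        key=lambda d: relevance_score(question, d.get("title", ""), d.get("notes", "")),
        reverse=True,
    )
```

Exact word matching is crude (it treats "restaurant" and "restaurants" as different words), so a real implementation would likely add stemming or fuzzy matching.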
Installation
- Clone the repository:
  git clone https://github.com/yourusername/toronto-open-data-mcp-server.git
  cd toronto-open-data-mcp-server
- Install dependencies:
  # Install main dependencies
  pip install -e .
  # Install test dependencies (optional)
  pip install -e ".[test]"
- Run the server:
  python main.py
Usage
Quick Start
Primary Tool (Recommended):
- toronto_find_and_query_data(user_question): One-step solution that finds and processes the most relevant data
Alternative Tools:
- toronto_start_here(): Get usage guidance and workflow explanation
- toronto_search_datasets(query): Manual dataset discovery
- toronto_smart_data_helper(dataset_id, user_question): Process specific datasets
Example Usage
# Primary approach - simple and powerful
toronto_find_and_query_data("What restaurants failed health inspections recently?")
# Alternative approach - manual workflow
toronto_start_here()
toronto_search_datasets("restaurant inspection")
toronto_smart_data_helper("dinesafe", "recent restaurant inspection failures")
Popular Datasets
- dinesafe: Restaurant inspections and health scores
- traffic-signals: Traffic light locations and timing
- parks-facilities: Parks, pools, and recreation facilities
- business-licences: Licensed businesses in Toronto
- building-permits: Construction and renovation permits
Testing
This project includes unit, integration, and end-to-end workflow tests.
Prerequisites
Install test dependencies:
pip install -e ".[test]"
Running Tests
Quick Test Commands
# Run unit tests only (recommended for development)
python run_tests.py
# Run with verbose output
python run_tests.py --verbose
# Run with coverage report
python run_tests.py --coverage
# Run integration tests (hits real Toronto API)
python run_tests.py --integration
# Run all tests (unit + integration)
python run_tests.py --all
Direct pytest Commands
# Unit tests only (excludes integration tests)
pytest test_toronto_mcp.py -m "not integration"
# Integration tests only (hits real API)
pytest test_toronto_mcp.py -m "integration"
# All tests
pytest test_toronto_mcp.py test_workflows.py
# With coverage
pytest --cov=main --cov-report=html test_toronto_mcp.py
Test Structure
- test_toronto_mcp.py: Core unit and integration tests
  - TestMakeApiRequest: API request functionality
  - TestTorontoSearchDatasets: Dataset search functionality
  - TestTorontoSmartDataHelper: Smart helper functionality
  - TestIntegration: Integration tests with real API
- test_workflows.py: End-to-end workflow tests
  - TestCommonWorkflows: Typical user workflows
  - TestErrorScenarios: Error handling
  - TestUserStories: Complete user stories
Test Categories
- Unit Tests: Fast tests with mocked API calls (default)
- Integration Tests: Tests that hit the real Toronto Open Data API
- Workflow Tests: End-to-end scenarios demonstrating common usage patterns
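As a rough illustration of the unit-test category, the sketch below replaces the network call with a mock so the test runs offline. `search_titles` and the `request_fn` signature are hypothetical and are not taken from `main.py`; the response shape (`success`, `result.results`) follows the standard CKAN `package_search` envelope.

```python
from unittest.mock import Mock


def search_titles(query: str, request_fn) -> list[str]:
    """Return dataset titles for a query using an injected request function.

    `request_fn` stands in for the server's API helper (hypothetical signature:
    it takes an action name and a params dict and returns the decoded JSON).
    """
    response = request_fn("package_search", {"q": query})
    if not response.get("success"):
        return []
    return [pkg["title"] for pkg in response["result"]["results"]]


def test_search_titles_uses_mocked_api():
    # The mock replaces the network call, so the test runs fast and offline.
    fake_request = Mock(return_value={
        "success": True,
        "result": {"results": [{"title": "Dinesafe"}]},
    })
    assert search_titles("restaurant", fake_request) == ["Dinesafe"]
    fake_request.assert_called_once_with("package_search", {"q": "restaurant"})
```

An integration test would call the same function with a real request helper and would typically carry the `integration` marker so it can be excluded with `-m "not integration"`.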
Coverage
Run tests with coverage to see how much of the code is tested:
python run_tests.py --coverage
# View report: open htmlcov/index.html
API Reference
Core Tools
toronto_start_here() -> str
Entry-point call that returns workflow guidance and a summary of the server's capabilities.
toronto_search_datasets(query: str, limit: int = 10) -> str
Search Toronto datasets by keywords.
toronto_smart_data_helper(dataset_id: str, user_question: str, limit: int = 10) -> str
Intelligent helper that automatically handles both API and CSV data sources.
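One plausible way to choose between the two source types is sketched below; the helper's real selection logic is not documented here, and `pick_source` is a hypothetical name. `datastore_active` and `format` are standard fields on CKAN resource records.

```python
from typing import Optional


def pick_source(resources: list[dict]) -> Optional[tuple[str, dict]]:
    """Choose how to read a dataset: the datastore API if available, else a CSV file."""
    # Prefer a resource that is loaded into the CKAN datastore (queryable via API).
    for resource in resources:
        if resource.get("datastore_active"):
            return ("api", resource)
    # Otherwise fall back to the first downloadable CSV file.
    for resource in resources:
        if str(resource.get("format", "")).upper() == "CSV":
            return ("csv", resource)
    # No usable resource found.
    return None
```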
toronto_query_dataset_data(dataset_id: str, filters: Dict = None, fields: List = None, limit: int = 10, sort: str = None) -> str
Advanced querying with filtering and sorting for API datasets.
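The arguments above map fairly directly onto CKAN's `datastore_search` action. The following is a sketch of that mapping, assuming the dataset ID has already been resolved to a datastore resource ID; `build_datastore_params` is a hypothetical helper, not a function from this server.

```python
import json
from typing import Optional


def build_datastore_params(
    resource_id: str,
    filters: Optional[dict] = None,
    fields: Optional[list] = None,
    limit: int = 10,
    sort: Optional[str] = None,
) -> dict:
    """Translate tool arguments into CKAN datastore_search query parameters."""
    params = {"resource_id": resource_id, "limit": limit}
    if filters:
        # In a GET request, CKAN expects filters as a JSON-encoded object.
        params["filters"] = json.dumps(filters)
    if fields:
        # fields is sent as a comma-separated list of column names.
        params["fields"] = ",".join(fields)
    if sort:
        # Column name plus direction, e.g. "SOME_DATE desc" (hypothetical column).
        params["sort"] = sort
    return params
```

Column names depend on the dataset, so `toronto_get_dataset_schema` is the natural way to look them up before building filters.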
Utility Tools
toronto_popular_datasets() -> str
Quick access to commonly used datasets.
toronto_get_dataset_schema(dataset_id: str) -> str
Get field names and types for API datasets.
toronto_fetch_csv_data(csv_url: str, max_lines: int = 50) -> str
Fetch and preview CSV file content.
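A preview of this kind can be sketched with the standard `csv` module. The example works on already-downloaded text and assumes `max_lines` counts data rows, not the header; the tool's actual behaviour may differ, and `preview_csv` is a hypothetical name.

```python
import csv
import io


def preview_csv(text: str, max_lines: int = 50) -> list[dict]:
    """Parse CSV text and return at most `max_lines` data rows as dicts."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        if len(rows) >= max_lines:
            break  # Stop early so large files are never fully parsed.
        rows.append(row)
    return rows
```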
Architecture
- FastMCP Framework: Built on the FastMCP framework for easy tool definition
- CKAN API: Direct integration with Toronto's CKAN-based Open Data portal
- Collaborative Design: Works alongside web search rather than replacing it
- Error Recovery: Intelligent error handling with actionable suggestions
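To illustrate the error-recovery idea, here is a minimal sketch that turns a failed CKAN response into a message with a next step. CKAN Action API responses carry a `success` flag and, on failure, an `error` object with `__type` and `message`; the hint wording and the `summarize_ckan_response` name are assumptions, not the server's actual output.

```python
def summarize_ckan_response(response: dict) -> str:
    """Turn a decoded CKAN Action API response into a short status message."""
    if response.get("success"):
        return "ok"
    error = response.get("error") or {}
    error_type = error.get("__type", "Unknown Error")
    message = error.get("message", "no details provided")
    # Suggestions are illustrative; the server's real wording may differ.
    if error_type == "Not Found Error":
        hint = "Check the dataset ID, or search for it with toronto_search_datasets."
    else:
        hint = "Retry later, or fall back to toronto_search_datasets."
    return f"{error_type}: {message}. {hint}"
```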
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Run the test suite:
  python run_tests.py --all
- Submit a pull request