DataHub MCP Server
A Model Context Protocol (MCP) server implementation that exposes DataHub API endpoints as tools for use with MCP clients. Built with FastMCP 2.6.0 and designed following Python best practices.
Features
- Comprehensive API Access: Exposes all DataHub API endpoints as MCP tools
- Multiple Transport Modes: Supports both stdio and SSE transports
- Secure Authentication: Optional bearer token authentication with RSA key pairs
- Robust Error Handling: Comprehensive error handling and detailed logging
- Flexible Configuration: JSON-based configuration with sensible defaults
- Context7 Integration: Ready for integration with Windsurf systems
- Type Safety: Complete type annotations for better IDE support
Installation
Prerequisites
- Python 3.8 or higher
- Virtual environment (recommended)
Setup
# Clone the repository
git clone https://github.com/yourusername/mcp_datahub.git
cd mcp_datahub
# Create and activate a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install the package in development mode
pip install -e .
# Install development dependencies
pip install -e ".[dev]"
Usage
Command Line
# Run with stdio transport
python -m src.fastmcp_server.server --transport stdio
# Run with SSE transport (default)
python -m src.fastmcp_server.server --transport sse --port 8000 --host 0.0.0.0
# Run with authentication enabled
python -m src.fastmcp_server.server --auth
# Run with custom configuration file
python -m src.fastmcp_server.server --config config.json
# Specify log level
python -m src.fastmcp_server.server --log-level DEBUG
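For readers wondering how the flags above fit together, here is a minimal sketch of an argument parser that accepts them. It is not the project's actual code: `build_parser` is a hypothetical name, and the defaults for host, port, and log level are assumptions borrowed from the sample configuration in this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(prog="server")
    # SSE is described as the default transport.
    parser.add_argument("--transport", choices=["stdio", "sse"], default="sse")
    # Host, port, and log level defaults are assumed from the sample config.json.
    parser.add_argument("--host", default="0.0.0.0")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--log-level", default="INFO")
    # --auth is a plain switch; --config points at an optional JSON file.
    parser.add_argument("--auth", action="store_true")
    parser.add_argument("--config", default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--transport", "stdio", "--log-level", "DEBUG"])
    print(args.transport, args.log_level, args.auth)  # prints: stdio DEBUG False
```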
Direct Script Execution
You can also run the server script directly:
cd src/fastmcp_server
python server.py --transport sse --port 8000 --auth
Configuration File
Create a config.json file to customize server behavior:
{
  "api_base_url": "https://datahub.zooxsmart.com",
  "timeout": 30,
  "retry_attempts": 3,
  "log_level": "INFO",
  "cors_origins": ["*"],
  "transport": "sse",
  "port": 8000,
  "host": "0.0.0.0"
}
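The "sensible defaults" behavior can be pictured as a merge of the file's values over a built-in dictionary. The sketch below is an illustration under assumptions, not the server's real loader: `DEFAULTS` and `load_config` are hypothetical names, and the default values are copied from the sample file above.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Assumed defaults, taken from the sample config.json in this README.
DEFAULTS: Dict[str, Any] = {
    "timeout": 30,
    "retry_attempts": 3,
    "log_level": "INFO",
    "transport": "sse",
    "port": 8000,
    "host": "0.0.0.0",
}


def load_config(path: Optional[str] = None) -> Dict[str, Any]:
    """Return the defaults, overridden by any keys found in the JSON file."""
    config = dict(DEFAULTS)  # copy so the module-level defaults stay untouched
    if path is not None and Path(path).is_file():
        with open(path, encoding="utf-8") as handle:
            config.update(json.load(handle))
    return config


if __name__ == "__main__":
    # With no file, the defaults are returned as-is.
    print(load_config()["port"])
```

Keys missing from the file keep their default values, so a config.json that only sets `"port"` would still be valid under this scheme.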
Available Tools
The server exposes the following DataHub API tools:
Person Data Services
- get_person_by_cpf: Get person information by CPF
- get_person_by_linkedin: Get person information by LinkedIn URL
- get_person_by_email: Get person information by email address
- get_person_by_name: Get person information by partial name
- get_person_by_name_birthdate: Get person information by name and birthdate
- get_person_by_phone: Get person information by phone number
- validate_person: Validate person information
- validate_person_v2: Validate person information using V2 endpoint
- validate_person_doc: Validate person document
- get_person_webhook_cpf: Get person information by CPF using webhook
Company Data Services
- get_company_by_cnpj: Get company information by CNPJ
- get_company_webhook_cnpj: Get company information by CNPJ using webhook
- validate_company: Validate company information
- validate_company_v2: Validate company information using V2 endpoint
Vehicle Data Services
- get_vehicle_webhook_plate: Get vehicle information by plate using webhook
- get_vehicle_webhook_doc_plate: Get vehicle information by document and plate using webhook
- get_vehicle_fipe_plate: Get vehicle FIPE information by plate
Process Data Services
- get_process_by_number: Get process information by process number
Infobip Services
- validate_cpf_phonenumber: Validate CPF and phone number combination
Market Data Services
- get_market_cep: Get market information by CEP (postal code)
Project Structure
mcp_datahub/
├── src/                        # Source code directory
│   ├── __init__.py
│   ├── fastmcp_server/         # MCP server implementation
│   │   ├── __init__.py
│   │   └── server.py           # Main server implementation
│   ├── api/                    # API tools and integrations
│   │   ├── __init__.py
│   │   └── datahub_tools.py    # DataHub API tools
│   └── utils/                  # Utility functions
│       ├── __init__.py
│       └── helpers.py          # Helper functions
├── tests/                      # Test directory
│   ├── __init__.py
│   ├── conftest.py             # Pytest fixtures
│   ├── test_server.py          # Tests for server.py
│   ├── test_api_tools.py       # Tests for datahub_tools.py
│   └── test_utils.py           # Tests for helpers.py
├── config.json                 # Server configuration
├── pyproject.toml              # Project metadata and dependencies
├── requirements.txt            # Pinned dependencies
├── .windsurfrules              # Project rules and guidelines
└── README.md                   # Project documentation
Development
Running Tests
# Run all tests
pytest
# Run tests with coverage report
pytest --cov=src
# Run specific test file
pytest tests/test_server.py
Code Quality
# Format code with black
black src tests
# Sort imports with isort
isort src tests
# Type checking with mypy
mypy src
Tool Discovery
When the server is running, you can discover available tools using the included tool discovery script:
python tools_discovery.py --url http://127.0.0.1:8000
This will output all available tools categorized by their functionality and save the complete tool information to tool_info.json.
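To give a feel for what "categorized by their functionality" could mean, the sketch below groups tool names by a keyword in the name. This is only an assumption about one possible approach; the actual logic of tools_discovery.py is not shown in this README, and `CATEGORY_KEYWORDS` and `categorize` are hypothetical names.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical keyword-to-category mapping based on the section names in this README.
CATEGORY_KEYWORDS: Dict[str, str] = {
    "person": "Person Data Services",
    "company": "Company Data Services",
    "vehicle": "Vehicle Data Services",
    "process": "Process Data Services",
    "market": "Market Data Services",
}


def categorize(tool_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group tool names by the first known keyword found among their parts."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name in tool_names:
        parts = name.split("_")
        category = next(
            (label for key, label in CATEGORY_KEYWORDS.items() if key in parts),
            "Other",  # names without a known keyword, e.g. validate_cpf_phonenumber
        )
        grouped[category].append(name)
    return dict(grouped)


if __name__ == "__main__":
    names = ["get_person_by_cpf", "get_company_by_cnpj", "get_market_cep"]
    # Serialize the grouping the way a tool_info.json file might store it.
    print(json.dumps(categorize(names), indent=2))
```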
Security Considerations
- When running in production, always enable authentication with the --auth flag
- Consider restricting CORS origins in the configuration file
- The server generates an access token when started with authentication enabled
- Token expiration can be configured (default: 1 hour)
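On the client side, bearer authentication usually means sending the token in an `Authorization` header. The following is a minimal standard-library sketch of that idea; `build_authorized_request` is a hypothetical helper, and the `/sse` path and the token placeholder are assumptions, not values confirmed by this project.

```python
import urllib.request


def build_authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a request object that carries the token as a bearer credential."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    # The URL path and token below are placeholders for illustration only.
    req = build_authorized_request(
        "http://127.0.0.1:8000/sse", "<token-from-server-startup-output>"
    )
    # Nothing is sent here; the request is only constructed and inspected.
    print(req.get_header("Authorization"))
```

Most MCP clients accept custom headers in their own configuration, so the same header value applies there.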
Troubleshooting
- Connection issues: Verify the host and port settings in your configuration
- Authentication errors: Ensure you're using the correct token from the server startup output
- Import errors: Make sure your Python path includes the project root
- Tool errors: Check the server logs for detailed error information
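For connection issues in SSE mode, a quick first check is whether anything is listening on the configured host and port. The helper below is a generic sketch using only the standard library; `is_port_open` is a hypothetical name and is not part of this project.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Host and port match the sample configuration; adjust to your setup.
    print(is_port_open("127.0.0.1", 8000))
```

A `False` result points at the host, port, or firewall settings rather than at authentication or the tools themselves.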
Contributing
- Follow the Python style guidelines (PEP 8)
- Add type annotations to all functions and classes
- Write comprehensive docstrings
- Add tests for new functionality
- Update documentation as needed
License
MIT License
Context7 Integration
This project includes Context7 MCP integration for up-to-date code documentation in Windsurf. To trigger Context7, add "use context7" at the end of your prompt.