Sponsored by Deepsite.site

Sail MCP Server for Spark SQL

Created By
lakehq10 months ago
Sail is an open-source computation framework that serves as a drop-in replacement for Apache Spark (SQL and DataFrame API) in both single-host and distributed settings. The built-in MCP server in Sail exposes tools for LLM agents to register datasets and execute Spark SQL queries.
Content

Sail

Build Status PyPI Release PyPI Downloads Static Slack Badge

The mission of Sail is to unify stream processing, batch processing, and compute-intensive (AI) workloads. Currently, Sail features a drop-in replacement for Spark SQL and the Spark DataFrame API in both single-host and distributed settings.

✨News✨: Please check out our MCP server that brings data analytics in Spark to both LLM agents and humans!

Installation

Sail is available as a Python package on PyPI. You can install it using pip.

pip install "pysail[spark]"

Alternatively, you can install Sail from source for better performance for your hardware architecture. You can follow the Installation guide for more information.

Getting Started

Starting the Sail Server

Option 1: Command Line Interface You can start the local Sail server using the sail command.

sail spark server --port 50051

Option 2: Python API You can start the local Sail server using the Python API.

from pysail.spark import SparkConnectServer

server = SparkConnectServer(port=50051)
server.start(background=False)

Option 3: Kubernetes You can deploy Sail on Kubernetes and run Sail in cluster mode for distributed processing. Please refer to the Kubernetes Deployment Guide for instructions on building the Docker image and writing the Kubernetes manifest YAML file.

kubectl apply -f sail.yaml
kubectl -n sail port-forward service/sail-spark-server 50051:50051

Connecting to the Sail Server

Once you have a running Sail server, you can connect to it in PySpark. No changes are needed in your PySpark code!

from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://localhost:50051").getOrCreate()
spark.sql("SELECT 1 + 1").show()

Please refer to the Getting Started guide for further details.

Documentation

The documentation of the latest Sail version can be found here.

Further Reading

Contributing

Contributions are more than welcome!

Please submit GitHub issues for bug reports and feature requests. You are also welcome to ask questions in GitHub discussions.

Feel free to create a pull request if you would like to make a code change. You can refer to the development guide to get started.

Support

LakeSail offers flexible enterprise support options for Sail. Please contact us to learn more.

Server Config

{
  "mcpServers": {
    "sail": {
      "command": "sail",
      "args": [
        "spark",
        "mcp-server",
        "--transport",
        "stdio"
      ]
    }
  }
}
Recommend Servers
TraeBuild with Free GPT-4.1 & Claude 3.7. Fully MCP-Ready.
Jina AI MCP ToolsA Model Context Protocol (MCP) server that integrates with Jina AI Search Foundation APIs.
MCP AdvisorMCP Advisor & Installation - Use the right MCP server for your needs
DeepChatYour AI Partner on Desktop
Y GuiA web-based graphical interface for AI chat interactions with support for multiple AI models and MCP (Model Context Protocol) servers.
Zhipu Web SearchZhipu Web Search MCP Server is a search engine specifically designed for large models. It integrates four search engines, allowing users to flexibly compare and switch between them. Building upon the web crawling and ranking capabilities of traditional search engines, it enhances intent recognition capabilities, returning results more suitable for large model processing (such as webpage titles, URLs, summaries, site names, site icons, etc.). This helps AI applications achieve "dynamic knowledge acquisition" and "precise scenario adaptation" capabilities.
Playwright McpPlaywright MCP server
Visual Studio Code - Open Source ("Code - OSS")Visual Studio Code
Amap Maps高德地图官方 MCP Server
AiimagemultistyleA Model Context Protocol (MCP) server for image generation and manipulation using fal.ai's Stable Diffusion model.
Baidu Map百度地图核心API现已全面兼容MCP协议,是国内首家兼容MCP协议的地图服务商。
CursorThe AI Code Editor
Tavily Mcp
MiniMax MCPOfficial MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.
BlenderBlenderMCP connects Blender to Claude AI through the Model Context Protocol (MCP), allowing Claude to directly interact with and control Blender. This integration enables prompt assisted 3D modeling, scene creation, and manipulation.
WindsurfThe new purpose-built IDE to harness magic
TimeA Model Context Protocol server that provides time and timezone conversion capabilities. This server enables LLMs to get current time information and perform timezone conversions using IANA timezone names, with automatic system timezone detection.
EdgeOne Pages MCPAn MCP service designed for deploying HTML content to EdgeOne Pages and obtaining an accessible public URL.
ChatWiseThe second fastest AI chatbot™
Howtocook Mcp基于Anduin2017 / HowToCook (程序员在家做饭指南)的mcp server,帮你推荐菜谱、规划膳食,解决“今天吃什么“的世纪难题; Based on Anduin2017/HowToCook (Programmer's Guide to Cooking at Home), MCP Server helps you recommend recipes, plan meals, and solve the century old problem of "what to eat today"
Serper MCP ServerA Serper MCP Server