A Node.js Model Context Protocol (MCP) server that exposes Together AI's inference endpoints — chat completions, image generation, vision, and embeddings — as tools callable from Claude Desktop, Cursor, VS Code, and any other MCP-compatible client.
I built this MCP server to work around an issue I ran into when accessing reasoning models through Together AI.
Together AI's largest reasoning models (GLM-5, Qwen3.5-397B, MiniMax M2.5, Kimi K2.5) use a non-standard response format. During chain-of-thought generation, these models write their reasoning trace into choices[0].message.reasoning while leaving choices[0].message.content as an empty string. The final answer only appears in message.content once thinking is complete.