Doc Ops Server

Created by Tele-AI, 4 months ago
A Model Context Protocol server for seamless document format conversion and processing

Document Operations MCP Server

Language / 语言: English | 中文

Document Operations MCP Server - A universal MCP server for document processing, conversion, and automation. Handle PDF, DOCX, HTML, Markdown, and more through a unified API and toolset.

Table of Contents

  1. Getting Started
  2. System Architecture
  3. Optional Integration
  4. Features
  5. Performance Metrics
  6. Open Source Licenses
  7. Future Roadmap
  8. Docker Deployment
  9. Development Guide
  10. Troubleshooting
  11. Contributing

1. Getting Started

First, add the Document Operations MCP server to your MCP client.

Standard config works in most MCP clients:

{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"]
    }
  }
}

Claude Desktop

Follow the MCP install guide and use the standard config above.

VS Code

Follow the MCP install guide and use the standard config above.

Cursor

Go to Cursor Settings -> MCP -> Add new MCP Server. Give the server any name you like, select the command type, and enter the command npx -y doc-ops-mcp.

Other MCP Clients

For other MCP clients, use the standard config above and refer to your client's documentation for MCP server installation.

Configuration

The Document Operations MCP server supports configuration through environment variables. These can be provided in the MCP client configuration as part of the "env" object:

{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory",
        "WATERMARK_IMAGE": "/path/to/watermark.png",
        "QR_CODE_IMAGE": "/path/to/qrcode.png"
      }
    }
  }
}

Supported Document Operations

Format    Convert to PDF  Convert to DOCX  Convert to HTML  Convert to Markdown  Content Rewriting  Watermark/QR Code
PDF
DOCX
HTML
Markdown

Rewriting Features:

  • Content Replacement: Supports batch text replacement and regular expression replacement
  • Format Adjustment: Modify document structure, heading levels, and style formatting
  • Smart Rewriting: Content optimization while preserving original document format
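As a rough illustration of how literal and regular-expression replacement can be combined, here is a minimal TypeScript sketch. The ReplacementRule type and the applyReplacements function are hypothetical names chosen for this example; they are not part of the server's documented API.

```typescript
// Hypothetical rule shape: either a literal string or a regular expression.
type ReplacementRule =
  | { kind: "text"; find: string; replaceWith: string }
  | { kind: "regex"; pattern: RegExp; replaceWith: string };

// Apply every rule in order and return the rewritten text.
function applyReplacements(input: string, rules: ReplacementRule[]): string {
  return rules.reduce((text, rule) => {
    if (rule.kind === "text") {
      // split/join replaces all literal occurrences without regex escaping.
      return text.split(rule.find).join(rule.replaceWith);
    }
    // Force the global flag so every match is replaced, not only the first.
    const flags = rule.pattern.flags.includes("g")
      ? rule.pattern.flags
      : rule.pattern.flags + "g";
    return text.replace(new RegExp(rule.pattern.source, flags), rule.replaceWith);
  }, input);
}

const updated = applyReplacements("Acme Corp v1.2.0, released by Acme Corp", [
  { kind: "text", find: "Acme Corp", replaceWith: "Globex Inc" },
  { kind: "regex", pattern: /v\d+\.\d+\.\d+/, replaceWith: "v2.0.0" },
]);
console.log(updated);
```

Rules run in the order given, so a later rule sees the output of earlier ones.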

Usage Examples

Format Conversion:

Convert /Users/docs/report.docx to PDF
Convert /Users/docs/article.md to HTML
Convert /Users/docs/presentation.html to DOCX
Convert /Users/docs/readme.md to PDF (with theme styling)

Document Rewriting:

Rewrite company names in /Users/docs/contract.md
Batch replace terminology in /Users/docs/manual.docx
Adjust heading levels in /Users/docs/article.html
Update dates and version numbers in /Users/docs/policy.md

PDF Enhancement:

Add watermark to /Users/docs/document.pdf
Add QR code to /Users/docs/report.pdf
Add company logo watermark to /Users/docs/invoice.pdf

Environment Variables

The server supports environment variables for controlling output paths and PDF enhancement features:

Core Directories

  • OUTPUT_DIR: Controls where all generated files are saved (default: ~/Documents)
  • CACHE_DIR: Directory for temporary and cache files (default: ~/.cache/doc-ops-mcp)

PDF Enhancement Features

  • WATERMARK_IMAGE: Default watermark image path for PDF files
    • Automatically added to all PDF conversions
    • Supported formats: PNG, JPG
    • If not set, default text watermark "doc-ops-mcp" will be used
  • QR_CODE_IMAGE: Default QR code image path for PDF files
    • Added to PDFs only when explicitly requested (addQrCode=true)
    • Supported formats: PNG, JPG
    • If not set, QR code functionality will be unavailable
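The defaults described above could be resolved roughly as follows. This is a sketch under assumptions: DocOpsConfig, loadConfig, and the homeDir parameter are illustrative names, and only the variable names and default values come from this README.

```typescript
// Hypothetical config shape; the field names are illustrative only.
interface DocOpsConfig {
  outputDir: string;
  cacheDir: string;
  watermarkImage: string | undefined; // undefined: fall back to the text watermark
  watermarkText: string;
  qrCodeImage: string | undefined; // undefined: QR code feature unavailable
}

// Resolve settings from an environment map, applying the documented defaults.
function loadConfig(
  env: Record<string, string | undefined>,
  homeDir: string,
): DocOpsConfig {
  return {
    outputDir: env.OUTPUT_DIR ?? `${homeDir}/Documents`,
    cacheDir: env.CACHE_DIR ?? `${homeDir}/.cache/doc-ops-mcp`,
    watermarkImage: env.WATERMARK_IMAGE,
    watermarkText: "doc-ops-mcp", // documented default text watermark
    qrCodeImage: env.QR_CODE_IMAGE,
  };
}

const exampleConfig = loadConfig({ OUTPUT_DIR: "/tmp/output" }, "/home/alice");
console.log(exampleConfig.outputDir); // value taken from OUTPUT_DIR
console.log(exampleConfig.cacheDir); // default under the home directory
console.log(exampleConfig.qrCodeImage === undefined); // QR code not configured
```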

Output Path Rules:

  1. If outputPath is not provided → files saved to OUTPUT_DIR with auto-generated names
  2. If outputPath is relative → resolved relative to OUTPUT_DIR
  3. If outputPath is absolute → used as-is, ignoring OUTPUT_DIR

See OUTPUT_PATH_CONTROL.md for detailed documentation.
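The three output path rules can be sketched with Node's path module. The function name resolveOutputPath and the auto-generated naming scheme (input base name plus target extension) are assumptions made for illustration; the actual naming logic may differ.

```typescript
import * as path from "node:path";

// Resolve the final output location following the three rules above.
function resolveOutputPath(
  outputDir: string,
  inputPath: string,
  targetExt: string,
  outputPath?: string,
): string {
  if (!outputPath) {
    // Rule 1: no outputPath, so save into OUTPUT_DIR under a generated name.
    // The naming scheme here is a guess, not the server's documented behavior.
    const baseName = path.basename(inputPath, path.extname(inputPath));
    return path.join(outputDir, `${baseName}.${targetExt}`);
  }
  if (path.isAbsolute(outputPath)) {
    // Rule 3: absolute paths are used as-is and OUTPUT_DIR is ignored.
    return outputPath;
  }
  // Rule 2: relative paths are resolved against OUTPUT_DIR.
  return path.resolve(outputDir, outputPath);
}

console.log(resolveOutputPath("/tmp/output", "/Users/docs/report.docx", "pdf"));
console.log(resolveOutputPath("/tmp/output", "/Users/docs/report.docx", "pdf", "final/report.pdf"));
console.log(resolveOutputPath("/tmp/output", "/Users/docs/report.docx", "pdf", "/srv/out/report.pdf"));
```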

2. System Architecture

Document Operations MCP Server uses a pure JavaScript architecture that covers document reading, conversion, and style processing without external system dependencies:

┌─────────────────────────────────────────────────────────────┐
│                    MCP Client Layer                         │
│           (Claude Desktop, Cursor, VS Code, etc.)           │
└─────────────────────┬───────────────────────────────────────┘
                      │ JSON-RPC 2.0
┌─────────────────────┴───────────────────────────────────────┐
│                     Doc-Ops-MCP Server                      │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────┐  │
│  │   Tool Router   │  │  Request        │  │  Response   │  │
│  │   & Handler     │  │  Validator      │  │  Formatter  │  │
│  └────────┬────────┘  └────────┬────────┘  └──────┬──────┘  │
│           │                    │                  │         │
│  ┌────────┴────────────────────┴──────────────────┴──────┐  │
│  │              Document Processing Engine               │  │
│  │   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │  │
│  │   │  Document   │  │   Format    │  │   Style     │   │  │
│  │   │   Reader    │  │  Converter  │  │  Processor  │   │  │
│  │   └─────────────┘  └─────────────┘  └─────────────┘   │  │
│  │                                                       │  │
│  │   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │  │
│  │   │    PDF      │  │  Watermark/ │  │ Conversion  │   │  │
│  │   │ Enhancement │  │   QR Code   │  │  Planner    │   │  │
│  │   └─────────────┘  └─────────────┘  └─────────────┘   │  │
│  └───────────────────────────────────────────────────────┘  │
└───────────────────────────┬─────────────────────────────────┘
┌───────────────────────────┴─────────────────────────────────┐
│                    Core Dependencies Layer                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │   pdf-lib   │  │   mammoth   │  │   marked    │          │
│  │ (PDF Tools) │  │(DOCX Tools) │  │ (Markdown)  │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │   cheerio   │  │   turndown  │  │    docx     │          │
│  │(HTML Parser)│  │(HTML to MD) │  │(DOCX Gen.)  │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
└─────────────────────────────────────────────────────────────┘

Architecture Overview

Core Features:

  • Pure JavaScript implementation with no external system dependencies
  • Complete document reading, conversion, and style processing capabilities
  • Built-in PDF watermark and QR code addition functionality
  • Intelligent conversion planning and path optimization

Conversion Flow:

  • Direct Conversion: Supports direct conversion between most formats
  • Multi-step Conversion: Complex conversions achieved through intermediate formats
  • Style Preservation: Uses OOXML parser to ensure complete style integrity
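One way to picture direct versus multi-step conversion is a shortest-path search over a graph of direct converters. The sketch below is only an illustration: the edge list in directConversions is an assumed example, not the server's actual converter set, and findConversionPath is a hypothetical name.

```typescript
type Format = "pdf" | "docx" | "html" | "markdown";

// Assumed direct converters, for illustration only.
const directConversions: Record<Format, Format[]> = {
  markdown: ["html", "docx", "pdf"],
  html: ["markdown", "pdf"],
  docx: ["html", "pdf"],
  pdf: [],
};

// Breadth-first search: returns the shortest chain of formats, or null.
function findConversionPath(source: Format, target: Format): Format[] | null {
  if (source === target) return [source];
  const previous = new Map<Format, Format>();
  const visited = new Set<Format>([source]);
  const queue: Format[] = [source];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const next of directConversions[current]) {
      if (visited.has(next)) continue;
      visited.add(next);
      previous.set(next, current);
      if (next === target) {
        // Walk the predecessor links back to the source.
        const steps: Format[] = [target];
        let step = target;
        while (step !== source) {
          step = previous.get(step)!;
          steps.unshift(step);
        }
        return steps;
      }
      queue.push(next);
    }
  }
  return null;
}

console.log(findConversionPath("markdown", "pdf")); // direct conversion
console.log(findConversionPath("html", "docx")); // goes through an intermediate format
console.log(findConversionPath("pdf", "docx")); // no route in this example graph
```

With this example graph, HTML to DOCX has no direct edge, so the search routes it through Markdown.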

3. Optional Integration

This server can work with playwright-mcp for enhanced PDF conversion capabilities. Please refer to the official playwright-mcp documentation for detailed configuration.

🔧 PDF Conversion Workflow

This server supports complete PDF conversion functionality:

  1. Document Parsing: Use OOXML parser to ensure complete style preservation
  2. Format Conversion: Convert documents to high-quality HTML format
  3. PDF Generation: Built-in converter or optionally work with playwright-mcp
  4. Enhancement Processing: Automatically add watermarks and QR codes (if configured)

How It Works

This server uses an intelligent conversion architecture:

  1. Smart Planning: plan_conversion analyzes conversion requirements and selects optimal paths
  2. Format Conversion: Use specialized converters to handle various document formats
  3. Style Preservation: Ensure style integrity through OOXML parser
  4. Enhancement Processing: Automatically add watermarks, QR codes and other enhancements
  5. Optional Integration: Support working with playwright-mcp for enhanced capabilities

4. Features

MCP Tools

Core Document Tools

Tool Name         Description                Input Parameters                          External Dependencies
----------------  -------------------------  ----------------------------------------  ---------------------
read_document     Read document content      filePath: Document path                   None
                                             extractMetadata: Extract metadata
                                             preserveFormatting: Preserve formatting
write_document    Write document content     content: Document content                 None
                                             outputPath: Output file path
                                             encoding: File encoding
convert_document  Smart document conversion  inputPath: Input file path                None
                                             outputPath: Output file path
                                             preserveFormatting: Preserve formatting
plan_conversion   Conversion planner         sourceFormat: Source format               None
                                             targetFormat: Target format
                                             preserveStyles: Preserve styles
                                             quality: Conversion quality

read_document

Read various document formats including PDF, DOCX, DOC, HTML, MD, and more.

Parameters:

  • filePath (string, required) - Document path to read
  • extractMetadata (boolean, optional) - Extract document metadata, defaults to false
  • preserveFormatting (boolean, optional) - Preserve formatting (HTML output), defaults to false

write_document

Write content to document files in specified formats.

Parameters:

  • content (string, required) - Content to write
  • outputPath (string, optional) - Output file path (auto-generated if not provided)
  • encoding (string, optional) - File encoding, defaults to utf-8

convert_document

Convert documents between formats with enhanced style preservation.

Parameters:

  • inputPath (string, required) - Input file path
  • outputPath (string, optional) - Output file path (auto-generated if not provided)
  • preserveFormatting (boolean, optional) - Preserve formatting, defaults to true
  • useInternalPlaywright (boolean, optional) - Use built-in Playwright for PDF conversion, defaults to false

convert_docx_to_pdf

Convert DOCX to PDF with automatic watermark addition (if configured).

Parameters:

  • docxPath (string, required) - DOCX file path
  • outputPath (string, optional) - Output PDF path (auto-generated if not provided)
  • addQrCode (boolean, optional) - Whether to add QR code, defaults to false
  • preserveFormatting (boolean, optional) - Preserve original formatting, defaults to true
  • chineseFont (string, optional) - Chinese font, defaults to Microsoft YaHei

convert_markdown_to_pdf

Convert Markdown to PDF with automatic watermark addition (if configured).

Parameters:

  • markdownPath (string, required) - Markdown file path
  • outputPath (string, optional) - Output PDF path (auto-generated if not provided)
  • theme (string, optional) - Theme style, defaults to "github"
  • includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
  • addQrCode (boolean, optional) - Whether to add QR code, defaults to false

convert_markdown_to_html

Convert Markdown to HTML.

Parameters:

  • markdownPath (string, required) - Markdown file path
  • outputPath (string, optional) - Output HTML path (auto-generated if not provided)
  • theme (string, optional) - Theme style, defaults to "github"
  • includeTableOfContents (boolean, optional) - Include table of contents, defaults to false

convert_markdown_to_docx

Convert Markdown to DOCX.

Parameters:

  • markdownPath (string, required) - Markdown file path
  • outputPath (string, optional) - Output DOCX path (auto-generated if not provided)

convert_html_to_markdown

Convert HTML to Markdown.

Parameters:

  • htmlPath (string, required) - HTML file path
  • outputPath (string, optional) - Output Markdown path (auto-generated if not provided)

plan_conversion

🎯 Smart Conversion Planner - Analyze conversion requirements and generate optimal conversion plans.

Parameters:

  • sourceFormat (string, required) - Source file format (pdf, docx, html, markdown, md, txt, doc)
  • targetFormat (string, required) - Target file format (pdf, docx, html, markdown, md, txt, doc)
  • sourceFile (string, optional) - Source file path (for generating specific conversion parameters)
  • preserveStyles (boolean, optional) - Whether to preserve style formatting, defaults to true
  • includeImages (boolean, optional) - Whether to include images, defaults to true
  • theme (string, optional) - Conversion theme, defaults to github
  • quality (string, optional) - Conversion quality requirement (fast, balanced, high), defaults to balanced

process_pdf_post_conversion

Parameters:

  • playwrightPdfPath (string, required) - Generated PDF file path
  • targetPath (string, optional) - Target PDF file path (auto-generated if not provided)
  • addWatermark (boolean, optional) - Whether to add watermark, defaults to false
  • addQrCode (boolean, optional) - Whether to add QR code, defaults to false
  • watermarkImage (string, optional) - Watermark image path
  • qrCodePath (string, optional) - QR code image path

PDF Enhancement Tools

add_watermark

🎨 PDF Watermark Addition Tool - Add image or text watermarks to PDF documents.

Parameters:

  • pdfPath (string, required) - PDF file path
  • watermarkImage (string, optional) - Watermark image path (PNG/JPG)
  • watermarkText (string, optional) - Watermark text content
  • watermarkImageScale (number, optional) - Image scale ratio, defaults to 0.25
  • watermarkImageOpacity (number, optional) - Image opacity, defaults to 0.6
  • watermarkImagePosition (string, optional) - Image position, defaults to fullscreen

add_qrcode

📱 PDF QR Code Addition Tool - Add QR codes to PDF documents.

Parameters:

  • pdfPath (string, required) - PDF file path
  • qrCodePath (string, optional) - QR code image path
  • qrScale (number, optional) - QR code scale ratio, defaults to 0.15
  • qrOpacity (number, optional) - QR code opacity, defaults to 1.0
  • qrPosition (string, optional) - QR code position, defaults to bottom-center
  • addText (boolean, optional) - Whether to add explanatory text, defaults to true
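MCP clients normally build tool calls for you, but it can help to see what travels over JSON-RPC 2.0. The sketch below constructs a tools/call request as defined by the Model Context Protocol; the buildToolCall helper and the example argument values are illustrative, while the tool and parameter names come from the lists above.

```typescript
// Shape of an MCP tools/call request sent over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Hypothetical helper that wraps a tool name and its arguments in a request.
function buildToolCall(
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const watermarkRequest = buildToolCall("add_watermark", {
  pdfPath: "/Users/docs/document.pdf",
  watermarkText: "CONFIDENTIAL",
  watermarkImageOpacity: 0.6,
});
console.log(JSON.stringify(watermarkRequest, null, 2));
```

Optional parameters that are omitted from arguments fall back to the defaults listed for each tool.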

5. Performance Metrics

Document Processing Capabilities

Document Type  Max File Size  Processing Speed  Memory Usage
PDF            50 MB          2-5 MB/s          ~1.5 × file size
DOCX           50 MB          5-10 MB/s         ~2 × file size
HTML           50 MB          10-20 MB/s        ~1.2 × file size
Markdown       50 MB          15-30 MB/s        ~1.1 × file size

Conversion Performance

  • PDF Conversion: ~1-3 pages/second (built-in converter, or optionally playwright-mcp integration)
  • DOCX Conversion: Pure JavaScript processing, ~5-15 pages/second
  • HTML Conversion: Fastest, ~20-50 pages/second
  • Concurrent Processing: Supports up to 5 concurrent tasks
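A cap such as "up to 5 concurrent tasks" is commonly enforced with a small worker pool. The following is a generic sketch of that pattern, not the server's actual scheduler; runWithLimit and the dummy tasks are hypothetical.

```typescript
// Run async tasks with at most `limit` in flight; results keep task order.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit = 5,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unclaimed task until none remain.
  async function worker(): Promise<void> {
    while (nextIndex < tasks.length) {
      const index = nextIndex++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Dummy tasks that record how many are running at the same time.
let activeTasks = 0;
let peakTasks = 0;
const conversionTasks = Array.from({ length: 12 }, (_, i) => async () => {
  activeTasks++;
  peakTasks = Math.max(peakTasks, activeTasks);
  await Promise.resolve(); // stand-in for real conversion work
  activeTasks--;
  return i;
});

const finished = runWithLimit(conversionTasks).then((results) => {
  console.log(`completed ${results.length} tasks, peak concurrency ${peakTasks}`);
  return results;
});
```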

System Resource Requirements

  • Minimum Memory: 512MB
  • Recommended Memory: 2GB (for large files)
  • CPU: Single core sufficient, multi-core improves concurrency
  • Disk Space: Temporary files require 2-3x original file size
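The figures in this section can be turned into a quick back-of-the-envelope estimate. The sketch below only restates the numbers above as arithmetic; estimateResources and the field names are hypothetical, and real usage will vary with document content.

```typescript
type DocType = "pdf" | "docx" | "html" | "markdown";

interface ResourceEstimate {
  memoryMB: number;
  secondsMin: number;
  secondsMax: number;
  tempDiskMB: [number, number];
}

// Speed ranges (MB/s) and memory multipliers copied from the table above.
const profiles: Record<DocType, { speedMBps: [number, number]; memoryFactor: number }> = {
  pdf: { speedMBps: [2, 5], memoryFactor: 1.5 },
  docx: { speedMBps: [5, 10], memoryFactor: 2 },
  html: { speedMBps: [10, 20], memoryFactor: 1.2 },
  markdown: { speedMBps: [15, 30], memoryFactor: 1.1 },
};

const MAX_FILE_SIZE_MB = 50;

function estimateResources(docType: DocType, sizeMB: number): ResourceEstimate {
  if (sizeMB > MAX_FILE_SIZE_MB) {
    throw new RangeError(`File exceeds the ${MAX_FILE_SIZE_MB} MB limit`);
  }
  const { speedMBps: [slowest, fastest], memoryFactor } = profiles[docType];
  return {
    memoryMB: sizeMB * memoryFactor,
    secondsMin: sizeMB / fastest,
    secondsMax: sizeMB / slowest,
    tempDiskMB: [sizeMB * 2, sizeMB * 3], // temporary files need 2-3x the file size
  };
}

// A 20 MB DOCX: about 40 MB of memory, 2 to 4 seconds, 40 to 60 MB of temp disk.
const docxEstimate = estimateResources("docx", 20);
console.log(docxEstimate);
```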

System Requirements

  • Node.js ≥ 18.0.0
  • Zero external system dependencies - All processing via npm packages
  • Optional Integration: playwright-mcp for enhanced PDF conversion

Core Technology Stack

  • pdf-lib - PDF operations and enhancement
  • mammoth - DOCX document processing
  • marked - Markdown parsing and rendering
  • cheerio - HTML parsing and manipulation
  • turndown - HTML to Markdown conversion
  • docx - DOCX document generation

Installation

# Global installation
npm install -g doc-ops-mcp

# Or using pnpm
pnpm add -g doc-ops-mcp

# Or using bun
bun add -g doc-ops-mcp

Architecture Components

  • MCP Server Core: Handles JSON-RPC 2.0 communication and tool registration
  • Smart Router: Routes requests to optimal processing modules
  • Conversion Engine: Contains specialized converters for different document types
  • Style Processor: Ensures style preservation during format conversion
  • Security Module: Provides path validation and content security handling
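Path validation usually means rejecting paths that escape an allowed directory. The README does not describe how the Security Module does this, so the following is only a common pattern shown as a sketch; resolveInside is a hypothetical name, and the check does not follow symbolic links.

```typescript
import * as path from "node:path";

// Return the resolved path if it stays inside baseDir; otherwise throw.
function resolveInside(baseDir: string, requestedPath: string): string {
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, requestedPath);
  const relative = path.relative(base, resolved);
  // A relative path that starts with ".." (or is absolute) points outside base.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes ${base}: ${requestedPath}`);
  }
  return resolved;
}

console.log(resolveInside("/srv/docs", "reports/q1.md")); // stays inside the base directory

try {
  resolveInside("/srv/docs", "../../etc/passwd");
} catch (error) {
  console.log((error as Error).message); // traversal attempt is rejected
}
```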

6. Open Source Licenses

Project License

  • This Project: MIT License
  • Compatibility: Available for commercial and non-commercial use

Third-Party Dependencies

Library   Version  License       Purpose
pdf-lib   ^1.17.1  MIT           PDF document manipulation
mammoth   ^1.6.0   BSD-2-Clause  DOCX parsing and conversion
marked    ^9.1.6   MIT           Markdown parsing and rendering
exceljs   ^4.4.0   MIT           Excel file processing
jsdom     ^23.0.1  MIT           HTML DOM manipulation
turndown  ^7.1.2   MIT           HTML to Markdown conversion

License Compatibility

  • Commercial Use: All dependencies support commercial use
  • Distribution: Free to distribute and modify
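  • Patent Protection: The MIT and BSD-2-Clause licenses listed above contain no explicit patent grant; only Apache-2.0 dependencies, if any are used, provide one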
  • ⚠️ Notice: Original license notices must be retained

7. Future Roadmap

Core Features

  • 🔄 Enhanced Conversion Quality: Improve style preservation for complex documents
  • 📊 Excel Support: Complete Excel read/write and conversion functionality
  • 🎨 Template System: Support for custom document templates
  • 🔍 OCR Integration: Image text recognition capabilities

System Improvements

  • 🌐 Multi-language Support: Internationalization and localization
  • 🔐 Security Enhancements: Document encryption and access control
  • Performance Optimization: Large file handling and memory optimization
  • 🔌 Plugin System: Extensible processor architecture

Version Roadmap

  • v2.0: Complete Excel support and template system
  • v3.0: OCR integration and multi-language support
  • v4.0: Advanced security features and plugin system

8. Docker Deployment

Quick Start

Using Pre-built Image

# Pull the latest image
docker pull docops/doc-ops-mcp:latest

# Run with default configuration
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  docops/doc-ops-mcp:latest

Building from Source

# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp

# Build the Docker image
docker build -t doc-ops-mcp .

# Run the container
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  -v "$(pwd)/documents:/app/documents" \
  doc-ops-mcp

Docker Compose Deployment

Create a docker-compose.yml file:

version: '3.8'

services:
  doc-ops-mcp:
    image: docops/doc-ops-mcp:latest
    container_name: doc-ops-mcp
    ports:
      - "3000:3000"
    volumes:
      - ./documents:/app/documents
      - ./config:/app/config
    environment:
      - NODE_ENV=production
      - PORT=3000
    restart: unless-stopped
    
  # Optional: Add Nginx for reverse proxy
  nginx:
    image: nginx:alpine
    container_name: doc-ops-nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - doc-ops-mcp
    restart: unless-stopped

Environment Variables

Variable       Description             Default
PORT           Server port             3000
NODE_ENV       Environment mode        production
LOG_LEVEL      Logging level           info
MAX_FILE_SIZE  Maximum file size (MB)  50

Volume Mounts

Mount local directories for persistent storage:

# Documents directory for file processing
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  -v "$(pwd)/documents:/app/documents" \
  -v "$(pwd)/output:/app/output" \
  doc-ops-mcp

Docker Configuration Examples

Production Deployment

# Production setup with Docker Swarm
docker swarm init
docker stack deploy -c docker-compose.yml doc-ops

# Scale the service (stack services are named <stack>_<service>)
docker service scale doc-ops_doc-ops-mcp=3

Health Checks

The container includes built-in health checks:

# Check container health
docker ps

# View health check logs
docker inspect --format='{{.State.Health.Status}}' doc-ops-mcp

# Manual health check (a non-zero exit status means the check failed)
docker exec doc-ops-mcp curl -f http://localhost:3000/health

9. Development Guide

Local Development

# Clone the repository
git clone https://github.com/your-org/doc-ops-mcp.git
cd doc-ops-mcp

# Install dependencies
npm install

# Run in development mode
npm run dev

# Build the project
npm run build

# Run tests
npm test

Project Structure

src/
├── index.ts          # MCP server entry point
├── tools/            # Tool implementations
│   ├── documentConverter.ts
│   ├── pdfTools.ts
│   └── ...
├── types/            # Type definitions
└── utils/            # Utility functions

Adding New Tools

  1. Create a new tool file in src/tools/
  2. Implement the tool logic
  3. Register the tool in src/index.ts
  4. Add test cases
  5. Update documentation
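Steps 1 to 3 might look roughly like the sketch below. It is self-contained and does not use the project's real registration code: ToolDefinition, toolRegistry, registerTool, and the count_words tool are all hypothetical. Only the general shape (a name, a description, and a JSON Schema for inputs) follows how MCP describes tools.

```typescript
// Hypothetical tool shape, modeled on MCP tool metadata plus a handler.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
  handler: (args: Record<string, unknown>) => Promise<string>;
}

// Stand-in for the registration that would live in src/index.ts.
const toolRegistry = new Map<string, ToolDefinition>();

function registerTool(tool: ToolDefinition): void {
  if (toolRegistry.has(tool.name)) {
    throw new Error(`Tool already registered: ${tool.name}`);
  }
  toolRegistry.set(tool.name, tool);
}

// Steps 1 and 2: define the tool and implement its logic.
const countWordsTool: ToolDefinition = {
  name: "count_words",
  description: "Count the words in a text string",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
  handler: async (args) => {
    const text = String(args.text ?? "");
    const words = text.trim().split(/\s+/).filter(Boolean);
    return `${words.length} words`;
  },
};

// Step 3: register the tool so the server can route calls to it.
registerTool(countWordsTool);
console.log(Array.from(toolRegistry.keys()));
```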

10. Troubleshooting

Common Issues

  1. Port conflicts: Change the host port in docker-compose.yml
  2. Permission issues: Ensure volume mounts have correct permissions
  3. Memory issues: Increase Docker memory allocation

Debug Mode

# Run with debug logging
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  -e LOG_LEVEL=debug \
  doc-ops-mcp

# View logs
docker logs -f doc-ops-mcp

11. Contributing

How to Contribute

  1. Fork the Project
  2. Create a Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit Your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Intellectual Property License

By submitting a Pull Request, you agree that all contributions submitted through Pull Requests will be licensed under the MIT License. This means:

  • You grant the project maintainers and users the right to use, modify, and distribute your contributions under the MIT License
  • You confirm that you have the right to make these contributions
  • You understand that your contributions will become part of the open source project
  • You waive any claims to exclusive ownership of the contributed code

If you cannot agree to these terms, please do not submit a Pull Request.

Code Standards

  • Use TypeScript
  • Follow ESLint configuration
  • Add appropriate tests
  • Update relevant documentation

Reporting Issues

  • Use GitHub Issues
  • Provide detailed error information and reproduction steps
  • Include system environment information

License

This project is licensed under the MIT License - see the LICENSE file for details.

Server Config

{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "doc-ops-mcp@latest"
      ],
      "env": {
        "OUTPUT_DIR": "/tmp/output",
        "CACHE_DIR": "/tmp/cache"
      }
    }
  }
}