Sensitive Lexicon Mcp

Created by zephyrpersonal, 6 months ago

A Model Context Protocol (MCP) server that provides sensitive word detection and filtering capabilities for Large Language Models (LLMs), powered by the comprehensive https://github.com/konsheng/Sensitive-lexicon Chinese word database.


Sensitive Lexicon MCP Server

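An MCP (Model Context Protocol) server built on the Sensitive-lexicon word database that provides sensitive-word detection and filtering for LLMs.

Features

Sensitive-word detection: detects sensitive words in text
Sensitive-word filtering: replaces sensitive words in text
Multiple categories: supports categories such as political, pornography, violence, and advertisement
Live updates: fetches the latest lexicon from the GitHub repository
Easy integration: uses the standard MCP protocol, so it plugs into a wide range of LLM clients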

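Quick Start

Option 1: Install from npm (recommended)

# Global install
npm install -g sensitive-lexicon-mcp

# Or install locally in a project
npm install sensitive-lexicon-mcp

Option 2: Install from source

# Clone the repository
git clone https://github.com/zephyrpersonal/sensitive-lexicon-mcp.git
cd sensitive-lexicon-mcp

# Install dependencies
npm install

# Build the project
npm run build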

Integration

Claude Desktop

Add the following to the Claude Desktop configuration file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "sensitive-lexicon": {
      "command": "npx",
      "args": ["sensitive-lexicon-mcp"]
    }
  }
}

For a local installation, point to the built entry file:

{
  "mcpServers": {
    "sensitive-lexicon": {
      "command": "node",
      "args": ["./path/to/sensitive-lexicon-mcp/dist/index.js"]
    }
  }
}
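
Editing the configuration file by hand can overwrite other entries under mcpServers. The following is a minimal Python sketch of merging the entry shown above into an existing file. The helper names (claude_config_path, add_server, update_config_file) are hypothetical and not part of this project; the paths are the ones listed above.

```python
import json
import os
import sys
from pathlib import Path


def claude_config_path(platform: str = sys.platform) -> Path:
    """Return the claude_desktop_config.json location for a platform."""
    if platform.startswith("win"):
        return Path(os.environ.get("APPDATA", "")) / "Claude" / "claude_desktop_config.json"
    if platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json"
    return Path.home() / ".config" / "Claude" / "claude_desktop_config.json"


def add_server(config: dict) -> dict:
    """Return a copy of config with the sensitive-lexicon entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers
    servers["sensitive-lexicon"] = {
        "command": "npx",
        "args": ["sensitive-lexicon-mcp"],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config (if present), merge the entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_server(existing), indent=2), encoding="utf-8")
```

Calling update_config_file(claude_config_path()) would apply the change in place; Claude Desktop typically needs a restart to pick up a new server entry.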

Continue.dev

Add the following to config.json:

{
  "mcpServers": [
    {
      "name": "sensitive-lexicon",
      "command": "npx",
      "args": ["sensitive-lexicon-mcp"]
    }
  ]
}

Cline (VSCode Extension)

Add the following to Cline's MCP settings file (cline_mcp_settings.json):

{
  "mcpServers": {
    "sensitive-lexicon": {
      "command": "npx",
      "args": ["sensitive-lexicon-mcp"]
    }
  }
}

Zed Editor

Add the following to Zed's settings.json:

{
  "language_models": {
    "anthropic": {
      "version": "1",
      "api_url": "https://api.anthropic.com",
      "mcp_servers": {
        "sensitive-lexicon": {
          "command": "npx",
          "args": ["sensitive-lexicon-mcp"]
        }
      }
    }
  }
}

Cursor IDE

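Add the following to Cursor's MCP configuration file (.cursor/mcp.json in a project, or ~/.cursor/mcp.json globally):

{
  "mcpServers": {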
    "sensitive-lexicon": {
      "command": "npx",
      "args": ["sensitive-lexicon-mcp"]
    }
  }
}

Custom MCP Client

If you use a custom MCP client, you can connect like this:

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

const transport = new StdioClientTransport({
  command: 'npx',
  args: ['sensitive-lexicon-mcp']
});

const client = new Client({
  name: "sensitive-lexicon-client",
  version: "1.0.0"
}, {
  capabilities: {}
});

await client.connect(transport);

Python MCP Client

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    server_params = StdioServerParameters(
        command="npx",
        args=["sensitive-lexicon-mcp"]
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the session (MCP handshake)
            await session.initialize()

            # Call a tool
            result = await session.call_tool(
                "detect_sensitive_words",
                {"text": "测试文本"}
            )
            print(result)

asyncio.run(main())

Docker Deployment

Create a Dockerfile:

FROM node:18-alpine

WORKDIR /app

COPY package*.json ./
RUN npm ci --omit=dev

COPY dist ./dist

EXPOSE 3000

CMD ["npm", "start"]

Run the container:

docker build -t sensitive-lexicon-mcp .
docker run -p 3000:3000 sensitive-lexicon-mcp

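Environment Variables

You can customize the configuration through environment variables:

# Lexicon update interval (seconds)
export SENSITIVE_UPDATE_INTERVAL=3600

# Cache size
export SENSITIVE_CACHE_SIZE=10000

# Enable debug logging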
export DEBUG=sensitive-lexicon:*
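
When the server is launched from a client script rather than a shell, the same variables have to be passed to the child process. The following is a minimal sketch of building that environment in Python. The function name build_server_env is hypothetical, and its defaults simply mirror the example values above; they are not confirmed server defaults.

```python
def build_server_env(update_interval: int = 3600,
                     cache_size: int = 10000,
                     debug: bool = False) -> dict[str, str]:
    """Build the environment variables described above as a str-to-str dict."""
    if update_interval <= 0:
        raise ValueError("update_interval must be a positive number of seconds")
    if cache_size <= 0:
        raise ValueError("cache_size must be positive")
    env = {
        "SENSITIVE_UPDATE_INTERVAL": str(update_interval),
        "SENSITIVE_CACHE_SIZE": str(cache_size),
    }
    if debug:
        # Same pattern as the shell example above
        env["DEBUG"] = "sensitive-lexicon:*"
    return env
```

One way to use it is to merge the result over the current environment, for example {**os.environ, **build_server_env(debug=True)}, and pass that dict as the env argument of StdioServerParameters.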

Available Tools

1. detect_sensitive_words

Detects sensitive words in a text.

Parameters:

text (required): the text to check
categories (optional): array of categories to check

Example:

{
  "text": "这是一段测试文本",
  "categories": ["political", "violence"]
}

Result:

{
  "isSensitive": true,
  "sensitiveWordsCount": 2,
  "sensitiveWords": [
    {"word": "敏感词1", "category": "political"},
    {"word": "敏感词2", "category": "violence"}
  ],
  "summary": "Found 2 sensitive word(s) in the text"
}
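
A client usually wants this result in a more convenient shape than raw JSON. The following is a minimal sketch that parses a payload with the fields shown above and groups the matched words by category. The names Detection and parse_detection are hypothetical, and the sketch assumes the payload arrives as a JSON string.

```python
import json
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    is_sensitive: bool
    words_by_category: dict[str, list[str]]


def parse_detection(payload: str) -> Detection:
    """Parse a detect_sensitive_words result and group words by category."""
    data = json.loads(payload)
    grouped: dict[str, list[str]] = defaultdict(list)
    for entry in data.get("sensitiveWords", []):
        grouped[entry["category"]].append(entry["word"])
    return Detection(bool(data.get("isSensitive", False)), dict(grouped))
```

Grouping by category makes it straightforward to apply different handling per category, such as blocking one and only logging another.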

2. filter_sensitive_words

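Filters sensitive words out of a text by replacing them.

Parameters:

text (required): the text to filter
replacement (optional): replacement string, defaults to "***"
categories (optional): array of categories to filter

Example: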

{
  "text": "这是一段测试文本",
  "replacement": "[已屏蔽]",
  "categories": ["political"]
}

Result:

{
  "originalText": "这是一段测试文本",
  "filteredText": "这是一段[已屏蔽]文本",
  "isSensitive": true,
  "sensitiveWordsFound": 1,
  "sensitiveWords": [
    {"word": "测试", "category": "political"}
  ]
}
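
The relationship between originalText, sensitiveWords, and filteredText can be illustrated with a small local reference. This is only a sketch of the replacement semantics implied by the example above, not the server's actual algorithm; apply_replacement and is_consistent are hypothetical names.

```python
import re


def apply_replacement(text: str, words: list[str], replacement: str = "***") -> str:
    """Replace every occurrence of each word in one left-to-right pass."""
    # Longest words first, so a short word never splits a longer one
    unique = sorted({w for w in words if w}, key=len, reverse=True)
    if not unique:
        return text
    pattern = re.compile("|".join(re.escape(w) for w in unique))
    # A function replacement avoids backslash interpretation in `replacement`
    return pattern.sub(lambda _match: replacement, text)


def is_consistent(response: dict, replacement: str = "***") -> bool:
    """Check that filteredText matches originalText with the words replaced."""
    words = [entry["word"] for entry in response.get("sensitiveWords", [])]
    expected = apply_replacement(response["originalText"], words, replacement)
    return expected == response["filteredText"]
```

A check like is_consistent can serve as a sanity test in client code, under the assumption that the server replaces each matched word with the replacement string exactly once per occurrence.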

3. get_categories

Lists the available sensitive-word categories.

Result:

{
  "categories": [
    "covid19", "gfw", "other", "subversive", 
    "advertisement", "political", "violence", 
    "livelihood", "weapons", "pornography-type", 
    "pornography", "supplementary", "corruption", 
    "tencent", "illegal-urls"
  ],
  "totalCategories": 15
}
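
Because the categories parameter of the other tools takes names from this list, a client can validate its input before making a call. The source does not say how the server treats unknown category names, so the following is a client-side sketch only; KNOWN_CATEGORIES copies the list above, and validate_categories is a hypothetical helper.

```python
KNOWN_CATEGORIES = (
    "covid19", "gfw", "other", "subversive",
    "advertisement", "political", "violence",
    "livelihood", "weapons", "pornography-type",
    "pornography", "supplementary", "corruption",
    "tencent", "illegal-urls",
)


def validate_categories(requested, available=KNOWN_CATEGORIES) -> list[str]:
    """Return the requested categories, deduplicated in order, or raise."""
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f"unknown categories: {', '.join(unknown)}")
    # dict.fromkeys keeps the first occurrence of each name, in order
    return list(dict.fromkeys(requested))
```

In practice the available list could come from a live get_categories call instead of the hard-coded tuple, which would keep the check in step with lexicon updates.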

4. get_word_count

Returns the number of words in the lexicon.

Parameters:

category (optional): name of a category

Example:

{
  "category": "political"
}

Result:

{
  "category": "political",
  "wordCount": 1500
}

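Usage Examples

In Claude Desktop

Once configured, you can use the tools directly in Claude Desktop:

Please check whether this text contains sensitive words: "这是一段需要检测的文本内容"

Please filter the sensitive words in this text and replace them with [已屏蔽]: "这是一段需要过滤的文本内容"

In Continue.dev

Check code comments or documentation for sensitive words:

// Check whether this variable name contains sensitive words
@sensitive-check check this function name: getUserPoliticalInfo

Programmatic Integration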

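// Node.js example: MCP over stdio is newline-delimited JSON-RPC 2.0
const { spawn } = require('child_process');
const readline = require('readline');

function detectSensitiveWords(text) {
  return new Promise((resolve, reject) => {
    // Server logs on stderr are passed through, not treated as failures
    const child = spawn('npx', ['sensitive-lexicon-mcp'], { stdio: ['pipe', 'pipe', 'inherit'] });
    child.on('error', reject);

    // Each message is a single line of JSON
    const send = (message) =>
      child.stdin.write(JSON.stringify({ jsonrpc: '2.0', ...message }) + '\n');

    readline.createInterface({ input: child.stdout }).on('line', (line) => {
      let message;
      try { message = JSON.parse(line); } catch { return; } // skip non-JSON output
      if (message.error) {
        child.kill();
        reject(new Error(message.error.message));
      } else if (message.id === 1) {
        // Handshake acknowledged: confirm it, then call the tool
        send({ method: 'notifications/initialized' });
        send({
          id: 2,
          method: 'tools/call',
          params: { name: 'detect_sensitive_words', arguments: { text } }
        });
      } else if (message.id === 2) {
        child.kill();
        resolve(message.result);
      }
    });

    // The initialize request must be sent before any tool call
    send({
      id: 1,
      method: 'initialize',
      params: {
        protocolVersion: '2024-11-05',
        capabilities: {},
        clientInfo: { name: 'sensitive-lexicon-client', version: '1.0.0' }
      }
    });
  });
}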

Batch Processing Example

# Python batch-processing example
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def batch_check_content(texts):
    server_params = StdioServerParameters(
        command="npx",
        args=["sensitive-lexicon-mcp"]
    )

    results = []
    # Reuse one server process and session for all texts
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            for text in texts:
                result = await session.call_tool(
                    "detect_sensitive_words",
                    {"text": text}
                )
                results.append({
                    "text": text,
                    "result": result
                })

    return results

# Usage
texts = ["文本1", "文本2", "文本3"]
results = asyncio.run(batch_check_content(texts))
for item in results:
    print(f"Text: {item['text']}")
    print(f"Result: {item['result']}")
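
Each result collected above is a tool-result object rather than plain JSON. The following is a minimal sketch of pulling the JSON payload out of such an object and counting the flagged texts. It assumes the server returns its JSON as the first text content block, which the source does not confirm; extract_json and count_flagged are hypothetical helpers.

```python
import json


def extract_json(result) -> dict:
    """Decode the first text content block of a tool result as JSON."""
    if getattr(result, "isError", False):
        raise RuntimeError("tool call reported an error")
    for item in getattr(result, "content", []):
        if getattr(item, "type", None) == "text":
            return json.loads(item.text)
    raise ValueError("no text content in tool result")


def count_flagged(results: list[dict]) -> int:
    """Count batch entries whose decoded payload has isSensitive set."""
    return sum(
        1 for entry in results
        if extract_json(entry["result"]).get("isSensitive")
    )
```

The helpers only read the content and isError attributes, so they can be exercised in tests with simple stand-in objects instead of a running server.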

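Sensitive-Word Categories

The following categories are supported:

covid19: COVID-19 related
gfw: GFW supplementary lexicon
other: other lexicon
subversive: subversive lexicon
advertisement: advertising
political: political
violence: violence and terrorism lexicon
livelihood: public-livelihood lexicon
weapons: firearms and explosives
pornography-type: pornography (by type)
pornography: pornography lexicon
supplementary: supplementary lexicon
corruption: corruption lexicon
tencent: Tencent related
illegal-urls: illegal URLs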

Development

# Run in development mode
npm run dev

# Type check
npm run type-check

# Build
npm run build

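Tech Stack

TypeScript
Node.js
Model Context Protocol (MCP) SDK
Sensitive-lexicon word database

License

MIT License

Disclaimer

This project is intended for learning and research purposes only. Users are responsible for complying with local laws, regulations, and platform policies. What counts as a sensitive word varies by business scenario, so adjust the lexicon to your specific needs.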

