# MCP Development Framework
A development framework for the Model Context Protocol (MCP), used to build custom tools that large language models can call. It ships a set of ready-made tools that extend Cursor IDE with web content retrieval, file processing (PDF, Word, Excel, CSV, Markdown), and AI conversations.
## Main Features
This framework offers the following core functionalities:
### 1. Comprehensive File Processing
The `parse_file` tool can automatically identify file types and select the appropriate processing method, supporting PDF, Word, Excel, CSV, and Markdown files.
- **Usage**: `parse_file /path/to/document`
- **Supported Formats**:
  - PDF files (.pdf)
  - Word documents (.doc, .docx)
  - Excel files (.xls, .xlsx, .xlsm)
  - CSV files (.csv)
  - Markdown files (.md)
- **Parameters**: `file_path` - Local path of the file
- **Returns**: Corresponding processing results based on the file type
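
As an illustration of extension-based dispatch, here is a minimal sketch. The tool names come from this README, but the `EXTENSION_TO_TOOL` mapping and the `pick_tool` helper are hypothetical and are not taken from the framework's source.

```python
from pathlib import Path

# Hypothetical mapping from file extension to tool name (an assumption
# for illustration; the framework's real lookup may differ).
EXTENSION_TO_TOOL = {
    ".pdf": "parse_pdf",
    ".doc": "parse_word",
    ".docx": "parse_word",
    ".xls": "parse_excel",
    ".xlsx": "parse_excel",
    ".xlsm": "parse_excel",
    ".csv": "parse_csv",
    ".md": "parse_markdown",
}


def pick_tool(file_path: str) -> str:
    """Return the name of the tool that should handle file_path."""
    # Compare case-insensitively so "Report.PDF" is treated like "report.pdf".
    suffix = Path(file_path).suffix.lower()
    try:
        return EXTENSION_TO_TOOL[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}") from None
```

With this sketch, `pick_tool("/path/to/document.docx")` selects `parse_word`, and an unknown extension raises `ValueError` instead of being silently ignored.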
### 2. PDF Document Processing
The `parse_pdf` tool can process PDF documents, supporting two processing modes:
- **Usage**: `parse_pdf /path/to/document.pdf [mode]`
- **Parameters**:
  - `file_path` - Local path of the PDF file
  - `mode` - Processing mode (optional):
    - `quick` - Quick preview mode, extracts text content only
    - `full` - Full parsing mode, extracts text and image content (default)
- **Returns**:
  - Quick preview mode: Text content of the document
  - Full parsing mode: Text content and images of the document
### 3. Word Document Parsing
The `parse_word` tool can parse Word documents, extracting text, tables, and image information.
- **Usage**: `parse_word /path/to/document.docx`
- **Functionality**: Parses Word documents and extracts text content, tables, and images
- **Parameters**: `file_path` - Local path of the Word document
- **Returns**: Text content, tables, and image information of the document
- **Features**: Uses the python-docx library to provide high-quality text and table extraction
### 4. Excel File Processing
The `parse_excel` tool can parse Excel files, providing complete table data and structure information.
- **Usage**: `parse_excel /path/to/spreadsheet.xlsx`
- **Functionality**: Parses all worksheets in the Excel file
- **Parameters**: `file_path` - Local path of the Excel file
- **Returns**:
  - Basic file information (file name, number of worksheets)
  - Detailed information for each worksheet:
    - Number of rows and columns
    - List of column names
    - Complete table data
- **Features**:
  - Uses pandas and openpyxl for high-quality table data processing
  - Supports multi-worksheet processing
  - Automatically handles data type conversion
### 5. CSV File Processing
The `parse_csv` tool can parse CSV files, providing complete data analysis and preview functionality.
- **Usage**: `parse_csv /path/to/data.csv`
- **Functionality**: Parses CSV files and provides data analysis
- **Parameters**:
  - `file_path` - Local path of the CSV file
  - `encoding` - File encoding format (optional, defaults to auto-detection)
- **Returns**:
  - Basic file information (file name, number of rows, number of columns)
  - List of column names
  - Data preview (first 5 rows)
  - Descriptive statistics
- **Features**:
  - Automatic encoding detection
  - Supports various encoding formats (UTF-8, GBK, etc.)
  - Provides data statistical analysis
  - Intelligent data type handling
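
One common way to approximate encoding auto-detection is to try a short list of candidate encodings in order. The sketch below uses only the standard library; the candidate list and the helper names `decode_csv_bytes` and `preview_rows` are assumptions for illustration, and the actual tool may detect encodings differently.

```python
import csv
import io
from typing import List, Optional, Tuple

# Candidate encodings tried in order (an assumption, not the tool's list).
# "utf-8-sig" also accepts plain UTF-8; "latin-1" never fails, so it acts
# as a last-resort fallback.
CANDIDATE_ENCODINGS = ("utf-8-sig", "gbk", "latin-1")


def decode_csv_bytes(raw: bytes, encoding: Optional[str] = None) -> Tuple[str, str]:
    """Return (text, encoding_used), honouring an explicit encoding if given."""
    if encoding is not None:
        return raw.decode(encoding), encoding
    for candidate in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(candidate), candidate
        except UnicodeDecodeError:
            continue
    raise ValueError("Could not decode CSV data")


def preview_rows(raw: bytes, limit: int = 5,
                 encoding: Optional[str] = None) -> Tuple[List[str], List[List[str]]]:
    """Return the header row and at most `limit` data rows."""
    text, _ = decode_csv_bytes(raw, encoding)
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    rows = [row for _, row in zip(range(limit), reader)]
    return header, rows
```

Trying encodings in order is a heuristic: bytes that happen to be valid in an earlier candidate are accepted even if the file was written in a later one, which is why an explicit `encoding` parameter is still useful.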
### 6. Markdown File Parsing
The `parse_markdown` tool can parse Markdown files, extracting text content, heading structure, and lists.
- **Usage**: `parse_markdown /path/to/document.md`
- **Functionality**: Parses Markdown files and extracts heading structure, lists, and text content
- **Parameters**: `file_path` - Local path of the Markdown file
- **Returns**:
  - Basic file information (file name, size, modification time, etc.)
  - Display of heading structure levels
  - Content element statistics (code blocks, lists, links, images, tables, etc.)
  - Original Markdown content
- **Features**:
  - Automatically identifies headings and structure
  - Intelligent statistics of content elements
  - Complete display of heading hierarchy
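
To make "heading structure" concrete, the following sketch collects ATX-style headings (`#` through `######`) while skipping lines inside fenced code blocks. It is a simplified illustration, not the tool's parser: `extract_headings` is a hypothetical helper, and Setext headings (underlined with `===` or `---`) are ignored.

```python
import re
from typing import List, Tuple

# One to six "#" characters, whitespace, then the heading text.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")

# Fence markers are built programmatically so this example can itself
# live inside a fenced code block.
FENCE_MARKERS = ("`" * 3, "~" * 3)


def extract_headings(markdown_text: str) -> List[Tuple[int, str]]:
    """Return (level, title) pairs for headings outside fenced code blocks."""
    headings = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE_MARKERS):
            in_fence = not in_fence  # toggle on opening and closing fences
            continue
        if in_fence:
            continue  # "#" inside code is a comment, not a heading
        match = HEADING_RE.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings
```

The resulting list of levels and titles is enough to print an indented outline of the document.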
### 7. Web Content Retrieval
The `url` tool can retrieve content from any webpage.
- **Usage**: `url https://example.com`
- **Parameters**: `url` - The URL of the website to retrieve content from
- **Returns**: Text content of the webpage
- **Features**:
  - Complete HTTP error handling
  - Timeout management
  - Automatic encoding handling
### 8. MaxKB AI Conversation
The `maxkb` tool can interact with the MaxKB API to achieve intelligent conversation functionality.
- **Usage**: `maxkb "Your question or command"`
- **Functionality**: Sends a message to the MaxKB API and retrieves AI responses
- **Parameters**:
  - `message` - The message content to send (required)
  - `re_chat` - Whether to restart the conversation (optional, defaults to false)
  - `stream` - Whether to use streaming responses (optional, defaults to true)
- **Returns**: AI's response content
- **Features**:
  - Supports streaming responses
  - Automatic retry mechanism
  - Complete error handling
  - 60-second timeout protection
  - Connection configuration optimization
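
A retry mechanism combined with a per-attempt timeout can be sketched with `asyncio` alone. The helper below is an assumption-laden illustration: `call_with_retry`, its retry count, and its backoff values are hypothetical, and only the 60-second timeout is taken from the feature list above. The real tool's retry policy is not documented here.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_retry(
    make_request: Callable[[], Awaitable[T]],
    *,
    retries: int = 3,       # hypothetical default
    timeout: float = 60.0,  # mirrors the 60-second timeout listed above
    backoff: float = 0.5,   # hypothetical base delay in seconds
) -> T:
    """Await make_request(), retrying on timeouts and connection errors."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            # Each attempt gets its own timeout budget.
            return await asyncio.wait_for(make_request(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < retries:
                # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
                await asyncio.sleep(backoff * 2 ** (attempt - 1))
    raise RuntimeError(f"Request failed after {retries} attempts") from last_error
```

Here `make_request` is any zero-argument coroutine function, for example a closure that posts the message to the API with an async HTTP client.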
## Technical Features
The framework uses the following techniques to keep file processing fast and robust:
1. **Intelligent File Type Recognition**
   - Automatically selects the appropriate processing tool based on file extension
   - Provides a unified file processing interface
2. **Efficient Document Processing**
   - PDF processing: Supports both quick preview and full parsing modes
   - Word processing: Accurately extracts text, tables, and images
   - Excel processing: Efficiently handles large table data
3. **Memory Optimization**
   - Uses temporary files to manage large files
   - Automatically cleans up temporary resources
   - Processes large documents in chunks
4. **Error Handling**
   - Complete exception capture and handling
   - Detailed error information feedback
   - Graceful failure handling mechanism
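
The memory optimization points (temporary files, automatic cleanup, chunked processing) can be illustrated with the standard library. This is a generic sketch of the pattern, not the framework's code; `temp_copy` and `read_in_chunks` are hypothetical helper names.

```python
import contextlib
import os
import tempfile
from typing import Iterable, Iterator


@contextlib.contextmanager
def temp_copy(chunks: Iterable[bytes], suffix: str = "") -> Iterator[str]:
    """Spool byte chunks to a temporary file and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            for chunk in chunks:
                handle.write(chunk)  # only one chunk is in memory at a time
        yield path
    finally:
        # Clean up even if the caller raised while using the file.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(path)


def read_in_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield the file's contents in pieces of at most chunk_size bytes."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Wrapping the temporary file in a context manager ties cleanup to scope, so an exception during parsing does not leave files behind.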
## Project Structure
This framework adopts a modular design for easy expansion and maintenance:
```
mcp_tool/
├── tools/
│   ├── __init__.py        # Defines the base tool class and registry
│   ├── loader.py          # Tool loader, automatically loads all tools
│   ├── file_tool.py       # Comprehensive file processing tool
│   ├── pdf_tool.py        # PDF parsing tool
│   ├── word_tool.py       # Word document parsing tool
│   ├── excel_tool.py      # Excel file processing tool
│   ├── csv_tool.py        # CSV file processing tool
│   ├── markdown_tool.py   # Markdown file parsing tool
│   ├── url_tool.py        # URL tool implementation
│   └── maxkb_tool.py      # MaxKB AI conversation tool
├── __init__.py
├── __main__.py
└── server.py              # MCP server implementation
```
## Development Guide
### How to Develop a New Tool
1. Create a new Python file in the `tools` directory, such as `your_tool.py`
2. Import necessary dependencies and base classes
3. Create a tool class that inherits from `BaseTool`
4. Register the tool using the `@ToolRegistry.register` decorator
5. Implement the tool's `execute` method
### Tool Template Example
```python
import mcp.types as types

from . import BaseTool, ToolRegistry


@ToolRegistry.register
class YourTool(BaseTool):
    """Your tool description"""

    name = "your_tool_name"  # Unique identifier for the tool
    description = "Your tool description"  # Description of the tool to be displayed to users
    input_schema = {
        "type": "object",
        "required": ["param1"],  # Required parameters
        "properties": {
            "param1": {
                "type": "string",
                "description": "Description of parameter 1",
            },
            "param2": {
                "type": "integer",
                "description": "Description of parameter 2 (optional)",
            },
        },
    }

    async def execute(self, arguments: dict) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
        """Execute tool logic"""
        # Parameter validation
        if "param1" not in arguments:
            return [types.TextContent(
                type="text",
                text="Error: Missing required argument 'param1'"
            )]

        # Get parameters
        param1 = arguments["param1"]
        param2 = arguments.get("param2", 0)  # Get optional parameter with default value

        # Execute tool logic
        result = f"Processing parameters: {param1}, {param2}"

        # Return result
        return [types.TextContent(
            type="text",
            text=result
        )]
```
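
The template relies on `BaseTool` and `ToolRegistry` from `tools/__init__.py`, whose implementation is not shown in this README. As a rough mental model only, a decorator-based registry could look like the sketch below. `SketchBaseTool`, `SketchToolRegistry`, and `EchoTool` are hypothetical names, and the sketch returns plain strings instead of `mcp.types` objects to stay dependency-free.

```python
import asyncio
from typing import Dict, List, Type


class SketchBaseTool:
    """Hypothetical stand-in for the framework's base tool class."""

    name = ""
    description = ""
    input_schema: dict = {}

    async def execute(self, arguments: dict) -> List[str]:
        raise NotImplementedError


class SketchToolRegistry:
    """Hypothetical registry that maps tool names to tool classes."""

    _tools: Dict[str, Type[SketchBaseTool]] = {}

    @classmethod
    def register(cls, tool_class: Type[SketchBaseTool]) -> Type[SketchBaseTool]:
        # Used as a class decorator: store the class, then return it unchanged.
        if not tool_class.name:
            raise ValueError("Tool classes must define a non-empty name")
        cls._tools[tool_class.name] = tool_class
        return tool_class

    @classmethod
    def create(cls, name: str) -> SketchBaseTool:
        return cls._tools[name]()


@SketchToolRegistry.register
class EchoTool(SketchBaseTool):
    name = "echo"
    description = "Returns the message it was given"

    async def execute(self, arguments: dict) -> List[str]:
        return [str(arguments.get("message", ""))]


result = asyncio.run(SketchToolRegistry.create("echo").execute({"message": "hi"}))
```

Because registration happens when the class body is evaluated, importing a tool module is enough to make the tool available, which is consistent with a loader that imports every file in `tools/`.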
## Deployment Guide
### Environment Variable Configuration
Configure the following environment variables in the `.env` file:
```bash
# Server Configuration
MCP_SERVER_PORT=8000 # Server port
MCP_SERVER_HOST=0.0.0.0 # Server host
# MaxKB Configuration
MAXKB_HOST=http://host.docker.internal:8080 # MaxKB API host address
MAXKB_CHAT_ID=your_chat_id_here # MaxKB chat ID
MAXKB_APPLICATION_ID=your_application_id_here # MaxKB application ID
MAXKB_AUTHORIZATION=your_authorization_key # MaxKB authorization key
# Debug Mode
DEBUG=false # Whether to enable debug mode
# User Agent
MCP_USER_AGENT="MCP Test Server (github.com/modelcontextprotocol/python-sdk)"
# Local Directory Mount Configuration
HOST_MOUNT_SOURCE=/path/to/your/local/directory # Local directory path
HOST_MOUNT_TARGET=/host_files # Mount path inside the container
```
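
For orientation, server-related variables like these are typically read once at startup. The sketch below shows one way to do that with the standard library; the `ServerSettings` dataclass and `load_settings` helper are hypothetical, and the fallback values simply mirror the sample file above rather than documented defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:
    host: str
    port: int
    debug: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> ServerSettings:
    """Build settings from environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    return ServerSettings(
        # Fallbacks below copy the sample .env values (an assumption).
        host=env.get("MCP_SERVER_HOST", "0.0.0.0"),
        port=int(env.get("MCP_SERVER_PORT", "8000")),
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes", "on"},
    )
```

Accepting an optional mapping keeps the function easy to test without modifying the real process environment.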
### Local Directory Mounting
The framework supports mounting local directories into the container so that tools can access local files. Configuration method:
1. Set the `HOST_MOUNT_SOURCE` and `HOST_MOUNT_TARGET` environment variables in the `.env` file
2. `HOST_MOUNT_SOURCE` is the directory path on your local machine
3. `HOST_MOUNT_TARGET` is the mount path inside the container (default is `/host_files`)
When using tools, you can directly reference the local file path, and the framework will automatically convert it to the path inside the container. For example:
```
# Use the PDF tool to process a local file
parse_pdf "/Users/username/Documents/example.pdf"
# With HOST_MOUNT_SOURCE=/Users/username/Documents, the framework
# automatically converts the path to the container path:
# "/host_files/example.pdf"
```
This way, you can easily access local files without modifying the tool code.
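
The path conversion amounts to replacing the `HOST_MOUNT_SOURCE` prefix with `HOST_MOUNT_TARGET`. A minimal sketch, assuming POSIX-style paths and a hypothetical helper name `to_container_path` (the framework's own conversion logic is not shown in this README):

```python
from pathlib import PurePosixPath


def to_container_path(host_path: str, mount_source: str,
                      mount_target: str = "/host_files") -> str:
    """Map a host path under mount_source to its path under mount_target."""
    path = PurePosixPath(host_path)
    try:
        relative = path.relative_to(PurePosixPath(mount_source))
    except ValueError:
        # The path is outside the mounted directory; leave it unchanged.
        return str(path)
    return str(PurePosixPath(mount_target) / relative)
```

For example, with `mount_source="/Users/username/Documents"`, the path `/Users/username/Documents/example.pdf` maps to `/host_files/example.pdf`. Files outside the mounted directory are not visible inside the container, so the sketch returns those paths untouched.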
### Docker Deployment (Recommended)
1. Initial Setup:
```bash
# Clone the repository
git clone https://github.com/your-username/mcp-framework.git
cd mcp-framework
# Create environment file
cp .env.example .env
```
2. Using Docker Compose:
```bash
# Build and start
docker compose up --build -d
# View logs
docker compose logs -f
# Manage containers
docker compose ps
docker compose pause
docker compose unpause
docker compose down
```
3. Access the service:
   - SSE endpoint: http://localhost:8000/sse
4. Cursor IDE Configuration:
   - Settings → Features → Add MCP Server
   - Type: "sse"
   - URL: `http://localhost:8000/sse`
### Traditional Python Deployment
1. Install system dependencies:
```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y poppler-utils tesseract-ocr tesseract-ocr-chi-sim
# macOS
brew install poppler tesseract tesseract-lang
# Windows
# 1. Download and install Tesseract: https://github.com/UB-Mannheim/tesseract/wiki
# 2. Add Tesseract to system PATH
```
2. Install Python dependencies:
```bash
# Create a virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
.\venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
```
3. Start the service:
```bash
python -m mcp_tool
```
## Dependencies
Main dependencies:
- `mcp`: Model Context Protocol implementation
- `PyMuPDF`: PDF document processing
- `python-docx`: Word document processing
- `pandas` and `openpyxl`: Excel file processing
- `httpx`: Asynchronous HTTP client
- `anyio`: Asynchronous I/O support
- `click`: Command-line interface
## Contribution Guide
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.