# MCP PDF Server
MCP PDF Server is a Model Context Protocol (MCP) server for managing PDF files and extracting their text.
I am an embedded developer, and I built this project so that AI coding tools such as Cursor can read, summarize, and answer questions about PDF datasheets directly. Its main purpose is to let the AI quickly understand the contents of a datasheet and supply the needed information immediately.
This project consists of two main components:
- **manager_server**: A FastAPI-based web application that lets users upload, download, and manage PDF files through a web UI, and that exposes a RESTful API for integration with external systems.
- **mcp_server**: Provides file-name search and text extraction for the PDF files managed by manager_server. The extracted text is made available to external tools (e.g., Cursor) through MCP.
Main Features:
- PDF text extraction (local file and URL support)
- PDF search based on file name
- PDF list query and management
- PDF file web upload/download support
- RESTful API and web service provision
- Integration with external systems (Curator, Cursor, etc.) via MCP
It integrates with external systems through the RESTful API and web UI, and it can be deployed with Docker or run locally. It is suited to automated management and search of PDF documents such as datasheets, papers, and contracts.
## Key Features
- Extracts text from local PDF files and from PDFs accessible by URL
- Lists the PDF files under `/app/datasheets`
- Searches PDFs by file name
- Uses PyPDF2 for text extraction, with exception handling for unreadable files
- Exposes standardized MCP tools built on FastMCP
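The PyPDF2-based extraction with exception handling could look roughly like the sketch below. It is not the project's actual code: the function name `extract_pdf_text` and the error-string return convention are assumptions made for illustration. Only `PdfReader`, `reader.pages`, and `page.extract_text()` are PyPDF2 calls.

```python
from pathlib import Path


def extract_pdf_text(path: str) -> str:
    """Return the text of a PDF, or an error message if it cannot be read.

    Hypothetical helper: the name and the error format are assumptions.
    """
    pdf_path = Path(path)
    if not pdf_path.is_file():
        return f"Error: file not found: {pdf_path}"
    try:
        # Imported here so the path check above works even without PyPDF2.
        from PyPDF2 import PdfReader

        reader = PdfReader(str(pdf_path))
        # extract_text() can return None or "" for image-only pages.
        pages = [page.extract_text() or "" for page in reader.pages]
        return "\n".join(pages)
    except Exception as exc:  # corrupt or encrypted PDFs, missing library, etc.
        return f"Error: could not read {pdf_path.name}: {exc}"
```

Returning an error string instead of raising keeps the result usable as a tool response, so the calling AI tool sees a readable message rather than a stack trace.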
## Run with Docker
1. **Build Image**
```bash
docker build -t mcp-pdf-server:1.0.0 .
```
2. **Run Container**
```bash
docker run -d \
-v /host/path/data:/app/datasheets \
-p 5050:5050 \
-p 5080:5080 \
--name mcp-pdf-server \
mcp-pdf-server:1.0.0
```
- PDF files placed in `/host/path/data` on the host are available at `/app/datasheets` inside the container.
- The container publishes ports 5050 and 5080. The local-run command in this README serves manager_server on 5080.
3. **Using docker-compose**
```bash
# First replace /path/to/your/datasheets in docker-compose.yml with the actual PDF folder path.
docker-compose up -d --build
```
## Run Locally (Python)
1. **Install Dependencies**
```bash
pip install -r requirements.txt
```
2. **Run Server**
```bash
# Start the MCP server
python mcp_server/mcp_pdf_server.py
# or start manager_server (web UI and REST API)
uvicorn manager_server.main:app --host 0.0.0.0 --port 5080
```
## MCP Tool (API) Description
- **read_local_pdf**
Takes a local PDF file path and returns the extracted text.
- **read_url_pdf**
Takes the URL of a PDF file and returns the extracted text.
- **server_pdf_list**
Returns a list of all PDF files under `/app/datasheets`.
- **server_pdf_search**
Searches the server's PDF files by file name and extracts the text of the matching PDF.
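As a rough illustration of what the listing and file-name search behind `server_pdf_list` and `server_pdf_search` might do, here is a minimal sketch using only the standard library. The helper names `list_pdfs` and `search_pdfs`, the recursive scan, and the case-insensitive substring match are all assumptions. The real server exposes its tools through FastMCP, which is left out here.

```python
from pathlib import Path

# Assumption: the data root is the path named in this README.
DATA_DIR = Path("/app/datasheets")


def list_pdfs(root: Path = DATA_DIR) -> list[str]:
    """Return the sorted relative paths of all PDF files under root."""
    if not root.is_dir():
        return []
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")  # assumption: subfolders are included
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def search_pdfs(keyword: str, root: Path = DATA_DIR) -> list[str]:
    """Return PDFs whose file name contains keyword, ignoring case."""
    needle = keyword.lower()
    return [rel for rel in list_pdfs(root) if needle in Path(rel).name.lower()]
```

With these helpers, a call such as `search_pdfs("stm32")` would return every PDF whose file name contains "stm32" in any letter case. The text of each match could then be extracted in a second step.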
## Path Guide
- PDF files must be located in `/app/datasheets` (inside the Docker container).
- When using Docker, mount the host's PDF folder to `/app/datasheets`.
- The source code is located in `/app/mcp_server` inside the container.
## License
Apache License 2.0
Author: Dev91