# MCP Video Digest
<div align="right">
<a href="README_EN.md">English</a> | <b>中文</b>
</div>
## Project Introduction
MCP Video Digest is a video content processing service that extracts audio from videos on platforms such as YouTube, Bilibili, TikTok, and Twitter and transcribes it to text. It supports multiple transcription providers (Deepgram, Gladia, Speechmatics, and AssemblyAI) and selects one according to the API keys you have configured. (This is my first MCP practice project, built mainly to get familiar with the MCP development and operation workflow.)
## Features
- Supports downloading and audio extraction from streaming content on over 1000 websites
- Multiple transcription service providers supported:
  - Deepgram
  - Gladia
  - Speechmatics
  - AssemblyAI
- Flexible service selection mechanism that automatically chooses a service based on available API keys
- Asynchronous processing design to improve concurrency performance
- Comprehensive error handling and logging
- Supports speaker separation
- Not yet supported: local model processing with CPU/GPU acceleration
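
The asynchronous design mentioned in the feature list can be illustrated with a minimal sketch. It is not the project's actual code: `process_video` is a hypothetical stand-in for the download and transcription pipeline, and `max_concurrent` is an assumed tuning parameter.

```python
import asyncio


async def process_video(url: str) -> str:
    """Hypothetical stand-in for the download + transcription pipeline."""
    await asyncio.sleep(0.01)  # simulates network-bound work
    return f"transcript for {url}"


async def process_many(urls: list[str], max_concurrent: int = 3) -> list[str]:
    """Process several URLs concurrently, capping the number in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(url: str) -> str:
        async with semaphore:
            return await process_video(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(bounded(u) for u in urls))
```

Capping concurrency with a semaphore is one way to stay within provider rate limits while still overlapping network waits.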
## Directory Structure
```
.
├── src/ # Source code directory
│ ├── services/ # Service implementation directory
│ │ ├── download/ # Download service
│ │ └── transcription/ # Transcription service
│ ├── main.py # Main program logic
│ └── __init__.py # Package initialization file
├── config/ # Configuration file directory
├── test.py # Test script
├── run.py # Service startup script
├── pyproject.toml # Project configuration and dependency management
├── uv.lock # UV dependency lock file
└── .env # Environment variable configuration
```
## Installation Instructions
### 1. Install uv or use Python
If you haven't installed uv yet, you can use the following command to install it:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
### 2. Clone the project:
```bash
git clone https://github.com/R-lz/mcp-video-digest.git
cd mcp-video-digest
```
### 3. Create and activate a virtual environment:
```bash
uv venv
source .venv/bin/activate # Linux/Mac
# or
.venv\Scripts\activate # Windows
```
### 4. Install dependencies:
```bash
uv pip install -e .
```
> Debugging Speechmatics with plain `requests` ran into various issues (not a problem with Speechmatics, just my lack of experience), so the project uses the Speechmatics SDK instead.
## Configuration Instructions
1. Create a `.env` file in the project root directory or rename `.env.example`, and configure the required API keys:
```
mv .env.example .env
# Modify
DEEPGRAM_API_KEY=your_deepgram_key
GLADIA_API_KEY=your_gladia_key
SPEECHMATICS_API_KEY=your_speechmatics_key
ASSEMBLYAI_API_KEY=your_assemblyai_key
```
Note: At least one service's API key must be configured.
2. Service priority order:
   - Deepgram (recommended for Chinese content)
   - Gladia
   - Speechmatics
   - AssemblyAI
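
The priority order above, combined with the rule that at least one key must be configured, could be sketched as follows. This is an illustrative assumption about how selection might work, not the project's actual implementation; `PROVIDER_PRIORITY` and `select_provider` are hypothetical names, while the environment variable names come from the `.env` example.

```python
import os

# Priority order as listed above; variable names match the .env example.
PROVIDER_PRIORITY = [
    ("deepgram", "DEEPGRAM_API_KEY"),
    ("gladia", "GLADIA_API_KEY"),
    ("speechmatics", "SPEECHMATICS_API_KEY"),
    ("assemblyai", "ASSEMBLYAI_API_KEY"),
]


def select_provider(env=None) -> str:
    """Return the first provider in priority order that has a non-empty key."""
    env = os.environ if env is None else env
    for name, key_var in PROVIDER_PRIORITY:
        if env.get(key_var, "").strip():
            return name
    raise RuntimeError(
        "No transcription API key configured; set at least one in .env"
    )
```

For example, with only `GLADIA_API_KEY` and `ASSEMBLYAI_API_KEY` set, this sketch would pick Gladia because it comes first in the list.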
## Usage
1. Start the service:
```bash
uv run src/main.py
```
Or use debug mode:
```bash
UV_DEBUG=1 uv run src/main.py
```
2. Call the service:
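```python
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def process_video():
    # Assumes the server listens on localhost:8000; adjust the host as needed
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "get_video_content",
                {"url": "https://www.youtube.com/watch?v=video_id"},
            )
            print(result)

asyncio.run(process_video())
```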
3. Example client configuration for SSE:
```json
{
  "mcpServers": {
    "video_digest": {
      "url": "http://<ip>:8000/sse"
    }
  }
}
```
You can also pass the key from the client:
```json
"env": {
  "DEEPGRAM_API_KEY": "your_deepgram_key"
}
```
> You can also modify the startup command to use STDIO; this is unverified and untested. See the [MCP Documentation](https://modelcontextprotocol.io/).
## Testing
Run the test script:
```bash
uv run test.py
# or
python test.py
```
The test script will:
- Validate environment variable configuration
- Test YouTube download functionality
- Test various transcription services
- Test the complete video processing workflow
## Development Guide
1. Add a new transcription service:
   - Create a new service class in the `src/services/transcription/` directory
   - Inherit from the `BaseTranscriptionService` class
   - Implement the `transcribe` method
2. Customize the download service:
   - Modify or add a new downloader in the `src/services/download/` directory
   - Inherit or modify the `YouTubeDownloader` class
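
The steps for adding a transcription service can be sketched as below. The real `BaseTranscriptionService` interface is not shown in this README, so the base class here is a stand-in with an assumed constructor and an assumed `transcribe(audio_path)` signature; `EchoTranscriptionService` is a hypothetical provider used only to show the shape of a subclass.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseTranscriptionService(ABC):
    """Stand-in for the project's base class; the real interface may differ."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    @abstractmethod
    async def transcribe(self, audio_path: str) -> str:
        """Return the transcript text for the audio file at audio_path."""


class EchoTranscriptionService(BaseTranscriptionService):
    """Hypothetical provider that only demonstrates the subclass shape."""

    async def transcribe(self, audio_path: str) -> str:
        # A real implementation would send the file to the provider's API
        # here and return the recognized text.
        await asyncio.sleep(0)
        return f"[transcript of {audio_path}]"
```

In the actual project, check the existing classes in `src/services/transcription/` for the exact method signature and return type before copying this pattern.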
## Dependency Management
- Use `uv pip install package_name` to install new dependencies
- Use `uv pip freeze > requirements.txt` to export the dependency list
- Manage dependencies with `pyproject.toml`, and lock dependency versions with `uv.lock`
## Error Handling
The service will handle the following situations:
- Missing or invalid API keys
- Video download failures
- Audio transcription failures
- Network connection issues
- Service limitations and quotas
## Notes
1. Ensure there is enough disk space for temporary files
2. Be aware of the API usage limits of each service provider
3. It is recommended to use Python 3.11 or higher
4. Temporary files will be automatically cleaned up
5. Using uv can provide faster dependency installation speeds and better dependency management
6. YouTube downloads may require authentication. You can export your cookies to a file named `cookies.txt` in the project root ([a browser extension can generate it quickly](https://chromewebstore.google.com/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc)), or use another authentication method such as `cookies-from-browser`; see [yt-dlp](https://github.com/yt-dlp/yt-dlp).
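
The two authentication options in note 6 map onto yt-dlp's `cookiefile` and `cookiesfrombrowser` options. The sketch below shows one way to build such options; it is not necessarily how this project's downloader is configured. `build_ytdlp_options` and `download_audio` are hypothetical helper names, the output template and the MP3 codec are assumptions, and audio extraction requires ffmpeg to be installed.

```python
from typing import Optional


def build_ytdlp_options(cookie_file: Optional[str] = "cookies.txt",
                        browser: Optional[str] = None) -> dict:
    """Build a yt-dlp options dict for an authenticated audio-only download."""
    opts = {
        "format": "bestaudio/best",
        "outtmpl": "%(id)s.%(ext)s",  # assumed naming scheme
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"},
        ],
    }
    if browser:
        # Same effect as the --cookies-from-browser CLI flag
        opts["cookiesfrombrowser"] = (browser,)
    elif cookie_file:
        # Same effect as the --cookies CLI flag
        opts["cookiefile"] = cookie_file
    return opts


def download_audio(url: str, **kwargs) -> None:
    """Download and extract audio using the options built above."""
    import yt_dlp  # imported lazily so the options helper works without it

    with yt_dlp.YoutubeDL(build_ytdlp_options(**kwargs)) as ydl:
        ydl.download([url])
```

Calling `download_audio(url, browser="chrome")` would read cookies from a local Chrome profile instead of `cookies.txt`.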
## STT Key Application and Free Quota
- [Speechmatics](https://www.speechmatics.com/) offers 8 hours free per month - [Pricing](https://www.speechmatics.com/pricing)
- [Gladia](https://app.gladia.io/) offers 10 hours free per month - [Pricing](https://app.gladia.io/billing)
- [AssemblyAI](https://www.assemblyai.com/) offers a total of $50 free credit - [Pricing](https://www.assemblyai.com/pricing)
- [Deepgram](https://deepgram.com/) offers a total of $200 free credit - [Pricing](https://deepgram.com/pricing)
> The quotas above are for reference only.
## License
This project is licensed under the MIT License.