# Transcriber MCP
Transcriber MCP is a server compliant with the Model Context Protocol (MCP) that converts audio and video files into text. It uses faster-whisper to provide a lightweight, practical transcription server that runs in CPU-only environments.
## Features
- Server implementation compliant with the Model Context Protocol
- Accepts audio and video files (mp3, mp4, wav, mov, avi) and performs transcription
- Outputs results as text files
- Provides a communication interface for MCP clients
## Installation
### Prerequisites
- Python 3.8 or higher
- faster-whisper
- ffmpeg (for audio and video file processing)
### Installation Steps
1. Clone the repository
```bash
git clone https://github.com/yourusername/transcriber-mcp.git
cd transcriber-mcp
```
2. Create a virtual environment and install dependencies
```bash
# Create a virtual environment using uv
uv venv
# Install dependencies
uv pip install -r requirements.txt
```
## Usage
### Starting the Server
```bash
# Start the server using uv
uv run python -m src.main
```
### Verify Functionality Using Client Example
```bash
# Create a test audio file
uv pip install gtts
uv run python -c "from gtts import gTTS; tts = gTTS('This is a test audio file. We will check if the transcription works correctly.', lang='en'); tts.save('test_audio.mp3')"
# Execute transcription
uv run python -m src.client_example test_audio.mp3
```
### Usage from MCP Client
Send a request like the following from an MCP-compliant client:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "transcribe",
  "params": {
    "file_path": "/path/to/your/audio_or_video_file.mp3"
  }
}
```
### Example Response
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "result": "/path/to/output/audio_or_video_file_transcribed.txt"
  }
}
```
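The request/response pair above can be assembled with Python's standard library. This is only a sketch of the message shapes; the actual transport (for MCP, typically JSON-RPC over the server's stdio) depends on your client. The response string below simply mirrors the example response in this README.

```python
import json

# Build the JSON-RPC request shown above (method and params per this README)
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "transcribe",
    "params": {"file_path": "/path/to/your/audio_or_video_file.mp3"},
}
payload = json.dumps(request)

# A real client would write `payload` to the server's stdin and read a reply.
# Parsing a response of the shape shown above:
response = json.loads(
    '{"jsonrpc": "2.0", "id": 1,'
    ' "result": {"result": "/path/to/output/audio_or_video_file_transcribed.txt"}}'
)
assert response["id"] == request["id"]  # match reply to request by id
output_path = response["result"]["result"]  # path of the transcribed text file
```

Note that the transcription result is returned as a file path, not as inline text, so the client reads the output file itself.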
## Configuration with Cline
When used together with Cline, you can trigger transcription through conversation with an LLM.
### Cline Configuration
Add the following settings to the Cline configuration file (usually located at `~/.config/cline/settings/mcp_settings.json`):
```json
{
  "mcpServers": {
    "transcribe": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/transcriber-mcp",
        "python",
        "-m",
        "src.main",
        "--model-size=base"
      ]
    }
  }
}
```
* Adjust the `--directory` path to match your environment.
* `--model-size` accepts "tiny", "base", "small", "medium", or "large".
## Supported File Formats
- Audio files: mp3, wav
- Video files: mp4, mov, avi
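A client-side check against this format list can be sketched as follows. `SUPPORTED_FORMATS` and `is_supported` are hypothetical helpers for illustration, not part of the server:

```python
from pathlib import Path

# Extensions listed above; audio and video are handled alike (via ffmpeg)
SUPPORTED_FORMATS = {".mp3", ".wav", ".mp4", ".mov", ".avi"}

def is_supported(file_path: str) -> bool:
    """Return True if the file's extension is one the server accepts."""
    return Path(file_path).suffix.lower() in SUPPORTED_FORMATS
```

Checking the extension case-insensitively means `talk.MOV` is accepted the same as `talk.mov`.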
## Changing Model Size
You can change the model size to improve transcription accuracy.
```python
# Edit src/transcriber.py
self.model_size = "medium" # Choose from tiny, base, small, medium, large
```
Using a larger model improves accuracy but increases memory usage and loading time.
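The same choice could be made at runtime instead of by editing the source. The sketch below validates the size before constructing the model; the `WhisperModel` call follows faster-whisper's documented constructor, but `make_model` itself is a hypothetical helper:

```python
VALID_SIZES = ("tiny", "base", "small", "medium", "large")

def make_model(size: str = "base"):
    """Validate the requested size, then load the faster-whisper model."""
    if size not in VALID_SIZES:
        raise ValueError(f"model size must be one of {VALID_SIZES}, got {size!r}")
    # Imported lazily so the validation above works without the library installed
    from faster_whisper import WhisperModel
    # int8 compute keeps memory usage modest on CPU-only machines
    return WhisperModel(size, device="cpu", compute_type="int8")
```

Validating before loading gives a clear error for a typo like `"larg"` instead of a failed model download.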
## Future Expansion Plans
- Timestamped transcription
- Multilingual support
- Model switching functionality
## License
This project is licensed under the MIT License. Please refer to the LICENSE file for details.