# Voicevox MCP Server
A Voicevox client server that complies with the Model Context Protocol (MCP).
## Overview
- This project provides an MCP server that uses the Voicevox Engine to synthesize speech and play back the result. It exposes tools that AI clients such as Cursor and Cline can call to convert text to speech and play it.
- Note: By default, Voicevox reads English words letter by letter. There are two ways to avoid this: register a custom dictionary, or convert the input text to Katakana (カタカナ) beforehand.
Instructing the LLM to convert to Katakana automatically does not yet work reliably, and the interface for creating a custom dictionary is not yet implemented.
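
As a rough illustration of the Katakana workaround, the sketch below replaces known English words before the text is sent to the engine. The `KATAKANA` table and the `to_katakana` helper are hypothetical and not part of this project; a usable table would need far more entries.

```python
import re

# Hypothetical lookup table (assumption); extend it with the words you need.
KATAKANA = {"hello": "ハロー", "world": "ワールド"}


def to_katakana(text: str) -> str:
    """Replace known English words with Katakana and leave the rest unchanged."""

    def replace(match: re.Match) -> str:
        word = match.group(0)
        return KATAKANA.get(word.lower(), word)

    # Each run of ASCII letters is treated as one word.
    return re.sub(r"[A-Za-z]+", replace, text)


print(to_katakana("Hello, World!"))
```

Words missing from the table pass through unchanged, so Voicevox would still spell them out letter by letter.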
## Features
- Conversion from text to Audio Query
- Conversion from Audio Query to WAV data
- Playback of generated audio data
- JSON-RPC over stdio interface compliant with MCP
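
The first two features correspond to the Voicevox Engine's `POST /audio_query` and `POST /synthesis` HTTP endpoints. The following is a minimal sketch of that two-step flow using only the standard library; the helper names are illustrative assumptions and do not reflect how this project's code is organized.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:50021"  # default engine address used in this README


def audio_query_request(text: str, speaker: int, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /audio_query request; text and speaker go in the query string."""
    params = urllib.parse.urlencode({"text": text, "speaker": speaker})
    return urllib.request.Request(f"{base_url}/audio_query?{params}", method="POST")


def synthesis_request(query: dict, speaker: int, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /synthesis request; the Audio Query is sent as the JSON body."""
    params = urllib.parse.urlencode({"speaker": speaker})
    return urllib.request.Request(
        f"{base_url}/synthesis?{params}",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, speaker: int) -> bytes:
    """Return WAV bytes for the text. Requires a running Voicevox Engine."""
    with urllib.request.urlopen(audio_query_request(text, speaker)) as resp:
        query = json.load(resp)
    with urllib.request.urlopen(synthesis_request(query, speaker)) as resp:
        return resp.read()
```

With an engine running, `synthesize("こんにちは", 3)` should return WAV data that can be written to a file or passed to an audio player. Playback itself is not shown here.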
## Requirements
- Python 3.10 or higher
- Voicevox Engine (running locally or remotely)
- Required Python packages (see requirements.txt)
## Installation
1. Clone the repository
```bash
git clone https://github.com/yourusername/voicevox-mcp-vc1.git
cd voicevox-mcp-vc1
```
2. Install dependencies
```bash
uv sync
```
3. Start the Voicevox Engine
```bash
# Using CPU version Docker
docker pull voicevox/voicevox_engine:cpu-latest
docker run --rm -p '127.0.0.1:50021:50021' voicevox/voicevox_engine:cpu-latest
```
```bash
# Using GPU version Docker
docker pull voicevox/voicevox_engine:nvidia-latest
docker run --rm --gpus all -p '127.0.0.1:50021:50021' voicevox/voicevox_engine:nvidia-latest
```
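
Before configuring an MCP client, it can help to confirm that the engine is reachable. The sketch below queries the engine's `GET /version` endpoint; the function name `engine_is_up` is an illustrative assumption.

```python
import http.client
import urllib.request


def engine_is_up(base_url: str = "http://127.0.0.1:50021", timeout: float = 2.0) -> bool:
    """Return True if the Voicevox Engine answers GET /version with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/version", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and malformed HTTP responses.
        return False


if __name__ == "__main__":
    print("engine reachable:", engine_is_up())
```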
## Usage
### Cline / Roo Code
- Default startup (Voicevox Engine at http://127.0.0.1:50021, speaker ID 1)
```json
{
  "mcpServers": {
    "voicevox-mcp-light": {
      "disabled": false,
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/full path/voicevox_mcp_light/",
        "python",
        "-m",
        "src.main"
      ],
      "transportType": "stdio",
      "alwaysAllow": [],
      "env": {
        "PULSE_SERVER": "/run/user/1000/pulse/native"
      }
    }
  }
}
```
- Startup with an explicit speaker ID (here `--speaker 8`) and the full path to `uv`; this entry goes inside `mcpServers`
```json
"voicevox-mcp-light": {
  "disabled": false,
  "command": "/full path/uv",
  "args": [
    "run",
    "--directory",
    "/full path/voicevox_mcp_light/",
    "python",
    "-m",
    "src.main",
    "--speaker",
    "8"
  ],
  "transportType": "stdio",
  "alwaysAllow": [],
  "env": {
    "PULSE_SERVER": "/run/user/1000/pulse/native"
  }
}
```
- Set `PULSE_SERVER` to the server path reported on the `Server String` line of the following command's output
```bash
# Check the status of PulseAudio
pactl info
```
On Windows and macOS, the `env` block should not be needed (according to Claude; this has not been tested yet).
Development and testing were done on Ubuntu 22.04.
With Roo Code, the MCP client prints DEBUG output. This is debug information that was embedded for the LLM during development and has been left in place. It does not appear in Cline.
Claude Desktop has not been verified yet.
#### Options
- `--host`: Host IP address of the Voicevox Engine (default: 127.0.0.1)
- `--port`: Port number of the Voicevox Engine (default: 50021)
- `--speaker`: Voice model ID (default: 3)
See the [voice model ID table](https://github.com/VOICEVOX/voicevox_vvm/blob/main/README.md#%E9%9F%B3%E5%A3%B0%E3%83%A2%E3%83%87%E3%83%ABvvm%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB%E3%81%A8%E5%A3%B0%E3%82%AD%E3%83%A3%E3%83%A9%E3%82%AF%E3%82%BF%E3%83%BC%E3%82%B9%E3%82%BF%E3%82%A4%E3%83%AB%E5%90%8D%E3%81%A8%E3%82%B9%E3%82%BF%E3%82%A4%E3%83%AB-id-%E3%81%AE%E5%AF%BE%E5%BF%9C%E8%A1%A8) for the available IDs.
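
To make the option semantics concrete, here is a minimal `argparse` sketch that accepts the same three flags with the documented defaults. It is an assumption about how such options could be parsed, not the project's actual parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Voicevox MCP server options")
    parser.add_argument("--host", default="127.0.0.1", help="Voicevox Engine host")
    parser.add_argument("--port", type=int, default=50021, help="Voicevox Engine port")
    parser.add_argument("--speaker", type=int, default=3, help="Voice model ID")
    return parser


# Equivalent to passing "--speaker 8" in the MCP client's "args" list.
args = build_parser().parse_args(["--speaker", "8"])
print(f"http://{args.host}:{args.port}", args.speaker)
```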
### Usage from MCP Client
This server provides a JSON-RPC over stdio interface compliant with MCP.
It can be used from MCP clients like Claude Desktop and Cursor as follows:
```
# Install MCP server (for Claude Desktop)
mcp install src/main.py
# Example of tool invocation
synthesizeAndPlay(message="Hello, World!")
```
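
On the wire, a tool invocation like the one above is a JSON-RPC 2.0 `tools/call` request written to the server's stdin as a single line. The sketch below builds such a message; the helper name `tool_call_message` is illustrative, and a real client must complete the MCP `initialize` handshake before sending it.

```python
import json


def tool_call_message(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the JSON must stay on one line.
    return json.dumps(message, ensure_ascii=False) + "\n"


line = tool_call_message(1, "synthesizeAndPlay", {"message": "Hello, World!"})
print(line, end="")
```

MCP clients such as Cline generate these messages themselves, so this is mainly useful for debugging the server by hand.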
## Development
### Running Tests
```bash
python -m pytest
```
### Code Style
This project adheres to PEP 8 coding standards.
## License
[MIT License](LICENSE)
## Acknowledgments
- [VOICEVOX](https://voicevox.hiroshiba.jp/) - High-quality speech synthesis engine
- [VOICEVOX Engine GitHub](https://github.com/VOICEVOX/voicevox_engine)
- [Model Context Protocol](https://modelcontextprotocol.io/) - Standard protocol for context sharing with LLMs
- [Trying and Creating with Model Context Protocol (MCP)](https://zenn.dev/karaage0703/articles/42f7b0655a6af8) - Basics of MCP in general
- [notion-mcp-light](https://github.com/karaage0703/notion-mcp-light) - Structural analysis of NotionMCP Light for Vibe Coding
- [Speech Synthesis with Python and VOICEVOX](https://zenn.dev/karaage0703/articles/0187d1d1f4d139) - Knowledge about voicevox