<h2 align="center">Your Voice Never Leaves Your Computer</h2>
**VocoType** is a **completely free** desktop voice input method designed for professionals who value privacy and efficiency. All recognition runs locally, so it keeps working without a network connection and uploads no data.
This GitHub project is the open-source **CLI (command-line)** version of the VocoType core engine, intended primarily for developers.
---
### **➡️ Want the Best Experience? Download the Free Desktop Version Now!**
It works out of the box, offers a fuller feature set, and requires no technical background.
**[Visit the official website to download the free, complete VocoType desktop version](https://vocotype.com)**
## Feature Introduction
VocoType is a voice input tool that converts speech into text in real time and automatically enters it into the current application using keyboard shortcuts. It supports MCP voice-to-text, AI-based text optimization, custom replacement dictionaries, and more, making voice input more efficient and accurate.
### 📹 Demo Video
<video controls width="100%">
<source src="https://s1.bib0.com/leilei/i/2025/11/04/5yba.mp4" type="video/mp4">
Your browser does not support video playback.
</video>
## Download
| OS | Download |
|---|---|
| **Windows** | [Download for Windows (x64 installer)](https://github.com/233stone/vocotype-cli/releases/download/v1.5.2/VocoType_1.5.2_x64-setup.exe) |
| **macOS** | [Download for macOS (Universal DMG)](https://github.com/233stone/vocotype-cli/releases/download/v1.5.2/VocoType_1.5.2_Universal.dmg) |
---
## 🤔 What Makes VocoType Different?
| Feature | ✅ **VocoType** | Traditional Cloud-based Input Method | Operating System Built-in |
| :------------- | :--------------------: | :---------------: | :-------------: |
| **Privacy and Security** | **Local, offline, no upload** | ❌ Data must be uploaded to the cloud | ⚠️ Complex privacy policy |
| **Network Dependency** | **No network connection needed** | ❌ Requires a network connection | ❌ Strongly dependent on the network |
| **Response Speed** | **On the order of 0.1 s** | Slow, affected by network speed | Slow, affected by network speed |
| **Customization** | **Powerful custom dictionary** | Weak or none | Little or none |
## ✅ Core Features
- **Complete graphical user interface**: Out-of-the-box, all operations are clear and intuitive.
- **System-level global input**: Dictate directly into any text box in any application.
- **Custom dictionary**: Supports adding up to 20 common terms and names to improve recognition accuracy.
- **100% offline operation**: Absolute privacy and data security.
- **Flagship-level recognition engine**: Accurately recognizes mixed Chinese and English content.
- **AI-assisted optimization**: Choose among multiple AI models and use customizable prompt templates to automatically correct wrong characters, homophone errors, and spoken self-corrections. Spoken correction cues (e.g., "not correct", "change to") are recognized, making the output text more accurate and fluent.
_(For professional users with higher demands, the application provides an option to upgrade to the Pro version to unlock advanced features like unlimited dictionaries.)_
## 🎯 Suitable for Various Professional Scenarios
Whether you are a writer, lawyer, scholar, or gamer, or simply handling daily office work, VocoType can be your trusted productivity partner.
| User | Scenario |
| :------------------ | :--------------------------------------------------------------------------------------------- |
| **Writers and Creators** | Write articles and novels or organize meeting minutes. Thoughts become text instantly, so you can focus on the writing itself. |
| **Law and Medical Professionals** | Handle highly sensitive client information or medical records with 100% offline processing to keep data secure. Custom dictionaries handle industry terminology with ease. |
| **Students and Scholars** | Quickly record class notes, organize interview recordings, and write academic papers. Spend less time typing and more energy on thinking and research. |
| **Developers and Programmers** | Whether you are pair programming with an AI or writing technical documents, professional terms like `function` and `Kubernetes pod` are recognized accurately. |
| **Gamers** | In intense matches, send messages to teammates by voice without pausing the game, keeping your rhythm and improving team coordination. |
## ✨ VocoType Core Engine Features
_All VocoType versions share the same powerful core engine._
- **🛡️ 100% Offline, No Privacy Concerns**: All voice recognition is done locally on your computer.
- **⚡️ Flagship-level Recognition Engine**: Accurately recognizes mixed Chinese and English content, so you no longer need repeated corrections.
- **⚙️ Highly Customizable**: A unique replacement dictionary gets names, places, and industry terms right on the first try.
- **💻 Lightweight Design**: Requires only 700 MB of memory and runs inference on the CPU alone, with no need for an expensive graphics card.
- **🚀 0.1-Second-Level Response**: Speech turns into text almost instantly, so waiting never interrupts your train of thought.
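
As a rough illustration of the replacement dictionary idea, the sketch below post-processes recognized text by swapping known misrecognitions for the intended terms. It is not VocoType's actual implementation: the function name `apply_replacements`, the dictionary format, and the case-insensitive, longest-match-first behavior are all assumptions made for this example.

```python
import re


def apply_replacements(text: str, replacements: dict[str, str]) -> str:
    """Replace every dictionary key found in `text` with its mapped value.

    Hypothetical helper: the matching rules here are illustrative assumptions,
    not taken from the VocoType source.
    """
    # Ignore empty keys, which would otherwise match at every position.
    keys = sorted((k for k in replacements if k), key=len, reverse=True)
    if not keys:
        return text
    # Longest keys come first so a longer phrase wins over a shorter prefix.
    pattern = re.compile("|".join(re.escape(k) for k in keys), flags=re.IGNORECASE)
    lookup = {k.lower(): v for k, v in replacements.items() if k}
    # Fall back to the matched text if case folding produced an unknown key.
    return pattern.sub(lambda m: lookup.get(m.group(0).lower(), m.group(0)), text)


if __name__ == "__main__":
    fixes = {"cooper netties": "Kubernetes"}
    print(apply_replacements("deploy the cooper netties pod", fixes))
```

Sorting keys by length keeps overlapping entries predictable: when both a phrase and one of its prefixes are in the dictionary, the phrase is tried first.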
---
## 🛠️ CLI Version Installation Guide (For Developers)
**Please note:** This version is intended for developers with some technical background. If you are not familiar with the command line, we strongly recommend visiting the official website to download the easy-to-use **VocoType free desktop version**.
### 1. Environment Dependencies
- Python 3.12
- We strongly recommend using `uv` or `venv` to create a virtual environment.
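
Before installing, it can help to confirm which interpreter the virtual environment actually uses. The snippet below is a minimal sketch that assumes the requirement means the 3.12 series specifically; the helper name `is_supported_python` is made up for this example and is not part of the project.

```python
import sys

REQUIRED = (3, 12)  # assumption: the README's "Python 3.12" means major.minor 3.12


def is_supported_python(version=None) -> bool:
    """Return True if the given (or current) interpreter is in the 3.12 series."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == REQUIRED


if __name__ == "__main__":
    status = "OK" if is_supported_python() else "expected Python 3.12"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```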
### 2. Clone and Installation
```bash
# 1. Clone the repository
git clone https://github.com/233stone/vocotype-cli.git
cd vocotype-cli
# 2. (Recommended) Create and activate a virtual environment
pip install uv
uv venv --python 3.12
source .venv/bin/activate # macOS/Linux
# or .\.venv\Scripts\activate (Windows)
# 3. Install dependencies
uv pip install -r requirements.txt
# 4. Run
python main.py
# Save dataset and run
python main.py --save-dataset
```
> **Model Download**: On first run, the program automatically downloads about 500 MB of model files. Please make sure you have a stable network connection.
## 🌐 Volcengine BigASR Streaming Recognition Backend (Optional)
In addition to the default local FunASR offline engine, VocoType CLI can also use [Volcengine's BigASR streaming speech recognition](https://www.volcengine.com/docs/6561/1354869) as a cloud recognition backend.
### Advantages
| Feature | Local FunASR | Volcengine BigASR |
|:--|:--:|:--:|
| Network Requirements | None | Requires a network connection |
| Model Download | ~500 MB | None |
| Response Latency | Local inference | Very low latency (cloud) |
| Recognition Quality | High | Flagship-level large model |
| Data Privacy | Completely offline | Audio sent to Volcengine |
### Configuration Steps
1. Log in to the [Volcengine Console](https://console.volcengine.com/speech/app) and create a speech application to obtain an **App Key** and an **Access Key**.
2. Create `config.json` in the project directory:
```json
{
"backend": "volcengine",
"volcengine": {
"app_key": "YOUR_APP_KEY",
"access_key": "YOUR_ACCESS_KEY",
"resource_id": "volc.bigasr.sauc.duration",
"enable_punc": true,
"enable_itn": true
}
}
```
3. Start with the `--config` parameter:
```bash
python main.py --config config.json
```
> **Note**: When you use the Volcengine backend, recorded audio is sent to Volcengine's servers for recognition, so the tool is no longer completely offline. If you have strict privacy requirements, keep using the default local FunASR backend.
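
A quick sanity check of `config.json` before launching can catch leftover placeholders. The sketch below is not part of VocoType CLI: the helper names are hypothetical, and treating `app_key`, `access_key`, and `resource_id` as required is an assumption based only on the example configuration above.

```python
import json
from pathlib import Path

# Assumption: these keys are treated as required because the sample config sets them.
REQUIRED_KEYS = ("app_key", "access_key", "resource_id")
PLACEHOLDERS = {"YOUR_APP_KEY", "YOUR_ACCESS_KEY"}


def check_volcengine_config(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means the config looks usable."""
    problems = []
    if config.get("backend") != "volcengine":
        problems.append('"backend" is not set to "volcengine"')
    section = config.get("volcengine")
    if not isinstance(section, dict):
        problems.append('missing "volcengine" section')
        return problems
    for key in REQUIRED_KEYS:
        value = section.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f'"volcengine.{key}" is missing or empty')
        elif value in PLACEHOLDERS:
            problems.append(f'"volcengine.{key}" still holds the placeholder value')
    for key in ("enable_punc", "enable_itn"):
        if key in section and not isinstance(section[key], bool):
            problems.append(f'"volcengine.{key}" should be true or false')
    return problems


def load_and_check(path: str = "config.json") -> list[str]:
    """Read a JSON config file from disk and run the checks above on it."""
    return check_volcengine_config(json.loads(Path(path).read_text(encoding="utf-8")))
```

Calling `load_and_check("config.json")` from the project directory and printing the returned list is enough to see whether anything needs fixing before running `python main.py --config config.json`.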
## Frequently Asked Questions (FAQ)
**Q: Is my data safe?**
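> A: **Yes, with the default local backend.** All voice recognition is done locally and offline, and your audio data is not uploaded to any server. Audio leaves your computer only if you explicitly switch to the optional Volcengine backend.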
## 📞 Contact Us
- **Bugs and Suggestions**: Please use GitHub Issues first.
- **Follow us for the latest updates**: [https://vocotype.com](https://vocotype.com)
## 🙏 Acknowledgements
VocoType would not exist without the following excellent open-source projects:
- **[FunASR](https://github.com/modelscope/FunASR)** - The open-source speech recognition framework from Alibaba's DAMO Academy, which gives VocoType its powerful offline speech recognition capabilities.
- **[QuQu](https://github.com/yan5xu/ququ)** - An excellent open-source project that provided important technical references and inspiration for VocoType.
Thanks to these open-source communities for their generous contributions!