<div align="center">




An AI agent for video processing built on MCP (Model Context Protocol). It integrates NVIDIA NIM, FFmpeg, and web search to provide a natural-language video editing experience.
[🚀 Quick Start](#-quick-start) • [📖 User Guide](#-user-guide) • [🛠️ API Documentation](#️-api-documentation)
</div>
## ✨ Project Features
### 🎯 Core Highlights
- **🤖 Natural Language Interaction**: Describe what you need in natural language, and the AI selects and runs suitable tools automatically.
- **🎬 Professional Video Processing**: A complete video editing toolchain built on FFmpeg.
- **🌐 Modern Web Interface**: Responsive design with drag-and-drop upload and real-time preview.
- **⚡ Streaming Response**: Processing progress and the AI's reasoning are displayed in real time.
### 🛠️ Supported Video Operations
| Feature | Description | Example Command |
|------|------|----------|
| 📹 **Video Information** | Get details such as duration, resolution, and encoding | "Get detailed information about video.mp4" |
| ✂️ **Smart Cutting** | Cut clips precisely by time range | "Cut from 30 seconds to 1 minute" |
| 🔗 **Seamless Merging** | Concatenate multiple video files into one | "Merge these three videos into one" |
| 📐 **Resolution Adjustment** | Scale video and convert resolution | "Adjust the video to 1080p" |
| 🎭 **Picture-in-Picture Effect** | Overlay one video on another to create picture-in-picture | "Add a small window in the top right corner of the main video" |
| 🎵 **Audio Extraction** | Extract the audio track from a video | "Extract the background music from the video" |
| 🖼️ **Frame Extraction** | Extract still frames at a given frame rate | "Extract one frame per second" |
| ▶️ **Preview Playback** | Preview results in the built-in video player | "Play the processed video" |
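
As a rough illustration of what a request like "Cut from 30 seconds to 1 minute" could translate to, the sketch below builds an FFmpeg argument list for a clip. The helper name `build_clip_command` and its parameters are hypothetical and not taken from this project; the project's own `clip_video` tool may construct its command differently.

```python
from typing import List


def build_clip_command(video_path: str, start: float, end: float,
                       output_path: str) -> List[str]:
    """Return an ffmpeg argument list that cuts the range start..end (seconds).

    Hypothetical helper for illustration only; not part of this project.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return [
        "ffmpeg",
        "-y",                          # overwrite the output file if it exists
        "-ss", f"{start:.3f}",         # seek to the start time before reading input
        "-i", video_path,              # input file
        "-t", f"{end - start:.3f}",    # clip duration in seconds
        "-c", "copy",                  # stream copy: fast, but cuts snap to keyframes
        output_path,
    ]


cmd = build_clip_command("uploads/video.mp4", 30, 60, "outputs/clip.mp4")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed. Stream copy avoids re-encoding; dropping `-c copy` trades speed for frame-accurate cuts.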
## 📁 Project Architecture
```text
mcp_demo/
├── 🌐 Web Frontend Layer
│   ├── static/
│   │   ├── index.html              # Main Interface - Modern Responsive Design
│   │   ├── demo_separated.html     # AI Dialogue Demo Page
│   │   ├── test_stream.html        # Streaming Response Test Page
│   │   ├── style.css               # Style File - CSS Grid + Flexbox
│   │   └── script.js               # Frontend Logic - Native ES6+
│   └── app.py                      # FastAPI Web Server
│
├── 🤖 AI Processing Layer
│   ├── ffmpeg_mcp_demo.py          # MCP Client Core
│   ├── ffmpeg_mcp_config.py        # Configuration Management
│   └── demo_web.py                 # Web Demo Script
│
├── 🎬 Video Processing Layer (Submodule)
│   └── ffmpeg-mcp/                 # FFmpeg MCP Server
│       └── src/ffmpeg_mcp/
│           ├── server.py           # MCP Protocol Server
│           ├── cut_video.py        # Video Processing Core Algorithm
│           ├── ffmpeg.py           # FFmpeg Command Encapsulation
│           ├── typedef.py          # Type Definition and Data Structure
│           └── utils.py            # Tool Function Library
│
├── 📁 Data Storage Layer
│   ├── uploads/                    # User Upload Files
│   └── outputs/                    # Processing Result Output
│
└── ⚙️ Configuration Files
    ├── pyproject.toml              # Project Dependency and Configuration
    ├── uv.lock                     # Dependency Version Locking
    ├── .gitmodules                 # Git Submodule Configuration
    └── env.example                 # Environment Variable Template
```
## 🚀 Quick Start
### 📋 Environment Requirements
- **Python**: 3.12+ (Recommended 3.12.7)
- **Package Manager**: [uv](https://docs.astral.sh/uv/) (Modern Python Package Manager)
- **System Tools**: Git, FFmpeg
- **API Key**: NVIDIA API Key
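
Before installing, it can help to confirm that the tools listed above are actually on `PATH`. The following is a minimal sketch using only the standard library; the function names `missing_tools` and `python_is_supported` are illustrative and not part of this project.

```python
import shutil
import sys
from typing import Iterable, List, Tuple


def missing_tools(tools: Iterable[str] = ("git", "ffmpeg", "uv")) -> List[str]:
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_is_supported(minimum: Tuple[int, int] = (3, 12)) -> bool:
    """Return True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


if not python_is_supported():
    print("Python 3.12+ is required, found", sys.version.split()[0])

missing = missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools were found on PATH")
```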
### 🔧 Installation Steps
#### 1️⃣ Clone Project
```bash
# Clone main project
git clone https://github.com/JackyHua23/mcp_demo.git
cd mcp_demo
# Initialize submodule
git submodule update --init --recursive
```
#### 2️⃣ Install Dependencies
```bash
# Install main project dependencies using uv
uv sync
# Install FFmpeg MCP submodule dependencies
cd ffmpeg-mcp
uv sync
cd ..
```
#### 3️⃣ Configure Environment Variables
```bash
# Copy environment variable template
cp env.example .env
# Edit configuration file
nano .env
```
**Environment Variable Configuration:**
```bash
# NVIDIA API key (required). Get one at https://build.nvidia.com/
NVIDIA_API_KEY="your_nvidia_api_key_here"
```
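
A startup check along these lines can catch a missing or placeholder key early. This is a sketch, not the project's own loading code: it assumes the variable has already been exported into the process environment (how the project reads `.env` is not shown here), and `load_nvidia_api_key` is a hypothetical name.

```python
import os
from typing import Mapping

PLACEHOLDER = "your_nvidia_api_key_here"  # value shipped in the template above


def load_nvidia_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return NVIDIA_API_KEY from env, rejecting empty or placeholder values."""
    key = env.get("NVIDIA_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            "NVIDIA_API_KEY is not set; copy env.example to .env and fill it in"
        )
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(load_nvidia_api_key({"NVIDIA_API_KEY": "example-key"}))
```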
#### 4️⃣ Start Application
```bash
# Method 1: Use demo script to start (recommended)
uv run python demo_web.py
# Method 2: Directly start FastAPI application
uv run python app.py
# Method 3: Use uvicorn to start (development mode)
uv run uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```
🎉 **Access Application**: http://localhost:8000
## 💻 User Guide
### 🌐 Web Interface Operation
#### 📤 File Upload
1. **Drag-and-Drop Upload**: Drag video files onto the upload area on the left
2. **Click Upload**: Click the upload button to select files
3. **Format Support**: MP4, AVI, MOV, MKV, WMV, FLV, WebM
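
A filename check against the formats listed above might look like the sketch below. The constant and function names are illustrative; the server's actual validation logic is not shown in this document and may differ (for example, it may inspect file contents rather than extensions).

```python
from pathlib import Path

# Extensions corresponding to the supported formats listed above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".wmv", ".flv", ".webm"}


def is_supported_video(filename: str) -> bool:
    """Return True if the filename ends in a supported video extension."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ("holiday.MP4", "talk.webm", "notes.txt"):
    print(name, is_supported_video(name))
```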
#### 💬 Intelligent Dialogue
Enter natural language instructions in the chat area on the right:
```text
✅ Supported Instruction Examples:
• "Get detailed information about the current video"
• "Cut 1 minute of content starting at 30 seconds"
• "Adjust the video resolution to 1920x1080"
• "Extract the audio from the video and save it as MP3"
• "Add a watermark effect to the video"
```
#### ⚡ Quick Operations
Use preset buttons for quick execution of common operations:
- 🔍 **Get Information** - View detailed video parameters
- ✂️ **Smart Cutting** - Quickly cut video clips
- 🎵 **Extract Audio** - Export audio files
- 📐 **Adjust Size** - Modify video resolution
### 🖥️ Command Line Usage
#### Basic Example
```python
import asyncio

from ffmpeg_mcp_demo import FFmpegMCPClient


async def main():
    client = FFmpegMCPClient()
    # Natural language processing
    response = await client.process_video_request(
        "Cut 30 seconds of content from uploads/video.mp4 starting from 10 seconds"
    )
    print(response)


asyncio.run(main())
```
#### Advanced Configuration
```python
from ffmpeg_mcp_config import FFmpegMCPConfig
from ffmpeg_mcp_demo import FFmpegMCPClient

# Custom configuration
config = FFmpegMCPConfig(
    api_key="your_nvidia_api_key",
    model="nvidia/llama-3.1-nemotron-ultra-253b-v1",
    base_url="https://integrate.api.nvidia.com/v1",
)

client = FFmpegMCPClient(
    api_key=config.api_key,
    model=config.model,
    base_url=config.base_url,
)
```
## 🛠️ API Documentation
### 🌐 Web API Endpoints
| Method | Endpoint | Description | Parameters |
|------|------|------|------|
| `GET` | `/` | Main page | - |
| `GET` | `/demo` | AI dialogue demo page | - |
| `POST` | `/api/upload` | File upload | `file: UploadFile` |
| `GET` | `/api/files` | Get file list | - |
| `POST` | `/api/process` | Process video request | `message: str, video_path?: str` |
| `POST` | `/api/process-stream` | Streaming process request | `message: str, video_path?: str` |
| `GET` | `/api/tools` | Get available tools | - |
| `GET` | `/api/download/{type}/{filename}` | File download | `type: str, filename: str` |
| `DELETE` | `/api/files/{type}/{filename}` | File deletion | `type: str, filename: str` |
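
The endpoints above can be called from any HTTP client. The sketch below uses only the standard library to prepare a `POST /api/process` request and a download URL. It rests on assumptions the table does not confirm: that `/api/process` accepts a JSON body (it may expect form data instead), that the server runs at `http://localhost:8000`, and that `outputs` is a valid `type` value. The helper names are illustrative.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # assumed local development address


def build_process_request(message: str, video_path: Optional[str] = None,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST /api/process request with a JSON body."""
    payload = {"message": message}
    if video_path is not None:
        payload["video_path"] = video_path  # optional parameter per the table
    return urllib.request.Request(
        url=f"{base_url}/api/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def download_url(file_type: str, filename: str, base_url: str = BASE_URL) -> str:
    """Build a /api/download/{type}/{filename} URL with percent-encoded segments."""
    return f"{base_url}/api/download/{quote(file_type, safe='')}/{quote(filename, safe='')}"


request = build_process_request(
    "Get detailed information of current video", "uploads/video.mp4"
)
print(request.get_method(), request.full_url)
print(download_url("outputs", "my clip.mp4"))
```

With the server running, `urllib.request.urlopen(request)` sends the prepared request. Percent-encoding each path segment keeps filenames with spaces or slashes from altering the route.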
### 🎬 FFmpeg MCP Tools
| Tool Name | Function Description | Parameter Description |
|----------|----------|----------|
| `find_video_path` | Intelligent video file search | `root_path`, `video_name` |
| `get_video_info` | Get detailed video information | `video_path` |
| `clip_video` | Precise video clip cutting | `video_path`, `start`, `end/duration`, `output_path?` |
| `concat_videos` | Seamless merging of multiple videos | `input_files[]`, `output_path?`, `fast?` |
| `scale_video` | Video resolution adjustment | `video_path`, `width`, `height`, `output_path?` |
| `overlay_video` | Video overlay effect | `background_video`, `overlay_video`, `position?`, `dx?`, `dy?` |
| `extract_audio_from_video` | Audio track extraction | `video_path`, `output_path?`, `audio_format?` |
| `extract_frames_from_video` | Video frame extraction | `video_path`, `fps?`, `output_folder?`, `format?` |
| `play_video` | Video preview playback | `video_path`, `speed?`, `loop?` |
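
MCP clients invoke tools such as these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for `get_video_info`. The helper `build_tool_call` is hypothetical, and the file path is a placeholder; in practice an MCP client library typically constructs and sends these messages for you.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def build_tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


message = build_tool_call("get_video_info", {"video_path": "uploads/video.mp4"})
print(json.dumps(message, indent=2))
```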
## 🎯 Technical Stack Details
### 🔧 Backend Technology
- **FastAPI**: High-performance asynchronous web framework that generates API documentation automatically
- **MCP**: Model Context Protocol, a standard protocol for AI tool invocation
- **NVIDIA NIM**: Enterprise-grade AI inference service, supporting Llama 3.1 Nemotron
- **FFmpeg**: Industry-standard multimedia processing tool
- **uv**: Next-generation Python package manager, advertised as 10-100x faster than pip
### 🎨 Frontend Technology
- **HTML5**: Semantic markup, supporting drag-and-drop API
- **CSS3**: Modern styling, Grid + Flexbox layout, CSS animations
- **JavaScript ES6+**: Native JavaScript, Fetch API, WebSocket
- **Font Awesome**: Vector icon library
### 📦 Dependency Management
- **pyproject.toml**: Modern Python project configuration standard
- **uv.lock**: Pins dependency versions for consistent installs
- **Git Submodules**: Modular code management