# Tool List

<div align="center">

![Python](https://img.shields.io/badge/Python-3.12+-blue.svg) ![FastAPI](https://img.shields.io/badge/FastAPI-0.115+-green.svg) ![MCP](https://img.shields.io/badge/MCP-1.6+-orange.svg) ![License](https://img.shields.io/badge/License-MIT-yellow.svg)

An intelligent video processing AI-Agent based on MCP (Model Context Protocol), integrating NVIDIA NIM, FFmpeg, and Web search functionality, providing a natural language video editing experience.

[🚀 Quick Start](#-quick-start) • [📖 User Guide](#-user-guide) • [🛠️ API Documentation](#️-api-documentation)

</div>

## ✨ Project Features

### 🎯 Core Highlights

- **🤖 Natural Language Interaction**: Describe requirements in natural language, and AI automatically selects suitable tools for execution.
- **🎬 Professional Video Processing**: Complete video editing tool chain based on FFmpeg.
- **🌐 Modern Web Interface**: Responsive design, supporting drag-and-drop upload and real-time preview.
- **⚡ Streaming Response**: Real-time display of processing progress and AI thinking process.

### 🛠️ Supported Video Operations

| Function | Description | Example Command |
|------|------|----------|
| 📹 **Video Information** | Get details such as duration, resolution, and encoding | "Get detailed information of video.mp4" |
| ✂️ **Smart Cutting** | Cut video clips precisely by time range | "Cut from 30 seconds to 1 minute" |
| 🔗 **Seamless Merging** | Join multiple video files into one | "Merge these three videos into one" |
| 📐 **Resolution Adjustment** | Scale video and convert resolution | "Adjust video to 1080p" |
| 🎭 **Picture-in-Picture Effect** | Overlay one video on another | "Add a small window in the top right corner of the main video" |
| 🎵 **Audio Extraction** | Extract the audio track from a video | "Extract background music from video" |
| 🖼️ **Frame Extraction** | Extract still frames at a given frame rate | "Extract one picture per second" |
| ▶️ **Preview Playback** | Preview with the built-in video player | "Play processed video" |

## 📁 Project Architecture

```text
mcp_demo/
├── 🌐 Web Frontend Layer
│   ├── static/
│   │   ├── index.html              # Main interface - modern responsive design
│   │   ├── demo_separated.html     # AI dialogue demo page
│   │   ├── test_stream.html        # Streaming response test page
│   │   ├── style.css               # Stylesheet - CSS Grid + Flexbox
│   │   └── script.js               # Frontend logic - native ES6+
│   └── app.py                      # FastAPI web server
│
├── 🤖 AI Processing Layer
│   ├── ffmpeg_mcp_demo.py          # MCP client core
│   ├── ffmpeg_mcp_config.py        # Configuration management
│   └── demo_web.py                 # Web demo script
│
├── 🎬 Video Processing Layer (Submodule)
│   └── ffmpeg-mcp/                 # FFmpeg MCP server
│       └── src/ffmpeg_mcp/
│           ├── server.py           # MCP protocol server
│           ├── cut_video.py        # Core video processing logic
│           ├── ffmpeg.py           # FFmpeg command wrappers
│           ├── typedef.py          # Type definitions and data structures
│           └── utils.py            # Utility functions
│
├── 📁 Data Storage Layer
│   ├── uploads/                    # User-uploaded files
│   └── outputs/                    # Processing results
│
└── ⚙️ Configuration Files
    ├── pyproject.toml              # Project dependencies and configuration
    ├── uv.lock                     # Locked dependency versions
    ├── .gitmodules                 # Git submodule configuration
    └── env.example                 # Environment variable template
```

## 🚀 Quick Start

### 📋 Environment Requirements

- **Python**: 3.12+ (3.12.7 recommended)
- **Package Manager**: [uv](https://docs.astral.sh/uv/) (modern Python package manager)
- **System Tools**: Git, FFmpeg
- **API Key**: NVIDIA API key

### 🔧 Installation Steps

#### 1️⃣ Clone Project

```bash
# Clone main project
git clone https://github.com/JackyHua23/mcp_demo.git
cd mcp_demo

# Initialize submodule
git submodule update --init --recursive
```

#### 2️⃣ Install Dependencies

```bash
# Install main project dependencies using uv
uv sync

# Install FFmpeg MCP submodule dependencies
cd ffmpeg-mcp
uv sync
cd ..
```

#### 3️⃣ Configure Environment Variables

```bash
# Copy environment variable template
cp env.example .env

# Edit configuration file
nano .env
```

**Environment Variable Configuration:**

```bash
# NVIDIA API key (required); obtain one at https://build.nvidia.com/
NVIDIA_API_KEY="your_nvidia_api_key_here"
```

#### 4️⃣ Start Application

```bash
# Method 1: Use demo script to start (recommended)
uv run python demo_web.py

# Method 2: Directly start FastAPI application
uv run python app.py

# Method 3: Use uvicorn to start (development mode)
uv run uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```

🎉 **Access Application**: http://localhost:8000

## 💻 User Guide

### 🌐 Web Interface Operation

#### 📤 File Upload

1. **Drag-and-Drop Upload**: Drag video files onto the upload area on the left
2. **Click Upload**: Click the upload button to select files
3. **Format Support**: MP4, AVI, MOV, MKV, WMV, FLV, WebM

#### 💬 Intelligent Dialogue

Enter natural language instructions in the chat area on the right:

```text
✅ Supported Instruction Examples:
• "Get detailed information of current video"
• "Cut 1 minute of content starting from 30 seconds"
• "Adjust video resolution to 1920x1080"
• "Extract audio from video and save as MP3 format"
• "Add watermark effect to video"
```

#### ⚡ Quick Operations

Use the preset buttons to run common operations quickly:

- 🔍 **Get Information** - View detailed video parameters
- ✂️ **Smart Cutting** - Quickly cut video clips
- 🎵 **Extract Audio** - Export audio files
- 📐 **Adjust Size** - Modify video resolution

### 🖥️ Command Line Usage

#### Basic Example

```python
import asyncio

from ffmpeg_mcp_demo import FFmpegMCPClient


async def main():
    client = FFmpegMCPClient()

    # Send a natural language request; the client picks the matching tool
    response = await client.process_video_request(
        "Cut 30 seconds of content from uploads/video.mp4 starting from 10 seconds"
    )
    print(response)


asyncio.run(main())
```

#### Advanced Configuration

```python
from ffmpeg_mcp_config import FFmpegMCPConfig
from ffmpeg_mcp_demo import FFmpegMCPClient

# Custom configuration
config = FFmpegMCPConfig(
    api_key="your_nvidia_api_key",
    model="nvidia/llama-3.1-nemotron-ultra-253b-v1",
    base_url="https://integrate.api.nvidia.com/v1"
)

client = FFmpegMCPClient(
    api_key=config.api_key,
    model=config.model,
    base_url=config.base_url
)
```

## 🛠️ API Documentation

### 🌐 Web API Endpoints

| Method | Endpoint | Description | Parameters |
|------|------|------|------|
| `GET` | `/` | Main page | - |
| `GET` | `/demo` | AI dialogue demo page | - |
| `POST` | `/api/upload` | File upload | `file: UploadFile` |
| `GET` | `/api/files` | Get file list | - |
| `POST` | `/api/process` | Process video request | `message: str, video_path?: str` |
| `POST` | `/api/process-stream` | Streaming process request | `message: str, video_path?: str` |
| `GET` | `/api/tools` | Get available tools | - |
| `GET` | `/api/download/{type}/{filename}` | File download | `type: str, filename: str` |
| `DELETE` | `/api/files/{type}/{filename}` | File deletion | `type: str, filename: str` |

### 🎬 FFmpeg MCP Tools

| Tool Name | Function Description | Parameter Description |
|----------|----------|----------|
| `find_video_path` | Search for a video file by name | `root_path`, `video_name` |
| `get_video_info` | Get detailed video information | `video_path` |
| `clip_video` | Cut a video clip precisely | `video_path`, `start`, `end/duration`, `output_path?` |
| `concat_videos` | Merge multiple videos | `input_files[]`, `output_path?`, `fast?` |
| `scale_video` | Adjust video resolution | `video_path`, `width`, `height`, `output_path?` |
| `overlay_video` | Overlay one video on another | `background_video`, `overlay_video`, `position?`, `dx?`, `dy?` |
| `extract_audio_from_video` | Extract the audio track | `video_path`, `output_path?`, `audio_format?` |
| `extract_frames_from_video` | Extract video frames | `video_path`, `fps?`, `output_folder?`, `format?` |
| `play_video` | Preview playback | `video_path`, `speed?`, `loop?` |

## 🎯 Technical Stack Details

### 🔧 Backend Technology

- **FastAPI**: High-performance asynchronous web framework that generates API documentation automatically
- **MCP**: Model Context Protocol, a standard protocol for AI tool invocation
- **NVIDIA NIM**: Enterprise-level AI inference service, supporting Llama 3.1 Nemotron
- **FFmpeg**: Industry-standard multimedia processing tool
- **uv**: Next-generation Python package manager, 10-100 times faster than pip

### 🎨 Frontend Technology

- **HTML5**: Semantic markup, supporting the drag-and-drop API
- **CSS3**: Modern styling, Grid + Flexbox layout, CSS animations
- **JavaScript ES6+**: Native JavaScript, Fetch API, WebSocket
- **Font Awesome**: Vector icon library

### 📦 Dependency Management

- **pyproject.toml**: Modern Python project configuration standard
- **uv.lock**: Ensures consistent dependency versions
- **Git Submodules**: Modular code management
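
### 🧪 Example: Calling the Web API from Python

As a rough illustration of the Web API endpoints table, the sketch below builds (but does not send) requests for `/api/process` and `/api/download/{type}/{filename}` using only the standard library. Several details are assumptions rather than documented behavior: the README lists `message` and `video_path` as parameters but not their encoding, so a JSON body is assumed here; `outputs` as a value for `{type}` is guessed from the `uploads/` and `outputs/` directory names; and the helper names `build_process_request` and `build_download_url` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Default address from the Quick Start section.
BASE_URL = "http://localhost:8000"


def build_process_request(message, video_path=None, base_url=BASE_URL):
    """Build a POST request for /api/process without sending it.

    Assumption: the endpoint accepts a JSON body. The README only names the
    parameters (`message`, optional `video_path`), not how they are encoded.
    """
    payload = {"message": message}
    if video_path is not None:
        payload["video_path"] = video_path
    return Request(
        url=f"{base_url}/api/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_download_url(file_type, filename, base_url=BASE_URL):
    """Build the URL for GET /api/download/{type}/{filename}.

    Both path segments are percent-encoded so that names containing spaces
    or slashes cannot change the shape of the path.
    """
    return (
        f"{base_url}/api/download/"
        f"{quote(file_type, safe='')}/{quote(filename, safe='')}"
    )
```

To send the request against a running server, pass the result of `build_process_request(...)` to `urllib.request.urlopen`. If the server turns out to expect form fields instead of JSON, only the body encoding and the `Content-Type` header would need to change.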
