# LLM-MCP-RAG Experimental Project
> This project is a Python implementation based on [KelvinQiu802/llm-mcp-rag](https://github.com/KelvinQiu802/llm-mcp-rag) for learning and practicing LLM, MCP, and RAG technologies.
>
> The author of this project has a demonstration video available at https://www.bilibili.com/video/BV1dcRqYuECf/
>
> It is highly recommended to read the original README first, since this repository adjusts some of its logic and naming!
## Project Overview
This is an experimental project built around Large Language Models (LLM), the Model Context Protocol (MCP), and Retrieval-Augmented Generation (RAG). It demonstrates how to build an AI assistant that can interact with external tools and leverage retrieval-augmented generation.
### Core Features
- Large Language Model calls based on OpenAI API
- Interaction between LLM and external tools via MCP (Model Context Protocol)
- Implementation of a retrieval-augmented generation (RAG) system based on vector retrieval
- Support for file system operations and web content retrieval
## System Architecture
```mermaid
graph TD
A[User] -->|Ask| B[Agent]
B -->|Call| C[LLM]
C -->|Generate Answer/Tool Call| B
B -->|Tool Call| D[MCP Client]
D -->|Execute| E[MCP Server]
E -->|File System Operations| F[File System]
E -->|Web Retrieval| G[Web Content]
H[Documents/Knowledge Base] -->|Embed| I[Vector Store - In-Memory]
B -->|Query| I
I -->|Relevant Context| B
```
## Main Components
```mermaid
classDiagram
class Agent {
+mcp_clients: list[MCPClient]
+model: str
+llm: AsyncChatOpenAI
+system_prompt: str
+context: str
+init()
+cleanup()
+invoke(prompt: str)
}
class MCPClient {
+name: str
+command: str
+args: list[str]
+version: str
+init()
+cleanup()
+get_tools()
+call_tool(name: str, params: dict)
}
class AsyncChatOpenAI {
+model: str
+messages: list
+tools: list[Tool]
+system_prompt: str
+context: str
+chat(prompt: str, print_llm_output: bool)
+get_tools_definition()
+append_tool_result(tool_call_id: str, tool_output: str)
}
class EmbeddingRetriever {
+embedding_model: str
+vector_store: VectorStore
+embed_query(query: str)
+embed_documents(document: str)
+retrieve(query: str, top_k: int)
}
class VectorStore {
+items: list[VectorStoreItem]
+add(item: VectorStoreItem)
+search(query_embedding: list[float], top_k: int)
}
class ALogger {
+prefix: str
+title(text: str, rule_style: str)
}
Agent --> MCPClient
Agent --> AsyncChatOpenAI
Agent ..> EmbeddingRetriever
EmbeddingRetriever --> VectorStore
Agent ..> ALogger
AsyncChatOpenAI ..> ALogger
```
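The in-memory `VectorStore` above can be approximated in a few lines. The sketch below is illustrative rather than the project's actual code (see `src/augmented/vector_store.py`); it assumes cosine similarity as the ranking metric and that each stored item pairs an embedding with its source document:

```python
import math
from dataclasses import dataclass, field


@dataclass
class VectorStoreItem:
    embedding: list[float]
    document: str


@dataclass
class VectorStore:
    items: list[VectorStoreItem] = field(default_factory=list)

    def add(self, item: VectorStoreItem) -> None:
        self.items.append(item)

    def search(self, query_embedding: list[float], top_k: int = 3) -> list[str]:
        # Rank stored items by cosine similarity to the query embedding
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self.items,
            key=lambda it: cosine(query_embedding, it.embedding),
            reverse=True,
        )
        return [it.document for it in ranked[:top_k]]
```

In the real pipeline, `EmbeddingRetriever` produces the embeddings via the configured embedding API before calling `search`.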
## Quick Start
### Environment Setup
1. Ensure that Python 3.12 or higher is installed.
2. Clone this repository.
3. Copy `.env.example` to `.env` and fill in the necessary configuration information:
- `OPENAI_API_KEY`: OpenAI API key
- `OPENAI_BASE_URL`: OpenAI API base URL; be sure to keep the trailing `/v1` (default is `https://api.openai.com/v1`)
- `DEFAULT_MODEL_NAME`: (optional) Default model name to use (default is "gpt-4o-mini")
- `EMBEDDING_KEY`: (optional) Embedding model API key (default is $OPENAI_API_KEY)
- `EMBEDDING_BASE_URL`: (optional) Embedding model API base URL, e.g., the SiliconFlow API or any other OpenAI-compatible endpoint (default is `$OPENAI_BASE_URL`)
- `USE_CN_MIRROR`: (optional) Whether to use the China mirror; set to any value (e.g., `1`) to enable (default is disabled)
- `PROXY_URL`: (optional) Proxy URL (e.g., `http(s)://xxx`), used so the `fetch` MCP tool routes requests through the proxy
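For reference, a minimal `.env` might look like the following (all values are placeholders, not working credentials; the commented-out lines are optional):

```
OPENAI_API_KEY=sk-xxxx
OPENAI_BASE_URL=https://api.openai.com/v1
DEFAULT_MODEL_NAME=gpt-4o-mini
# Optional: separate embedding endpoint (defaults to the two values above)
# EMBEDDING_KEY=sk-xxxx
# EMBEDDING_BASE_URL=https://api.siliconflow.cn/v1
# USE_CN_MIRROR=1
# PROXY_URL=http://127.0.0.1:7890
```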
### Install Dependencies
```bash
# Use uv to install dependencies
uv sync
```
### Run Examples
This project uses the `just` command tool to run different examples:
```bash
# View available commands
just help
```
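A `justfile` is a collection of named shell recipes. The recipe names below are purely illustrative (run `just help` to see the actual targets defined in this repository); they only show the general shape of such a file:

```just
# Hypothetical recipes; the real targets live in the repository's justfile
help:
    just --list

rag:
    uv run python src/augmented/rag_example.py
```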
## RAG Example Workflow
```mermaid
sequenceDiagram
participant User as User
participant Agent as Agent
participant LLM as LLM
participant ER as EmbeddingRetriever
participant VS as VectorStore
participant MCP as MCP Client
participant Logger as ALogger
User->>Agent: Provide query
Agent->>Logger: Log operation
Agent->>ER: Retrieve relevant documents
ER->>VS: Query vector store
VS-->>ER: Return relevant documents
ER-->>Agent: Return context
Agent->>LLM: Send query and context
LLM-->>Agent: Generate answer or tool call
Agent->>Logger: Log tool call
Agent->>MCP: Execute tool call
MCP-->>Agent: Return tool result
Agent->>LLM: Send tool result
LLM-->>Agent: Generate final answer
Agent-->>User: Return answer
```
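The retrieve-augment-generate loop in the diagram can be condensed into a short control-flow sketch. The stub retriever and the lambda LLM below are toy stand-ins for `EmbeddingRetriever` and `AsyncChatOpenAI`; only the orchestration pattern mirrors the project:

```python
class StubRetriever:
    """Stands in for EmbeddingRetriever: returns canned documents."""

    def __init__(self, docs: list[str]):
        self.docs = list(docs)

    def retrieve(self, query: str, top_k: int = 3) -> list[str]:
        return self.docs[:top_k]


def rag_invoke(query: str, retriever, llm) -> str:
    # 1. Retrieve: fetch the most relevant documents for the query
    context = "\n".join(retriever.retrieve(query))
    # 2. Augment: prepend the retrieved context to the user prompt
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    # 3. Generate: the LLM answers (tool calls are handled by the Agent loop)
    return llm(prompt)


retriever = StubRetriever(["Paris is the capital of France."])
print(rag_invoke("What is the capital of France?", retriever, lambda p: p))
```

In the real `Agent.invoke`, step 3 may also return a tool call, in which case the Agent executes it via an `MCPClient` and feeds the result back to the LLM before producing the final answer.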
## Project Structure
- `src/augmented/`: Main source code directory
- `agent.py`: Implementation of Agent, responsible for coordinating LLM and tools
- `chat_openai.py`: OpenAI API client wrapper
- `mcp_client.py`: Implementation of MCP client
- `embedding_retriever.py`: Implementation of embedding retriever
- `vector_store.py`: Implementation of vector store
- `mcp_tools.py`: Definition of MCP tools
- `utils/`: Utility functions
- `info.py`: Project information and configuration
- `pretty.py`: Unified logging output system
- `rag_example.py`: RAG example program
- `justfile`: Task running configuration file
## Learning Resources
- [Model Context Protocol (MCP)](https://modelcontextprotocol.io/): Learn about the MCP protocol
- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference): OpenAI API reference
- [RAG (Retrieval-Augmented Generation)](https://arxiv.org/abs/2005.11401): RAG technical paper