Content
Simplified Chinese | [English](./README.en.md)
## Text2Sql.Net - A .NET Implementation of Natural Language to SQL
### Project Background
Text2Sql.Net is a natural language to SQL tool based on the .NET platform, designed to help developers and data analysts quickly generate database query statements through simple natural language descriptions. The project combines large language models (LLM) with traditional SQL parsing techniques and supports various mainstream databases.
### Core Features
- Natural Language to SQL: Input everyday language descriptions to automatically generate corresponding SQL query statements.
- Multi-Database Support: Compatible with SQL Server, MySQL, PostgreSQL, and SQLite.
- Intelligent Context Understanding: Understand user query intentions based on chat history.
- Vector Search Integration: Supports semantic similarity search.
- Syntax Validation: Automatically checks the correctness of generated SQL syntax.
- **MCP Protocol Support**: Seamless integration with IDE tools (Cursor, Trae, etc.).
- **Intelligent Q&A Example System**: Enhances SQL generation accuracy through example learning.
## Technical Architecture





Configuration file. The project supports running with sqlite or pgsql, and supports configuring SqlService, MySql, PgSql, and Sqlite for Text2Sql.
```
"Text2SqlOpenAI": {
"Key": "your secret key",
"EndPoint": "https://api.antsk.cn/",
"ChatModel": "gpt-4o",
"EmbeddingModel": "text-embedding-ada-002"
},
"Text2SqlConnection": {
"DbType": "Sqlite", //PostgreSQL
"DBConnection": "Data Source=text2sql.db",
"VectorConnection": "text2sqlmem.db",
"VectorSize": 1536 //PostgreSQL needs to be set, sqlite can be left unset
}
```
You are also welcome to join our WeChat group. You can add my WeChat: **xuzeyu91** to send a request to join the group.
### Core Modules
**Database Adaptation Layer**
**Vector Database Integration**
- Multi-database support implemented based on the Strategy Pattern
- Standard operations defined through the IDatabaseProvider interface
- Dynamically load corresponding database drivers (SQLite/Postgres/MySql/SqlServer)
- Automatically generate SQL statements specific to the database dialect
## Core Processing Flow
```mermaid
flowchart TD
A[User inputs natural language query] --> B{Select database connection}
B -->|Not selected| C[Prompt to select a database]
B -->|Selected| D[Save user message to chat history]
D --> E[Semantic search to obtain relevant Schema]
E --> F[Query vector database]
F --> G[Relevance scoring and table association inference]
G --> H[Construct Schema context]
H --> I[Call LLM to generate SQL]
I --> J[Use Semantic Kernel plugin]
J --> K[SQL security check]
K -->|Query statement| L[Automatically execute SQL]
K -->|Operation statement| M[Only generate SQL<br/>do not execute automatically]
L --> N{Was execution successful?}
N -->|Successful| O[Return query results]
N -->|Failed| P[SQL optimization]
P --> Q[Use error information to optimize SQL]
Q --> R[Re-execute optimized SQL]
R --> S[Return final results]
M --> T[Prompt for manual execution]
O --> U[Save response to chat history]
S --> U
T --> U
U --> V[Display results to user]
style A fill:#e1f5fe
style V fill:#e8f5e8
style K fill:#fff3e0
style P fill:#fce4ec
```
## Schema Training and Vector Search Process
```mermaid
flowchart TD
A[Database Connection Configuration] --> B[Schema Training Service]
B --> C[Extract Database Table Structure]
C --> D[Obtain Table/Column/Foreign Key Information]
D --> E[Generate Table Description Text]
E --> F[Text Vectorization]
F --> G[Store in Vector Database]
G --> H{Vector Storage Type}
H -->|SQLite| I[SQLiteMemoryStore]
H -->|PostgreSQL| J[PostgresMemoryStore with pgvector]
I --> K[Schema Training Completed]
J --> K
K --> L[Waiting for User Query]
L --> M[Semantic Search]
M --> N[Relevance Matching]
N --> O[Return Relevant Table Structure]
style A fill:#e3f2fd
style F fill:#f3e5f5
style G fill:#e8f5e8
style M fill:#fff3e0
```
## System Architecture Diagram
```mermaid
flowchart LR
subgraph "User Interface Layer"
A[Blazor Frontend Page]
B[Database Connection Selection]
C[Chat Input Box]
D[SQL Result Display]
end
subgraph "Service Layer"
E[ChatService<br/>Chat Service]
F[SchemaTrainingService<br/>Schema Training Service]
G[SemanticService<br/>Semantic Service]
H[SqlExecutionService<br/>SQL Execution Service]
I[QAExampleService<br/>Q&A Example Service]
J[McpServer<br/>MCP Protocol Server]
end
subgraph "Data Access Layer"
K[DatabaseConnectionRepository<br/>Database Connection Repository]
L[ChatMessageRepository<br/>Chat Message Repository]
M[DatabaseSchemaRepository<br/>Schema Repository]
N[SchemaEmbeddingRepository<br/>Vector Embedding Repository]
O[QAExampleRepository<br/>Q&A Example Repository]
end
subgraph "External Services"
P[OpenAI API<br/>LLM Service]
Q[Vector Database<br/>SQLite/PostgreSQL]
R[Business Database<br/>Multi-database Support]
S[MCP Client<br/>IDE Tool Integration]
end
A --> E
B --> K
C --> E
D --> H
E --> F
E --> G
E --> H
E --> L
E --> I
F --> M
F --> N
G --> Q
H --> K
H --> R
I --> O
J --> S
E --> P
G --> P
style A fill:#e1f5fe
style E fill:#f3e5f5
style P fill:#fff3e0
style Q fill:#e8f5e8
style J fill:#fce4ec
style I fill:#e3f2fd
```
## 🔧 MCP Protocol Integration
### Model Context Protocol (MCP) Support
Text2Sql.Net integrates the Model Context Protocol and can serve as an MCP server to provide Text2SQL functionality for various AI development tools.
#### Supported MCP Tools
- `get_database_connections`: Retrieve all database connection configurations
- `get_database_schema`: Retrieve database table structure information
- `generate_sql`: Generate SQL queries based on natural language
- `execute_sql`: Execute SQL query statements
- `get_chat_history`: Retrieve chat history
- `get_table_structure`: Retrieve detailed structure of a specified table
- `get_all_tables`: Retrieve information of all tables
#### IDE Integration Configuration
In IDEs that support MCP (such as Cursor, Trae, etc.), you can connect to Text2Sql.Net with the following configuration:
```json
{
"mcpServers": {
"text2sql": {
"name": "Text2Sql.Net - sqlserver",
"type": "sse",
"description": "Intelligent Text2SQL service - Supports natural language to SQL query. Compatible with IDEs like Cursor, Trae, etc.",
"isActive": true,
"url": "http://localhost:5000/mcp/sse?connectionId=xxxxxx"
}
}
}
```
Once the configuration is complete, you can directly interact with the database using natural language in the IDE:
- "Show the structure of all user tables"
- "Query order data from the past week"
- "Count the number of products in each category"
### MCP Use Cases
1. **Code Development**: Quickly generate data query code in the IDE
2. **Data Analysis**: Rapidly explore data through natural language
3. **Report Generation**: Quickly build complex statistical queries
4. **System Integration**: Integrate Text2SQL capabilities into other toolchains
## 📚 Intelligent Q&A Example System
### QA Example Functionality
Text2Sql.Net provides an intelligent question-and-answer example management system that enhances the accuracy of SQL generation by learning and accumulating examples.
#### Core Features
- **Example Management**: Supports manual creation and correction of generated Q&A examples
- **Semantic Search**: Matches relevant examples based on vector similarity
- **Categorical Organization**: Supports classification of basic queries, complex queries, aggregate queries, etc.
- **Usage Statistics**: Tracks the usage frequency and effectiveness of examples
- **Batch Operations**: Supports bulk enabling, disabling, and deletion of examples
#### Example Classification Description
- **Basic Query**: Simple SELECT statements and basic filtering
- **Complex Query**: Complex scenarios such as multi-table joins and subqueries
- **Aggregate Query**: Queries that include aggregate functions like GROUP BY, SUM, COUNT, etc.
- **Join Query**: Multi-table JOIN operations
- **Correction Example**: Examples generated from corrected erroneous SQL
#### Intelligent Matching Mechanism
When a user inputs a query, the system will:
1. Vectorize the user's question
2. Perform semantic search in the example library
3. Return the most relevant examples (default relevance threshold 0.7)
4. Provide the relevant examples as context to the LLM
5. Update example usage statistics
#### Example Format
```json
{
"question": "Query the number of active users in the last month",
"sqlQuery": "SELECT COUNT(DISTINCT user_id) FROM user_activities WHERE activity_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH)",
"category": "aggregate",
"description": "Count the number of distinct users with activity records in the last 30 days"
}
```
### Example Creation Methods
1. **Manual Creation**: Directly add Q&A pairs in the management interface.
2. **Correction Generation**: Automatically create examples after the user corrects errors in SQL generation.
3. **Batch Import**: Support batch generation of examples from existing query history.
## More Rag Scenarios Available at AntSK
Project Address: [AntSK](https://github.com/AIDotNet/AntSK)
Experience Environment:
[Demo Address](https://demo.antsk.cn)
Account: test
Password: test
You are also welcome to join our WeChat group. You can add my WeChat: **antskpro** to request joining the group.
Connection Info
You Might Also Like
MarkItDown MCP
Converting files and office documents to Markdown.
Time
Obtaining current time information and converting time between different...
Filesystem
Model Context Protocol Servers
Sequential Thinking
Offers a structured approach to dynamic and reflective problem-solving,...
Git
Model Context Protocol Servers
Context 7
Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors