# Tool List
MCP server toolset for Alibaba Cloud Container Service for Kubernetes (ACK): ack-mcp-server.
This toolset unifies ACK cluster and resource management, Kubernetes native operations, observability, security auditing, diagnostic inspection, and other O&M capabilities into a standardized AI-native toolset.
The capabilities of this toolset are integrated into [Alibaba Cloud Container Service Intelligent Assistant](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/use-container-ai-assistant-for-troubleshooting-and-intelligent-q-a). Based on the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro), it can also be integrated with third-party AI agents ([kubectl-ai](https://github.com/GoogleCloudPlatform/kubectl-ai/blob/main/pkg/mcp/README.md#local-stdio-based-server-configuration), [Qwen Code](https://qwenlm.github.io/qwen-code-docs/en/tools/mcp-server/#using-qwen-mcp-to-manage-mcp-server), [Claude Code](https://docs.claude.com/en/docs/claude-code/mcp), [Cursor](https://cursor.com/en/docs/context/mcp/directory), [Gemini CLI](https://github.com/google-gemini/gemini-cli/blob/main/docs/tools/mcp-server.md#configure-the-mcp-server-in-settingsjson), [VS Code](https://code.visualstudio.com/docs/copilot/chat/mcp-servers#_add-an-mcp-server), and others) or called from automation systems.
It lets users complete complex container O&M tasks by talking to AI assistants in natural language, and helps them build their own AIOps systems for container scenarios.
- [1. Overview & Function Introduction](#-1-overview--function-introduction)
- [2. How to Use & Deploy](#-2-how-to-use--deploy)
- [3. How to Develop and Run Locally](#-3-how-to-develop-and-run-locally)
- [4. How to Participate in Community Contributions](#-4-how-to-participate-in-community-contributions)
- [5. Effects & Benchmark](#-5-effects--benchmark-under-continuous-construction)
- [6. Evolution Plan & Roadmap](#-6-evolution-plan--roadmap)
- [7. Frequently Asked Questions](#7-frequently-asked-questions)
## 🌟 1. Overview & Function Introduction
### 🎬 1.1 Demo Effect
https://github.com/user-attachments/assets/9e48cac3-0af1-424c-9f16-3862d047cc68
### 🎯 1.2 Core Functions
**Alibaba Cloud ACK Full Life Cycle Resource Management**
- Cluster query (`list_clusters`)
- Node resource management, node pool scaling (planned)
- Component addon management (planned)
- Cluster creation and deletion (planned)
- Cluster upgrade (planned)
- Cluster resource O&M task query (planned)
**Kubernetes Native Operations** (`ack_kubectl`)
- Run `kubectl`-style operations (read and write permissions are configurable)
- Get logs and events, and perform CRUD operations on resources
- Supports all standard Kubernetes APIs
**AI-Native Container Scenario Observability**
- **Prometheus**: Queries metrics from the Alibaba Cloud Prometheus instance associated with an ACK cluster or from a self-managed Prometheus, and converts natural language to PromQL (`query_prometheus` / `query_prometheus_metric_guidance`)
- **Cluster control plane log query**: Queries ACK cluster control plane logs stored in SLS, including SLS SQL queries and natural language conversion to SLS SQL (`query_controlplane_logs`)
- **Audit log**: Kubernetes operation audit tracking (`query_audit_log`)
- More [container observability capabilities](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/observability-best-practices) are in progress
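For orientation, a PromQL expression of the kind these tools are described as producing can be sent to any Prometheus-compatible endpoint through the standard HTTP API path `/api/v1/query`. The sketch below only builds the request URL. The helper name `build_instant_query_url`, the endpoint, and the example expression are illustrative assumptions and are not taken from ack-mcp-server.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url: str, promql: str, timeout: str = "30s") -> str:
    """Build a Prometheus HTTP API instant-query URL (GET /api/v1/query)."""
    params = urlencode({"query": promql, "timeout": timeout})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


# Hypothetical endpoint and expression, for illustration only.
url = build_instant_query_url(
    "http://prometheus.example.internal:9090",
    'sum(rate(container_cpu_usage_seconds_total{namespace="default"}[5m])) by (pod)',
)
print(url)
```

The expression is URL-encoded by `urlencode`, so braces, quotes, and spaces in PromQL are safe to pass as-is.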
**Alibaba Cloud ACK Diagnosis and Inspection Function**
- Cluster resource diagnosis (`diagnose_resource`)
- Cluster health inspection (`query_inspect_report`)
**Enterprise-Level Engineering Capabilities**
- 🏗️ Layered architecture: the tool, service, and authentication layers are fully decoupled
- 🔐 Dynamic credential injection: supports request-level AK injection or credentials from the environment
- 📊 Robust error handling: structured error output and typed responses
- 📦 Modular design: each sub-service can run independently
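As a rough illustration of dynamic credential injection, the sketch below resolves credentials per request and falls back to the environment. It is not the project's implementation: the header names `x-access-key-id` and `x-access-key-secret`, the `Credentials` type, and the precedence (request first, then environment) are all assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Credentials:
    access_key_id: str
    access_key_secret: str
    source: str  # "request" or "environment"


def resolve_credentials(
    request_headers: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[Credentials]:
    """Prefer request-level credentials, then fall back to the environment."""
    # Hypothetical header names; the real server may use different ones.
    ak = request_headers.get("x-access-key-id")
    sk = request_headers.get("x-access-key-secret")
    if ak and sk:
        return Credentials(ak, sk, "request")

    # Environment variable names as used elsewhere in this README.
    ak = environ.get("ACCESS_KEY_ID")
    sk = environ.get("ACCESS_KEY_SECRET")
    if ak and sk:
        return Credentials(ak, sk, "environment")

    return None  # No credentials available
```

Passing the environment as a parameter keeps the function easy to test without touching real credentials.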
### 🏆 1.3 Core Advantages
- **🤖 AI Native**: Standardized interface designed for AI agents
- **🔧 Unified Toolset**: One-stop container O&M capability integration
- **⚡ Built-in Expertise**: Best practices for ACK, Kubernetes, and container observability are built in
- **🛡️ Enterprise-Level**: Complete authentication, authorization, and logging mechanisms
- **📈 Scalable**: Plug-in architecture, easy to extend with new functions
### 📈 1.4 Benchmark Effect Verification (Continuously Updated)
AI capability evaluation based on real scenarios, comparing results across multiple AI agents and large models:
| Task Scenario | AI Agent | Large Model | Success Rate | Average Processing Time |
| --- | --- | --- | --- | --- |
| Pod OOM Repair | qwen_code | qwen3-coder-plus | ✅ 100% | 2.3min |
| Cluster Health Check | qwen_code | qwen3-coder-plus | ✅ 95% | 6.4min |
| Resource Abnormal Diagnosis | kubectl-ai | qwen3-32b | ✅ 90% | 4.1min |
| Historical Resource Analysis | qwen_code | qwen3-coder-plus | ✅ 85% | 3.8min |
For the latest benchmark reports, see the [`benchmarks/results/`](benchmarks/results/) directory.
## 🚀 2. How to Use & Deploy
### 💻 2.1 Alibaba Cloud Authentication and Permission Preparation
We recommend configuring ack-mcp-server to authenticate as a sub-account (RAM user) of the main Alibaba Cloud account, following the principle of least privilege. Grant this sub-account the required **RAM permissions and RBAC permissions**.
#### 2.1.1 Required RAM Permission Policy Set
For details on how to grant permissions to a RAM account under an Alibaba Cloud account, see [RAM Permission Policy](https://help.aliyun.com/zh/ram/user-guide/policy-overview).
ack-mcp-server currently requires the following read-only permission set:
- All read-only permissions for Container Service (`cs`)
- All read-only permissions for Log Service (`log`)
- Read-only permissions for Alibaba Cloud Prometheus (`arms`) instances
- Resource change permissions will be added later to support full life cycle resource management
```json
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cs:Check*",
        "cs:Describe*",
        "cs:Get*",
        "cs:List*",
        "cs:Query*",
        "cs:RunClusterCheck",
        "cs:RunClusterInspect"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "arms:GetPrometheusInstance",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["log:Describe*", "log:Get*", "log:List*"],
      "Resource": "*"
    }
  ]
}
```
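To see what the wildcard patterns in this policy cover, the following sketch checks an action name against the `Allow` statements locally. It is a simplified approximation, not how RAM evaluates policies: it ignores `Deny` statements, `Resource`, and `Condition`, and the action names used in the example calls are illustrative.

```python
from fnmatch import fnmatchcase

# The policy document from this section.
POLICY = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cs:Check*", "cs:Describe*", "cs:Get*", "cs:List*",
                "cs:Query*", "cs:RunClusterCheck", "cs:RunClusterInspect",
            ],
            "Resource": "*",
        },
        {"Effect": "Allow", "Action": "arms:GetPrometheusInstance", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["log:Describe*", "log:Get*", "log:List*"],
            "Resource": "*",
        },
    ],
}


def is_allowed(action: str, policy: dict = POLICY) -> bool:
    """Return True if any Allow statement has a pattern matching the action."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        patterns = statement["Action"]
        if isinstance(patterns, str):  # "Action" may be a string or a list
            patterns = [patterns]
        if any(fnmatchcase(action, pattern) for pattern in patterns):
            return True
    return False


print(is_allowed("cs:DescribeClusters"))  # matches "cs:Describe*"
print(is_allowed("cs:DeleteCluster"))     # no matching pattern
```

A check like this can help confirm that a write action is not accidentally covered before attaching the policy to the sub-account.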
#### 2.1.2 Grant RBAC Permissions
Please contact the ACK cluster administrator and refer to the [RBAC authorization documentation](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/grant-rbac-permissions-to-ram-users-or-ram-roles) to grant suitable RBAC permissions to this sub-account.
### 💻 2.2 (Optional) Create an ACK Cluster
- An ACK cluster has been created in the Alibaba Cloud account
- The cluster network is reachable and the Kubernetes cluster access credentials are configured; see the [configuration method](./DESIGN.md#kubernetes-cluster-access-policy). In production, we recommend setting `KUBECONFIG_MODE = ACK_PRIVATE` and accessing the cluster over the internal network.
### 📍 2.3 Deploy and Run ack-mcp-server
#### 2.3.1 Deployment Method 1 - Use Helm to Deploy in a k8s Cluster
Deploy in a Kubernetes cluster:
```bash
# Clone the code repository
git clone https://github.com/aliyun/alibabacloud-ack-mcp-server
cd alibabacloud-ack-mcp-server
# Deploy using Helm
helm install \
--set accessKeyId=<your-access-key-id> \
--set accessKeySecret=<your-access-key-secret> \
--set transport=sse \
ack-mcp-server \
./deploy/helm \
-n kube-system
```
After deployment, expose the service externally through a load balancer.
**Parameter Description**
- `accessKeyId`: AccessKeyId of the Alibaba Cloud account
- `accessKeySecret`: AccessKeySecret of the Alibaba Cloud account
#### 2.3.2 Deployment Method 2 - 📦 Use Docker Image to Deploy ack-mcp-server
```bash
# Pull the image
docker pull registry-cn-beijing.ack.aliyuncs.com/acs/ack-mcp-server:latest
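# Run the container
# Note: --allow-write enables write operations; omit it for read-only use.
# Note: --host 127.0.0.1 binds to the loopback interface inside the container,
# so the port published with -p may not be reachable from the host. If so,
# bind to 0.0.0.0 and configure --allowed-origins.
docker run \
  -d \
  --name ack-mcp-server \
  -e ACCESS_KEY_ID="your-access-key-id" \
  -e ACCESS_KEY_SECRET="your-access-key-secret" \
  -p 8000:8000 \
  registry-cn-beijing.ack.aliyuncs.com/acs/ack-mcp-server:latest \
  python -m main_server --transport sse --host 127.0.0.1 --port 8000 --allow-write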
```
#### 2.3.3 Deployment Method 3 - 💻 Use Binary to Start and Deploy
Download the pre-compiled binary file or build the binary file locally and run:
```bash
# Build the binary file (local build)
make build-binary
# Run
./dist/binary/ack-mcp-server --help
```
### 2.4 MCP Client Configuration
Add the following configuration to the MCP client:
```json
{
  "mcpServers": {
    "ack-mcp-server": {
      "command": "uvx",
      "args": ["alibabacloud-ack-mcp-server@latest"],
      "env": {
        "ACCESS_KEY_ID": "<ak>",
        "ACCESS_KEY_SECRET": "<sk>"
      }
    }
  }
}
```
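If you prefer to generate this configuration rather than edit it by hand, a small script can fill in the credentials from the current environment. This is a convenience sketch only; the function name `build_mcp_config` is hypothetical, and the output has the same shape as the JSON above.

```python
import json
import os


def build_mcp_config(access_key_id: str, access_key_secret: str) -> dict:
    """Return an MCP client configuration entry for ack-mcp-server (stdio via uvx)."""
    return {
        "mcpServers": {
            "ack-mcp-server": {
                "command": "uvx",
                "args": ["alibabacloud-ack-mcp-server@latest"],
                "env": {
                    "ACCESS_KEY_ID": access_key_id,
                    "ACCESS_KEY_SECRET": access_key_secret,
                },
            }
        }
    }


# Fall back to the placeholders when the variables are not set.
config = build_mcp_config(
    os.environ.get("ACCESS_KEY_ID", "<ak>"),
    os.environ.get("ACCESS_KEY_SECRET", "<sk>"),
)
print(json.dumps(config, indent=2))
```

Because the output contains the AccessKey secret, write it only to a file with restricted permissions and keep it out of version control.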
<details>
<summary>Claude Code</summary>
**Install using CLI**
Add ACK MCP server using Claude Code CLI:
```bash
# Either STDIO:
claude mcp add --scope user --env ACCESS_KEY_ID=${ACCESS_KEY_ID} --env ACCESS_KEY_SECRET=${ACCESS_KEY_SECRET} ack-mcp-server -- uvx alibabacloud-ack-mcp-server@latest
# Or HTTP:
claude mcp add --transport http --scope user ack-mcp-server <endpoint>
```
Refer to [Claude Code CLI Add MCP server](https://code.claude.com/docs/en/mcp)
</details>
<details>
<summary>Codex</summary>
Install ACK MCP server using Codex CLI:
```bash
# Either STDIO:
codex mcp add --env ACCESS_KEY_ID=${ACCESS_KEY_ID} --env ACCESS_KEY_SECRET=${ACCESS_KEY_SECRET} ack-mcp-server -- uvx alibabacloud-ack-mcp-server@latest
# Or HTTP:
codex mcp add ack-mcp-server --url <endpoint>
```
Refer to [Codex Add MCP server](https://developers.openai.com/codex/mcp/#configure-with-the-cli)
</details>
<details>
<summary>Gemini CLI</summary>
Install ACK MCP server using Gemini CLI:
```bash
# Either STDIO:
gemini mcp add --scope user --env ACCESS_KEY_ID=${ACCESS_KEY_ID} --env ACCESS_KEY_SECRET=${ACCESS_KEY_SECRET} ack-mcp-server uvx alibabacloud-ack-mcp-server@latest
# Or HTTP:
gemini mcp add --transport http --scope user ack-mcp-server <endpoint>
# Gemini extension:
gemini extensions install --ref master --auto-update https://github.com/aliyun/alibabacloud-ack-mcp-server
# Configure ak/sk through environment variables or `<home>/.gemini/extensions/ack-mcp-server/.env` file
```
Refer to [Gemini CLI Add MCP server](https://github.com/google-gemini/gemini-cli/blob/main/docs/tools/mcp-server.md#how-to-set-up-your-mcp-server)
</details>
<details>
<summary>OpenCode</summary>
Add the following configuration to the `opencode.json` file:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "ack-mcp-server": {
      "type": "local",
      "command": ["uvx", "alibabacloud-ack-mcp-server@latest"],
      "environment": {
        "ACCESS_KEY_ID": "<ak>",
        "ACCESS_KEY_SECRET": "<sk>"
      }
    }
  }
}
```
Refer to [OpenCode Add MCP server](https://opencode.ai/docs/mcp-servers)
</details>
<details>
<summary>Qoder CLI</summary>
Install ACK MCP server using Qoder CLI:
```bash
# Either STDIO:
qodercli mcp add --scope user --env ACCESS_KEY_ID=${ACCESS_KEY_ID} --env ACCESS_KEY_SECRET=${ACCESS_KEY_SECRET} ack-mcp-server -- uvx alibabacloud-ack-mcp-server@latest
# Or HTTP:
qodercli mcp add --transport http --scope user ack-mcp-server <endpoint>
```
Refer to [Qoder CLI Add MCP server](https://docs.qoder.com/cli/using-cli#mcp-servers)
</details>
<details>
<summary>Qwen Code</summary>
Install ACK MCP server using Qwen Code:
```bash
# Either STDIO:
qwen mcp add --scope user --env ACCESS_KEY_ID=${ACCESS_KEY_ID} --env ACCESS_KEY_SECRET=${ACCESS_KEY_SECRET} ack-mcp-server uvx alibabacloud-ack-mcp-server@latest
```
</details>
## 🎯 3. How to Develop and Run Locally
### 💻 3.1 Environment Preparation
**Build Environment Requirements**
- Python 3.12+
- An Alibaba Cloud account with an AccessKey ID, an AccessKey Secret, and the required permission set
- An ACK cluster created in the Alibaba Cloud account
- A kubeconfig that makes the ACK cluster reachable from the local network; see the [configuration method](./DESIGN.md#kubernetes-cluster-access-policy).
- Note: In production, we recommend accessing the cluster over the private network with `KUBECONFIG_MODE = ACK_PRIVATE`. For local testing, use a public-network kubeconfig, which requires [enabling public access to the cluster's API server](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/control-public-access-to-the-api-server-of-a-cluster) in ACK.
### 📋 3.2 Development Environment Setup
```bash
# Clone project
git clone https://github.com/aliyun/alibabacloud-ack-mcp-server
cd alibabacloud-ack-mcp-server
# Install dependencies
uv sync
# Activate virtual environment (Bash)
source .venv/bin/activate
# Configure environment
cp .env.example .env
vim .env
# Run development service
make run
```
**Install Dependencies**
Using `uv` (recommended):
```bash
uv sync
source .venv/bin/activate
```
Or using `pip`:
```bash
pip install -r requirements.txt
```
### ⚙️ 3.3 Configuration Settings
Create `.env` file (refer to `.env.example`):
```env
# Alibaba Cloud credentials and region
ACCESS_KEY_ID=your-access-key-id
ACCESS_KEY_SECRET=your-access-key-secret
# Cache configuration
CACHE_TTL=300
CACHE_MAX_SIZE=1000
# Log configuration
FASTMCP_LOG_LEVEL=INFO
DEVELOPMENT=false
```
> ⚠️ **Note**: If `ACCESS_KEY_ID`/`ACCESS_KEY_SECRET` is not set, features that depend on cloud APIs will not be available.
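The variables above can be read into a typed settings object as sketched below. This is an illustration, not the server's actual configuration loader: the `Settings` class and `load_settings` function are hypothetical, and the fallback values simply mirror the example `.env` rather than confirmed server defaults.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    access_key_id: Optional[str]
    access_key_secret: Optional[str]
    cache_ttl: int
    cache_max_size: int
    log_level: str
    development: bool


def load_settings(env: Mapping[str, str]) -> Settings:
    """Parse settings from a mapping such as os.environ."""
    return Settings(
        # Empty strings are treated as "not set".
        access_key_id=env.get("ACCESS_KEY_ID") or None,
        access_key_secret=env.get("ACCESS_KEY_SECRET") or None,
        # Fallbacks mirror the example .env values (an assumption).
        cache_ttl=int(env.get("CACHE_TTL", "300")),
        cache_max_size=int(env.get("CACHE_MAX_SIZE", "1000")),
        log_level=env.get("FASTMCP_LOG_LEVEL", "INFO").upper(),
        development=env.get("DEVELOPMENT", "false").strip().lower() in {"1", "true", "yes"},
    )


settings = load_settings({"CACHE_TTL": "60", "DEVELOPMENT": "true"})
if settings.access_key_id is None or settings.access_key_secret is None:
    print("Warning: credentials not set; cloud API features will be unavailable")
```

In a real run you would pass `os.environ` after loading the `.env` file with your tool of choice.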
### 3.4 Running Modes
#### 3.4.1 Interactive Interface based on [MCP Inspector](https://github.com/modelcontextprotocol/inspector) (suitable for local effect debugging)
```bash
npx @modelcontextprotocol/inspector --config ./mcp.json
```
#### 3.4.2 Local Python Command Running ack-mcp-server
**Run ack-mcp-server Locally in Stdio Mode (suitable for local development)**
```bash
make run
# or
python -m src.main_server
```
**Run ack-mcp-server Locally in Streamable HTTP Mode (recommended for online system integration)**
```bash
make run-http
# or
python -m src.main_server --transport http --host 127.0.0.1 --port 8000
```
**Run ack-mcp-server Locally in SSE Mode**
```bash
make run-sse
# or
python -m src.main_server --transport sse --host 127.0.0.1 --port 8000
```
**Common Parameters**
| Parameter | Description | Default Value / Accepted Values |
| --- | --- | --- |
| `--access-key-id` | AccessKey ID (AK) of the Alibaba Cloud account | - |
| `--access-key-secret` | AccessKey Secret (SK) of the Alibaba Cloud account | - |
| `--allow-write` | Enable write operations | Not enabled by default |
| `--transport` | Transport mode | `stdio`, `sse`, or `http` |
| `--host` | Bind host | localhost |
| `--port` | Port number | 8000 |
| `--allowed-origins` | Allowed Origin whitelist | None (local mode automatically allows localhost) |
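The flags in the table can be mirrored with `argparse` as in the sketch below, which may help when wrapping the server in your own launcher. It is not the project's actual parser. In particular, `stdio` as the default transport is an assumption inferred from the stdio example above, which passes no `--transport` flag.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the table."""
    parser = argparse.ArgumentParser(prog="ack-mcp-server")
    parser.add_argument("--access-key-id", help="Alibaba Cloud AccessKey ID")
    parser.add_argument("--access-key-secret", help="Alibaba Cloud AccessKey Secret")
    parser.add_argument("--allow-write", action="store_true", help="Enable write operations")
    parser.add_argument("--transport", choices=["stdio", "sse", "http"], default="stdio")
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--allowed-origins", default=None, help="Comma-separated Origin whitelist")
    return parser


args = build_parser().parse_args(
    ["--transport", "http", "--allowed-origins", "http://localhost:3000,https://myapp.example.com"]
)
# Split the comma-separated whitelist into a clean list.
origins = [o.strip() for o in (args.allowed_origins or "").split(",") if o.strip()]
print(args.transport, args.host, args.port, args.allow_write, origins)
```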
### 3.6 Security Considerations
- The service binds to `127.0.0.1` by default and accepts local access only. To expose it to the network, configure an Origin whitelist with the `--allowed-origins` parameter.
- The service includes built-in Origin header verification middleware that complies with the MCP 2025-03-26 specification and defends against DNS rebinding attacks.
- **Note:** When the service binds to a non-localhost address (e.g., `0.0.0.0`) and `--allowed-origins` is not configured, all requests that carry an Origin header are rejected (403 Forbidden). Configure the Origin whitelist with `--allowed-origins` or the `ALLOWED_ORIGINS` environment variable before deploying to production.
- For production deployments, we recommend placing the service behind a reverse proxy, API gateway, or Kubernetes NetworkPolicy to add authentication and network isolation.
- For the complete security guide, see [SECURITY.md](./SECURITY.md).
```bash
# Specify allowed Origin sources
python -m src.main_server --transport http --host 127.0.0.1 --port 8000 --allowed-origins "http://localhost:3000,https://myapp.example.com"
# or through environment variable
export ALLOWED_ORIGINS="http://localhost:3000,https://myapp.example.com"
python -m src.main_server --transport http --host 127.0.0.1 --port 8000
```
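The Origin rules described in this section can be summarized as a small decision function. The sketch below is an approximation for reasoning about configuration, not the server's middleware. Two points are assumptions: requests without an Origin header are allowed through, and "local mode" means the bind host is a loopback name or address.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_origin_allowed(
    origin: Optional[str],
    bind_host: str,
    allowed_origins: Sequence[str],
) -> bool:
    """Decide whether a request's Origin header passes the whitelist rules."""
    if origin is None:
        # Assumption: only requests that carry an Origin header are checked.
        return True
    if origin in allowed_origins:
        return True
    if bind_host in LOCAL_HOSTS:
        # Local mode: localhost origins are allowed automatically.
        return urlsplit(origin).hostname in LOCAL_HOSTS
    # Non-localhost bind without a matching whitelist entry: reject (403).
    return False


print(is_origin_allowed("http://localhost:3000", "127.0.0.1", []))
print(is_origin_allowed("https://myapp.example.com", "0.0.0.0", []))
print(is_origin_allowed("https://myapp.example.com", "0.0.0.0", ["https://myapp.example.com"]))
```

The second call shows the case called out in the note: binding to `0.0.0.0` without a whitelist rejects every request that carries an Origin header.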
### 3.7 Unit Tests
```bash
# Run all unit tests
make test
```
## 🛠️ 4. How to Participate in Community Contributions
### 🏗️ 4.1 Project Architecture Design
**Technology Stack**: Python 3.12+ + FastMCP 2.12.2+ + Alibaba Cloud SDK + Kubernetes Client
For the detailed architecture design, see [`DESIGN.md`](DESIGN.md).
### 👥 4.2 Project Maintenance Mechanism
#### 🤝 How to Contribute
1. **Issue Feedback**: Report through [GitHub Issues](https://github.com/aliyun/alibabacloud-ack-mcp-server/issues)
2. **Feature Request**: Discuss in the DingTalk group (group number: 70080006301)
3. **Code Contribution**: Fork → Feature branch → Pull Request
4. **Documentation Improvement**: API documentation and tutorials
### 💬 Community Exchange
- GitHub Issues: Technical discussion and Q&A
- DingTalk group: Daily communication, Q&A support, and community collaboration. Search for DingTalk group number 70080006301
## 📊 5. Effects & Benchmark (under continuous construction)
### 🔍 Test Scenarios
| Scenario | Description | Involved Modules |
| --- | --- | --- |
| Pod OOM Fix | Memory overflow problem diagnosis and repair | kubectl, diagnosis |
| Cluster Health Check | Comprehensive cluster state inspection | diagnosis, inspection |
| Resource Exception Diagnosis | Abnormal resource root cause analysis | kubectl, diagnosis |
| Historical Resource Analysis | Resource usage trend analysis | prometheus, sls |
### 📊 Effect Data
Based on the latest benchmark results:
- Success rate: 92%
- Average processing time: 4.2 minutes
- Supported AI agents: qwen_code, kubectl-ai
- Supported LLMs: qwen3-coder-plus, qwen3-32b
### How to Run Benchmark
For details, see the [`Benchmark README.md`](./benchmarks/README.md).
```bash
# Run Benchmark
cd benchmarks
./run_benchmark.sh --openai-api-key your-key --agent qwen_code --model qwen3-coder-plus
```
## 🗺️ 6. Evolution Plan & Roadmap
### 🎯 Recent Plan
- Support full life cycle resource maintenance for ACK clusters, nodes, and functional components (addons)
- Use benchmark results as the baseline and keep improving success rates in core O&M scenarios across general third-party agents and LLMs
- Keep adding benchmark cases for core O&M scenarios until most ACK O&M scenarios are covered; please open an issue if you need a specific scenario
- Performance optimization and cache improvement
### 🚀 Medium and Long-term Goals
- Cover the [five pillars of the Well-Architected Framework](https://help.aliyun.com/product/2362200.html) for container scenarios: security, stability, cost, efficiency, and performance (high reliability, etc.), providing a better AIOps experience for multi-step, complex container O&M scenarios.
  - Cluster cost insight and governance
  - Cluster elastic scaling best practices
  - Cluster security vulnerability discovery and governance
  - ……
- Enterprise-level features (RBAC, security scanning)
- AI-driven automated O&M capabilities
## 7. Frequently Asked Questions
- **No AK configured**: Check the `ACCESS_KEY_ID`/`ACCESS_KEY_SECRET` environment variables.
- **ACK cluster network inaccessible**: When ack-mcp-server uses `KUBECONFIG_MODE = ACK_PUBLIC` to access the cluster through a public-network kubeconfig, the ACK cluster must have public network access enabled. In production, we recommend accessing the cluster over the private network in `ACK_PRIVATE` mode, in line with production security best practices.
## 8. Security
- To report a security vulnerability, send an email to **kubernetes-security@service.aliyun.com**. For details, see [SECURITY.md](./SECURITY.md).
## License
Apache-2.0. See [`LICENSE`](LICENSE).