
# AuditLuma - Advanced Code Audit AI System 🔍

<div align="center">

![Version](https://img.shields.io/badge/version-2.0.0-blue)
![License](https://img.shields.io/badge/license-MIT-green)
![Python](https://img.shields.io/badge/python-3.8+-yellow)
![Architecture](https://img.shields.io/badge/architecture-hierarchical_RAG-orange)

</div>

AuditLuma is an intelligent code audit system that adopts an innovative **hierarchical RAG architecture**, combining multiple AI agents and advanced technologies, including Haystack-AI orchestrator, txtai knowledge retrieval, R2R context enhancement, and Self-RAG validation, to provide comprehensive and accurate security analysis for codebases.

## 🌟 Architecture Highlights

- 🏗️ **Hierarchical RAG Architecture** - Four-layer intelligent architecture: Haystack orchestration + txtai retrieval + R2R enhancement + Self-RAG validation
- 🚀 **Haystack-AI Orchestrator** - Intelligent task decomposition and result integration, supporting traditional orchestrator fallback
- 🔍 **Intelligent Knowledge Retrieval** - txtai-driven semantic retrieval and context understanding
- 🎯 **Accurate Validation** - Self-RAG multi-model cross-validation, effectively reducing false positives
- 🔄 **Adaptive Architecture** - Automatically selects the optimal architecture mode based on project size

## ✨ Core Features

### 🏗️ Hierarchical RAG Architecture

- **Haystack Orchestration Layer** - Intelligent task decomposition, parallel execution, and result integration
- **txtai Knowledge Retrieval Layer** - Semantic retrieval and context understanding
- **R2R Context Enhancement Layer** - Dynamic context expansion and association analysis
- **Self-RAG Validation Layer** - Multi-model cross-validation and false positive filtering

### 🚀 Intelligent Orchestration System

- **Haystack-AI Orchestrator** - AI-based intelligent task orchestration (recommended)
- **Traditional Orchestrator** - Rule-driven stable orchestration scheme
- **Automatic Fallback Mechanism** - Automatic switching to traditional orchestrator when AI orchestrator is unavailable
- **Dynamic Architecture Selection** - Automatically selects the optimal architecture based on project size

### 🔍 Advanced Analysis Capabilities

- 🛡️ **Comprehensive Security Analysis** - Comprehensive detection of vulnerabilities and effective repair suggestions
- 🌐 **Cross-File Security Analysis** - Detection of cross-file vulnerabilities that traditional single-file analysis cannot discover
- 📊 **Global Context Construction** - Construction of code call graphs, data flow graphs, and dependency relationships
- 🎯 **Taint Analysis** - Tracking user input propagation paths in code
- 🔄 **MCP (Multi-Agent Cooperation Protocol)** - Enhanced coordination and cooperation between agents

### 🌐 Enterprise-Level Support

- **Multi-LLM Vendor Support** - Supports OpenAI, DeepSeek, MoonShot, Tongyi Qianwen, and other vendors
- **Automatic Vendor Detection** - Automatic identification and configuration of correct vendor API based on model name
- **Asynchronous Parallel Processing** - Using asynchronous concurrency technology to improve performance and accelerate analysis
- **Visualization** - Generation of dependency graphs and detailed security reports

## 📋 Table of Contents

- [Quick Start](#-quick-start)
- [Hierarchical RAG Architecture](#-hierarchical-rag-architecture)
- [Documentation](#-documentation)
- [Installation](#-installation)
- [Usage](#-usage)
- [Configuration](#-configuration)
- [Supported Languages](#-supported-languages)
- [Architecture](#-architecture)
- [Report Format](#-report-format)
- [Contributing](#-contributing)
- [License](#-license)

## 🚀 Quick Start

```bash
# 1. Clone the project
git clone https://github.com/Vistaminc/AuditLuma.git
cd AuditLuma

# 2. Install dependencies
pip install -r requirements.txt

# 3. Analyze using hierarchical RAG architecture (recommended)
python main.py --architecture hierarchical --haystack-orchestrator ai -d ./your-project

# 4. View architecture information
python main.py --show-architecture-info
```

## 🏗️ Hierarchical RAG Architecture

AuditLuma 2.0 introduces an innovative four-layer RAG architecture, significantly improving analysis accuracy and efficiency:

```
┌──────────────────────────────────────────────────────────────┐
│                Hierarchical RAG Architecture                 │
├──────────────────────────────────────────────────────────────┤
│ Layer 1: Haystack Orchestration Layer                        │
│ ├─ Haystack-AI Orchestrator (recommended) - Intelligent task │
│ └─ Traditional Orchestrator - Rule-driven stable scheme      │
├──────────────────────────────────────────────────────────────┤
│ Layer 2: txtai Knowledge Retrieval Layer                     │
│ ├─ Semantic retrieval and similarity matching                │
│ └─ Context understanding and knowledge graph construction    │
├──────────────────────────────────────────────────────────────┤
│ Layer 3: R2R Context Enhancement Layer                       │
│ ├─ Dynamic context expansion                                 │
│ └─ Association analysis and dependency tracking              │
├──────────────────────────────────────────────────────────────┤
│ Layer 4: Self-RAG Validation Layer                           │
│ ├─ Multi-model cross-validation                              │
│ └─ False positive filtering and confidence assessment        │
└──────────────────────────────────────────────────────────────┘
```

### Architecture Advantages

- **🎯 Accuracy Improvement** - Four-layer verification mechanism, significantly reducing false positives
- **⚡ Performance Optimization** - Intelligent caching and parallel processing, improving analysis speed
- **🔄 Adaptability** - Automatically selects the optimal configuration based on project size
- **🛡️ Reliability** - Multiple fallback mechanisms, ensuring stable system operation

## 📚 Documentation

### 🚀 Getting Started

- [Installation Guide](./docs/installation-guide.md) - Detailed installation steps and environment configuration
- [User Guide](./docs/user-guide.md) - Complete usage tutorial from beginner to advanced
- [Quick Reference](./docs/quick-reference.md) - Common commands and configuration quick reference manual

### 🏗️ Core Documentation

- [Hierarchical RAG Architecture Guide](./docs/hierarchical-rag-guide.md) - Detailed hierarchical RAG architecture description and usage guide
- [Configuration Reference](./docs/configuration-reference.md) - Complete configuration options and parameter description
- [Best Practices](./docs/best-practices.md) - Usage suggestions, performance optimization, and security configuration

### 🔧 Technical Documentation

- [Architecture Design](./docs/architecture-design.md) - System architecture and design philosophy
- [Troubleshooting Guide](./docs/troubleshooting.md) - Common issues, error diagnosis, and solutions
- [Project Structure](./项目结构.md) - Detailed project directory structure and module description

### 📖 Online Resources

- [AuditLuma Documentation](https://iwt6omodfh0.feishu.cn/drive/folder/OwWqf7EYblaqTNdaDbtcnQcHnTt) - Complete online documentation and tutorials

## 🚀 Installation

Clone the repository and install dependencies:

```bash
git clone https://github.com/Vistaminc/AuditLuma.git
cd AuditLuma
pip install -r requirements.txt
```

### Optional Dependencies

**FAISS Vector Retrieval Library**

By default, AuditLuma uses a simple built-in vector storage implementation. For large codebases, it's recommended to install FAISS for better performance:

```bash
# CPU version
pip install faiss-cpu

# GPU version (supports CUDA)
pip install faiss-gpu
```

After installing FAISS, the system will automatically detect and use it for vector storage and retrieval, significantly improving performance for large projects.
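The automatic detection described above could work along the following lines. This is a minimal sketch under stated assumptions, not AuditLuma's actual code: `select_vector_backend` and the two backend labels are hypothetical names. The only real API it relies on is the standard library's `importlib.util.find_spec`, which reports whether the `faiss` module (installed by either `faiss-cpu` or `faiss-gpu`) can be imported.

```python
import importlib.util


def select_vector_backend() -> str:
    """Return "faiss" when the faiss module is installed, else "builtin".

    Hypothetical helper for illustration; the labels are not AuditLuma names.
    """
    # find_spec returns None when the top-level module cannot be located.
    if importlib.util.find_spec("faiss") is not None:
        return "faiss"
    return "builtin"


if __name__ == "__main__":
    print(f"Vector storage backend: {select_vector_backend()}")
```

Probing with `find_spec` avoids importing FAISS just to check that it exists, so the check stays cheap on machines where it is not installed.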
## 🛠 Usage

### Basic Usage

```bash
# Use hierarchical RAG architecture (recommended)
python main.py --architecture hierarchical -d ./your-project -o ./reports

# Use Haystack-AI Orchestrator (default, recommended)
python main.py --architecture hierarchical --haystack-orchestrator ai -d ./your-project

# Use traditional orchestrator
python main.py --architecture hierarchical --haystack-orchestrator traditional -d ./your-project

# Automatic architecture selection (based on project size)
python main.py --architecture auto -d ./your-project

# Traditional RAG architecture (backward compatibility)
python main.py --architecture traditional -d ./your-project
```

### Advanced Usage

```bash
# Enable performance comparison mode
python main.py --architecture hierarchical --enable-performance-comparison -d ./your-project

# View architecture information and configuration
python main.py --show-architecture-info

# Configuration migration (from traditional configuration to hierarchical RAG)
python main.py --config-migrate

# AI-enhanced cross-file analysis
python main.py --architecture hierarchical --enhanced-analysis -d ./your-project
```

### Command-Line Parameters

#### Basic Parameters

| Parameter | Description | Default Value |
|------|------|--------|
| `-d, --directory` | Target project directory | `./goalfile` |
| `-o, --output` | Report output directory | `./reports` |
| `-w, --workers` | Parallel worker threads | Configured `max_batch_size` |
| `-f, --format` | Report format (html/pdf/json) | Configured `report_format` |

#### Architecture Selection Parameters

| Parameter | Description | Default Value |
|------|------|--------|
| `--architecture` | RAG architecture mode (traditional/hierarchical/auto) | `auto` |
| `--haystack-orchestrator` | Haystack orchestrator type (traditional/ai) | `ai` |
| `--force-traditional` | Force traditional RAG architecture | - |
| `--force-hierarchical` | Force hierarchical RAG architecture | - |
| `--enable-performance-comparison` | Enable performance comparison mode | - |
| `--auto-switch-threshold` | Automatic switching threshold for file count | `100` |

#### Hierarchical RAG Specific Parameters

| Parameter | Description | Default Value |
|------|------|--------|
| `--enable-txtai` | Enable txtai knowledge retrieval layer | - |
| `--enable-r2r` | Enable R2R context enhancement layer | - |
| `--enable-self-rag-validation` | Enable Self-RAG validation layer | - |
| `--disable-caching` | Disable hierarchical caching system | - |
| `--disable-monitoring` | Disable performance monitoring | - |

#### Traditional Function Parameters

| Parameter | Description | Default Value |
|------|------|--------|
| `--no-mcp` | Disable multi-agent cooperation protocol | Enabled by default |
| `--no-self-rag` | Disable Self-RAG retrieval | Enabled by default |
| `--no-deps` | Skip dependency analysis | Not skipped by default |
| `--no-remediation` | Skip generating repair suggestions | Not skipped by default |
| `--no-cross-file` | Disable cross-file vulnerability detection | Enabled by default |
| `--enhanced-analysis` | Enable AI-enhanced cross-file analysis | Disabled by default |

#### Other Parameters

| Parameter | Description | Default Value |
|------|------|--------|
| `--verbose` | Enable detailed logging | Disabled by default |
| `--dry-run` | Dry run mode (no actual analysis) | - |
| `--config-migrate` | Migrate configuration to hierarchical RAG format | - |
| `--show-architecture-info` | Display current architecture information and exit | - |

## ⚙️ Configuration

Configure the system by editing the `config/config.yaml` file. AuditLuma 2.0 supports hierarchical RAG architecture configuration.
### Hierarchical RAG Configuration

```yaml
# Hierarchical RAG architecture model configuration
hierarchical_rag_models:
  # Whether to enable hierarchical RAG architecture
  enabled: true

  # Haystack orchestration layer configuration
  haystack:
    # Orchestrator type selection: traditional (traditional) or ai (Haystack-AI, recommended)
    orchestrator_type: "ai"  # Default uses Haystack-AI orchestrator
    # Default model (supports model@provider format)
    default_model: "qwen3:32b@ollama"
    # Task-specific model configuration
    task_models:
      security_scan: "gpt-4@openai"  # Security scan uses a stronger model
      syntax_check: "deepseek-chat@deepseek"  # Syntax check
      logic_analysis: "qwen-turbo@qwen"  # Logic analysis
      dependency_analysis: "gpt-3.5-turbo@openai"  # Dependency analysis

  # txtai knowledge retrieval layer model configuration
  txtai:
    retrieval_model: "gpt-3.5-turbo@openai"  # Knowledge retrieval model
    embedding_model: "text-embedding-ada-002@openai"  # Embedding model

  # R2R context enhancement layer model configuration
  r2r:
    context_model: "gpt-3.5-turbo@openai"  # Context analysis model
    enhancement_model: "gpt-3.5-turbo@openai"  # Enhancement model

  # Self-RAG validation layer model configuration
  self_rag_validation:
    validation_model: "gpt-3.5-turbo@openai"  # Main validation model
    cross_validation_models:  # Multiple models used for cross-validation
      - "gpt-4@openai"
      - "deepseek-chat@deepseek"
      - "gpt-3.5-turbo@openai"
```

### Model Specification Format

AuditLuma supports a unified model specification format `model@provider` to specify models and providers:

```
deepseek-chat@deepseek  # Specify DeepSeek provider's deepseek-chat model
gpt-4-turbo@openai      # Specify OpenAI provider's gpt-4-turbo model
qwen-turbo@qwen         # Specify Tongyi Qianwen provider's qwen-turbo model
```

If no provider is specified (not using the @ symbol), the system will automatically infer the provider based on the model name.
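To make the `model@provider` rule concrete, here is a minimal sketch of how such a specification might be parsed. It is an illustration only, not AuditLuma's implementation: `parse_model_spec` and `PREFIX_TO_PROVIDER` are hypothetical names, the prefixes are taken from the table in the Multi-Vendor Support section, and the lowercase provider keys for Zhipu AI and Baichuan are assumptions.

```python
from typing import Optional, Tuple

# Prefix-to-vendor pairs from the Multi-Vendor Support table.
# The provider keys "zhipu" and "baichuan" are assumed spellings.
PREFIX_TO_PROVIDER = [
    ("gpt-", "openai"),
    ("deepseek-", "deepseek"),
    ("qwen-", "qwen"),
    ("glm-", "zhipu"),
    ("chatglm", "zhipu"),
    ("baichuan", "baichuan"),
    ("ollama-", "ollama"),
]


def parse_model_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split "model@provider"; infer the provider from the prefix if absent."""
    # rpartition splits on the last "@", so model names containing ":" are safe.
    model, sep, provider = spec.rpartition("@")
    if sep:
        return model, provider
    # No "@" present: fall back to prefix-based inference.
    for prefix, inferred in PREFIX_TO_PROVIDER:
        if spec.startswith(prefix):
            return spec, inferred
    # Unknown prefix: leave the provider undecided.
    return spec, None


if __name__ == "__main__":
    for example in ("qwen3:32b@ollama", "deepseek-chat", "my-local-model"):
        print(example, "->", parse_model_spec(example))
```

An explicit `@provider` always wins in this sketch, which is why `qwen3:32b@ollama` resolves to Ollama even though no prefix in the table matches `qwen3`.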
### Architecture Selection Configuration

```yaml
# Global settings
global:
  # Default architecture mode: traditional, hierarchical, auto
  default_architecture: "hierarchical"
  # Automatic switching threshold (file count)
  auto_switch_threshold: 100
  # Whether to enable performance comparison
  enable_performance_comparison: false
```

### Multi-Vendor Support

AuditLuma supports multiple LLM vendors and can automatically detect vendors based on model names:

| Model Prefix | Vendor |
|---------|------|
| `gpt-` | OpenAI |
| `deepseek-` | DeepSeek |
| `qwen-` | Tongyi Qianwen |
| `glm-` or `chatglm` | Zhipu AI |
| `baichuan` | Baichuan |
| `ollama-` | Ollama |

Note: the OpenAI vendor setting also works with any relay (transit) platform that exposes an OpenAI-format API.

## 💻 Supported Languages

AuditLuma supports analysis of the following programming languages:

### Main Languages (including top 10)

- Python (.py)
- JavaScript (.js, .jsx)
- TypeScript (.ts, .tsx)
- Java (.java)
- C# (.cs)
- C++ (.cpp, .cc, .hpp)
- C (.c, .h)
- Go (.go)
- Ruby (.rb)
- PHP (.php)
- Lua (.lua)

### Other Supported Languages

- Rust (.rs)
- Swift (.swift)
- Kotlin (.kt)
- Scala (.scala)
- Dart (.dart)
- Bash (.sh, .bash)
- PowerShell (.ps1, .psm1)

### Markup and Configuration Languages

- HTML (.html, .htm)
- CSS (.css)
- JSON (.json)
- XML (.xml)
- YAML (.yml, .yaml)
- SQL (.sql)

## 🏛 Architecture

AuditLuma uses a multi-agent architecture, consisting of the following components:

![Architecture](https://via.placeholder.com/800x400?text=AuditLuma+Architecture)

1. **Agent Orchestrator** - Coordinates all agents in the workflow
2. **Code Analysis Agent** - Analyzes code structure and extracts dependencies
3. **Security Analysis Agent** - Identifies security vulnerabilities
4. **Remediation Suggestion Agent** - Generates targeted vulnerability repair plans
5. **Visualization Component** - Generates intuitive reports and dependency graphs

## 📊 Report Formats

AuditLuma supports the following report formats:

- 📋 **HTML Report** - Includes vulnerability details, statistics charts, and interactive visualizations
- 📄 **PDF Report** - Suitable for printing and sharing
- 🔄 **JSON Report** - Machine-readable format suitable for further processing and integration

## 💬 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Create a Pull Request

## 📞 Communication

- QQ: 1047736593

## 🤝 Partners

- [Cotton Candy Network Security Circle](https://vip.bdziyi.com/?ref=711)

## Support and Appreciation

If you find AuditLuma helpful, please consider supporting us:

- Your sponsorship will help us keep improving AuditLuma!

<div style="display: flex; justify-content: space-between; max-width: 600px; margin: 0 auto;">
  <div style="flex: 1; margin-right: 20px;">
    <img src="https://github.com/Vistaminc/Miniluma/blob/main/ui/web/static/img/zanshang/wechat.jpg"/>
  </div>
  <div style="flex: 1;">
    <img src="https://github.com/Vistaminc/Miniluma/blob/main/ui/web/static/img/zanshang/zfb.jpg"/>
  </div>
</div>

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=vistaminc/Auditluma&type=Date)](https://www.star-history.com/#)

## 📜 License

MIT

---

<div align="center">
  <sub>Built with ❤️ by AuditLuma Team</sub>
</div>
