# Vectorworks RAG + MCP

- Fetches Vectorworks' Python / VectorScript documentation locally and searches across all sources with FAISS.
- Provides `/search`, `/answer`, and `/get` via FastAPI, so you can search and check sources from a simple Web UI.
- Also starts an MCP server over WebSocket (JSON-RPC 2.0) that provides the `vw.search` / `vw.answer` / `vw.get` tools.
- Starts `app` (API/MCP) and `db` (Postgres 16) together with docker compose.

---

## Requirements

- Docker / Docker Compose (v2)

## Quick Start

1. Build
   - `docker compose build`
2. Fetch Documents
   - `docker compose run --rm app bash scripts/fetch_docs_minimal.sh`
   - `docker compose run --rm -e GITHUB_TOKEN="$GITHUB_TOKEN" app bash scripts/fetch_github_vectorworks.sh`
3. Generate Vectors (Embedding + FAISS)
   - `docker compose run --rm app python -m app.indexer`
   - If you add documents when an index already exists, rebuild it:
     - `docker compose run --rm app python -m app.indexer --rebuild`
4. Start (UI + MCP + Postgres)
   - `docker compose up`
5. Access
   - UI: `http://localhost:8000`
   - MCP: `ws://localhost:8765`

Note

- Steps 2 and 3 can also be run as a single one-liner:
  - `docker compose run --rm app bash -lc 'scripts/fetch_docs_minimal.sh && python -m app.indexer'`

## Fetching Documents

The following command fetches the minimum required documents (md/html only) to `data/`.

- Dependencies: `git`, `curl`
- Execution (inside container):
  - `docker compose run --rm app bash scripts/fetch_docs_minimal.sh`
- If a 403 error occurs due to network issues, the script skips the affected page and continues.
- You can override the UA if necessary: `docker compose run --rm -e UA="Mozilla/5.0 ..." app bash scripts/fetch_docs_minimal.sh`

Target (key points only)

- GitHub: Vectorworks/developer-scripting (Introduction/Guidance Markdown)
- App Help: Basic Scripting Guidance (Key pages for 2022/2023/2024)
- Japanese site: VectorScript function index page + example page
- Developer Wiki: VS Function Reference category index (HTML)

Note

- The indexer only supports `.md`, `.markdown`, `.html`, `.htm`, and `.txt`. PDF and other formats are not supported.

## GitHub (Fetch Only Specified Vectorworks Repositories)

- Purpose: Clones (or updates) only the following three repositories to `data/github/vectorworks/`.
  - `Vectorworks/developer-scripting`
  - `Vectorworks/developer-sdk`
  - `Vectorworks/developer-worksheets`
- Execution (inside container):
  - `docker compose run --rm app bash scripts/fetch_github_vectorworks.sh`
- Destination: `data/github/vectorworks/<repo>`
- To add or change repositories, set the environment variable `REPOS` (space or comma separated).
  - Example: `docker compose run --rm -e REPOS="Vectorworks/developer-scripting,Vectorworks/developer-sdk" app bash scripts/fetch_github_vectorworks.sh`
- Update method (optional): `UPDATE_MODE=pull` (default) or `UPDATE_MODE=reset`
  - `pull`: Fast update by `git pull --ff-only --depth=1 --prune` (fails if there are local changes; on failure it falls back to fetch + reset)
  - `reset`: Full synchronization by `fetch --depth=1` followed by `reset --hard` (discards local changes)

Note

- The indexer only targets text-based extensions (.md/.html/.txt etc.).

## Vector Generation (Embedding + FAISS Index)

- Execution (inside container):
  - `docker compose run --rm app python -m app.indexer`

After execution, `vw.faiss` and `meta.jsonl` are created in `index/`. (Make sure the documents are in `data/` beforehand, for example by running `bash scripts/fetch_docs_minimal.sh`.)

## Start (UI + MCP + Postgres)

- `docker compose up --build`
- UI: `http://localhost:8000`
- MCP: `ws://localhost:8765`
- Postgres is started on the internal container network (not published to the host)

## API Examples

- Search: `GET /search?q=PushAttrs&k=6`
  - Example: `curl -s "http://localhost:8000/search?q=VectorScript&k=6" | jq .`
- Answer (draft): `GET /answer?q=...&k=6`
  - Example: `curl -s "http://localhost:8000/answer?q=record+format" | jq .`
- Get Chunk: `GET /get?doc_id=...&chunk_id=...`

The UI is at `GET /`, and its search form returns the equivalent results.

## MCP (Model Context Protocol)

- Connect from VS Code: Add MCP with the following command
  - `code --add-mcp '{"name":"vw_docs_local","url":"ws://localhost:8765"}'`
- Supported tools
  - `vw.search({ query, k? })`
  - `vw.answer({ query, k? })`
  - `vw.get({ doc_id, chunk_id })`

The implementation is a JSON-RPC 2.0 based WebSocket server (`app/mcp_server.py`).

## Directory Structure

- `app/` App body
  - `api.py` FastAPI app
  - `mcp_server.py` MCP (WebSocket) server
  - `indexer.py` Import/Vectorization (Embedding + FAISS creation)
  - `search.py` Core logic for search/answer
  - `chunking.py` Chunk splitting (targets a character length equivalent to roughly 700 tokens)
  - `templates/` Web UI template
- `data/` Original md/html (`doc_id` is the relative path)
- `index/` FAISS and metadata (`vw.faiss`, `meta.jsonl`)

## Environment Variables (Optional)

- `DATA_DIR` Data placement directory (default: `data`)
- `INDEX_DIR` Index output directory (default: `index`)
- `EMBED_MODEL` Embedding model (default: `sentence-transformers/all-MiniLM-L6-v2`)
- `CHUNK_CHARS` Target chunk length in characters (default: `2800` ≈ 700 tokens)
- `CHUNK_OVERLAP` Chunk overlap in characters (default: `480`)
- `API_HOST` / `API_PORT` FastAPI binding (default: `0.0.0.0:8000`)
- `MCP_HOST` / `MCP_PORT` MCP binding (default: `0.0.0.0:8765`)

Postgres (for future use)

- `PGHOST` / `PGPORT` / `PGDATABASE` / `PGUSER` / `PGPASSWORD`

## Operational Notes

- If you update the documents, generate the vectors again: `python -m app.indexer`
- The first run may take some time because the embedding model has to be downloaded.
- Since execution on the CPU is assumed, FAISS uses `IndexFlatIP` with normalized vectors (equivalent to cosine similarity).

## License / Notes

- This repository does not include documents. Place them in `data/` in accordance with the applicable terms of use and copyright.
- `answer` is a simple draft based on excerpts. Check the original text before making a final decision.
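
The API examples above use `curl`. As a minimal sketch, the same request URLs can be built in Python with the standard library. The helper names `build_search_url` and `build_get_url` are hypothetical and not part of this repository, and the sketch assumes the API is reachable at `http://localhost:8000` and that `chunk_id` can be passed as a plain value.

```python
from urllib.parse import urlencode

# Address taken from the Quick Start; adjust if API_HOST / API_PORT differ.
BASE_URL = "http://localhost:8000"


def build_search_url(query: str, k: int = 6, base_url: str = BASE_URL) -> str:
    """Build a GET /search URL (hypothetical helper, not part of the repo)."""
    return f"{base_url}/search?{urlencode({'q': query, 'k': k})}"


def build_get_url(doc_id: str, chunk_id, base_url: str = BASE_URL) -> str:
    """Build a GET /get URL; doc_id is the path relative to data/."""
    params = urlencode({"doc_id": doc_id, "chunk_id": chunk_id})
    return f"{base_url}/get?{params}"


if __name__ == "__main__":
    # Spaces are encoded as "+", matching the curl example for /answer.
    print(build_search_url("record format"))
```

The resulting string can be passed to `urllib.request.urlopen` or any HTTP client once the containers are running.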
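
To illustrate what a client might send to the MCP server, the sketch below builds a JSON-RPC 2.0 request for the `vw.search` tool. It assumes the server accepts the standard MCP `tools/call` method with `name` and `arguments` parameters. The README does not confirm the exact message shape expected by `app/mcp_server.py`, so treat the method name and the helper `make_tool_call` as assumptions. An MCP session normally also begins with an `initialize` exchange, which is omitted here.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC 2.0 requires an id per request.
_request_ids = count(1)


def make_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for an MCP tool call.

    Assumption: the server follows the MCP "tools/call" convention.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = make_tool_call("vw.search", {"query": "PushAttrs", "k": 6})
    # This JSON text is what would be sent as one WebSocket message
    # to ws://localhost:8765.
    print(json.dumps(request))
```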
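
The `CHUNK_CHARS` and `CHUNK_OVERLAP` settings describe fixed-size chunks that share some characters with their neighbors. The following is a minimal sliding-window sketch of that idea using the documented defaults. It is not the code in `app/chunking.py`, which may split on different boundaries; the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_chars: int = 2800, overlap: int = 480) -> list[str]:
    """Split text into windows of chunk_chars that overlap by `overlap` characters.

    Illustrative only; the real chunker may respect headings or sentences.
    """
    if chunk_chars <= 0 or not 0 <= overlap < chunk_chars:
        raise ValueError("need chunk_chars > 0 and 0 <= overlap < chunk_chars")

    step = chunk_chars - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_chars])
        if start + chunk_chars >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(6000))
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

With the defaults, each window advances by 2800 - 480 = 2320 characters, so the last 480 characters of one chunk reappear at the start of the next.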
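
The operational notes state that `IndexFlatIP` with normalized vectors is equivalent to cosine similarity. The dependency-free sketch below shows why: once two vectors are scaled to unit length, their inner product equals their cosine similarity. It uses plain Python lists rather than FAISS, and the helper names are illustrative.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def inner_product(a: list[float], b: list[float]) -> float:
    """Plain dot product, the score an inner-product index ranks by."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed directly from the definition."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    a, b = [3.0, 4.0], [4.0, 3.0]
    print(cosine_similarity(a, b))
    print(inner_product(l2_normalize(a), l2_normalize(b)))
```

Both printed values agree up to floating-point rounding, which is why normalizing embeddings before indexing lets an inner-product index rank by cosine similarity.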
