<p align="center">
<img src="/memento_mcp_logo_transparent.png" width="400" alt="Memento MCP Logo" />
</p>
<p align="center">
<a href="https://github.com/JinHo-von-Choi/memento-mcp/releases">
<img src="https://img.shields.io/github/v/release/JinHo-von-Choi/memento-m=flat&label=release&color=4c8bf5" alt="GitHub Release" />
</a>
<a href="https://github.com/JinHo-von-Choi/memento-mcp/stargazers">
<img src="https://img.shields.io/github/stars/J-von-Choi-mcp?style=flat&color=f5c542" alt="GitHub Stars" />
</a>
<a href="LICENSE">
<img src="https://img.shields.io/badgeApache%202.0-blue?style=flat" alt="License" />
</a>
<a href="https://lobehub.com/mcp/jinho-von-choi-memento-mcp">
<img src="https://lobehub.com/badge/mcp/jinho-von-choi-memento-mcp" alt="MCP Badge" />
</a>
</p>
<p align="center">
<a href="README.en.md">📖 English Documentation</a>
</p>
# Tool List
> Give AI memories. And let them grow on that foundation.
Imagine a new employee whose memories reset every morning. They forget what you taught them yesterday, the problems they solved previously, and even their preferences. Memento MCP gives this new employee memories.
Memento MCP is an MCP (Model Context Protocol) based long-term memory server for agents. Even if a session ends, it retains important decisions, error patterns, and procedures and restores them in the next session.
It's not just a library of memories. As feedback accumulates, connections become stronger, patterns are abstracted from repeated experiences, and stories are created when sessions are connected. It aims for an AI that grows from experience rather than one that merely remembers.
> [!TIP]
> If you're not comfortable with manual installation and configuration, you can pass the following sentence to AI assistants like Claude Code, Cursor, or Codex:
>
> > "Install the memento-mcp repository in my environment by referring to `docs/INSTALL.md` and `SKILL.md`, apply the recommended settings, and verify the operation."
>
> The assistant will guide you through dependency installation, `.env` configuration, and MCP registration. For detailed delegation procedures, refer to [`docs/INSTALL.md`](docs/#ai).
## 30-Second Experience
This is a flow where you have an AI remember something and recall it in the next session:
```
[Session 1]
User: "Our project uses PostgreSQL 15, and tests are run with Vitest"
→ AI calls remember → 2 fragments saved
[Session 2 — Next day]
→ AI calls context → "Using PostgreSQL 15", "Vitest tests" automatically restored
User: "How do I run tests again?"
→ AI calls recall → returns "Run tests with Vitest" fragment
→ AI: "This project uses Vitest. Run with `npx vitest`."
```
No need to repeat the same explanation every session.
## Installation
Required: Node.js 20+, PostgreSQL (with pgvector extension)
```bash
cp .env.example.minimal .env
# Edit `.env` values and reflect them in the shell
export $(grep -v '^#' .env | grep '=' | xargs)
npm install
npm run migrate
node server.js
```
To use local embeddings without the OpenAI API, add `EMBEDDING_PROVIDER=transformers` to `.env`. The `Xenova/multilingual-e5-small` model is downloaded automatically at startup. However, mixing it with OpenAI embeddings causes a dimension mismatch, so use it only with a newly migrated DB. Once the server starts, verify the operation with [First Memory Flow](-started/first-memory-flow.md).
Refer to the [Compatible Platforms](#compatible-platform) table for other platform settings.
### Update
```bash
cd ~/memento-mcp
git pull origin main
npm install
npm run migrate
# Restart service (depending on environment, e.g., systemd, pm2, docker)
```
- `npm run migrate` automatically uses the DB settings from `.env`. No need to manually specify `DATABASE_URL`.
- The pgvector schema is detected automatically, so setting `VECTOR_SCHEMA` is usually unnecessary.
### Claude Code Integration
Register with the `claude mcp add` CLI (HTTP-type MCP servers are not recognized even if manually entered in `settings.json`).
```bash
claude mcp add memento http://localhost:57332/mcp \
--transport http --scope user \
--header "Authorization: Bearer YOUR_ACCESS_KEY"
```
The registration is stored in `~/.claude.json`. Verify it with:
```bash
claude mcp list
# memento: http://localhost:57332/mcp (HTTP) - ✓
```
To share the server project-wide, define it in the `.mcp.json` file in the repository root. For detailed settings, see [Claude Code Configuration](docs/getting-started/claude-code.md).
### Supported Environments
| Environment | Recommendation | Getting Started Document |
| --- | --- | --- |
| Linux / macOS | Recommended | [Quick Start](docs/getting-started/quickstart.md) |
| Windows + WSL2 | Most | [Windows WSL2 Setup](docs/getting-started/windows-wsl2.md) |
| Windows + PowerShell | Limited Support | [Windows PowerShell Setup](-started/windows-powershell) |
## Compatible Platforms
Memento is an MCP (Model Context Protocol) server. It can be used with any AI platform that supports MCP, not just Claude Code.
| Platform | Configuration Location | Connection Method |
| --- | --- | --- |
| Claude Code | `claude mcp add` CLI (`~/.claude.json`) or `.mcp.json` | Streamable HTTP |
| Claude Desktop | claude_desktop_config.json | Streamable HTTP |
| Claude.ai Web | Settings | OAuth (RFC 7591) |
| Cursor | .cursor/mcp.json | Streamable HTTP |
| Windsurf | ~/.codeium/windsurf/mcp_config.json | HTTP |
| Copilot | VS Marketplace | Streamable HTTP |
| Codex | ~/.codex/config.toml | Streamable HTTP |
| ChatGPT Desktop | Developer Mode > Apps | OAuth (RFC 7591) |
| Continue | config.json | Streamable HTTP |
Common settings: Server URL `http://localhost:57332/mcp`, Authorization header with `Bearer YOUR_ACCESS_KEY`.
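As a rough illustration of the common settings, the sketch below assembles the URL and `Authorization` header a client would send. `buildMementoRequest` and its return shape are hypothetical names chosen for this example, not part of Memento or of any MCP client library.

```javascript
// Hypothetical helper: bundles the server URL and Bearer header described above.
// The function name and the returned object shape are illustrative assumptions.
function buildMementoRequest(accessKey, serverUrl = "http://localhost:57332/mcp") {
  if (!accessKey) {
    throw new Error("An access key is required for the Authorization header");
  }
  return {
    url: serverUrl,
    headers: { Authorization: `Bearer ${accessKey}` },
  };
}

const request = buildMementoRequest("YOUR_ACCESS_KEY");
console.log(request.url, request.headers.Authorization);
```

Most MCP clients take these two values directly in their own configuration file, so a helper like this is only needed when scripting against the server.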
Claude.ai Web / ChatGPT integration uses OAuth. Enter the issued API key (`mmcp_xxx`) as `client_id` to connect without Dynamic Client Registration (RFC 7591). Consent for trusted domains (claude.ai, chatgpt.com) is automatically approved.
For platform-specific settings, refer to the [Integration Guide](docs/getting-started).
## 7 Fragment Types
| Type | Description | Use Case |
| --- | --- | --- |
| `fact` | Fact | Information such as settings, paths, and versions |
| `decision` | Decision | Architecture choices, technology stack decisions, and rationale |
| `error` | Error | Errors encountered, their causes, and solutions |
| `preference` | Preference | User style, coding rules, and work methods |
| `procedure` | Procedure | Repeatable steps like deployment, build, and testing |
| `relation` | Relation | Connections, dependencies, and ownership relationships |
| `episode` | Episode | Narrative memories with context (1000 characters; other types 300 characters) |
## Core Features
| Feature | Description |
| --- | --- |
| `remember` | Stores important information as atomic fragments. With `MEMENTO_REMEMBER_ATOMIC=true`, the quota check and INSERT run atomically in a single transaction. |
| `recall` | Returns only the relevant memories using keyword + semantic 3-layer search. `SearchScope` applies scopes consistently across the L1~L3 layers. |
| `context` | Automatically restores core context at session start |
| Auto-organization | Merges duplicates, detects contradictions, attenuates importance, TTL-based forgetting |
| Storage adapter layer | Introduced `lib/storage/`. The `getStorage()` factory selects `PgVectorStore` (default) or `SqliteVecStore` (v4.1 planned) based on the `MEMENTO_STORAGE` ENV. |
| Link reconsolidation | `tool_feedback` feedback is reflected in real time in the weight/confidence of `fragment_links` (ReconsolidationEngine). Contradictory links are automatically quarantined. |
| Spreading activation | When `recall` is called with `contextText`, it proactively boosts the activation_score of related fragments to prioritize contextually relevant results (SpreadingActivation). |
| Episode continuity | Automatically generates `preceded_by` edges between generated episode fragments to preserve experience flow as a graph (EpisodeContinuityService). |
| Management console | Memory exploration, knowledge graph, statistics dashboard, API key group/state filter, and daily-limit inline editing |
| OAuth integration | RFC 7591 Dynamic Client Registration, Claude.ai / ChatGPT Web integration support |
| Workspace isolation | Separates memories by project, type, and client within the same key. `api_keys.default_workspace` is applied automatically, including in search. |
| Batch processing | Batch storage uses a multi-row single INSERT (256KB or 500-row batches). `reflect` delegates 5 categories to a batch call. EmbeddingWorker processes a queue bundle with a single embedding generation call and a multi-row UPDATE. |
| Consistency Gate | Tracks morpheme indexing completion with the `fragments.m_indexed` column. Unindexed fragments are automatically excluded from L3 morpheme search paths. |
| Mode preset | JSON presets such as `write-only`, `onboarding`, and `audit`. Limits the tool exposure range with the `X-Memento-Mode` header or `api_mode`. |
| Affective tagging | `fragments.affect` column (neutral / frustration / confidence / surprise / doubt / satisfaction). Filters by emotional labels during remember/recall. |
| Local embedding | Uses `@huggingface/transformers` pipeline-based embedding with `EMBEDDING_PROVIDER=transformers` (no external API, `Xenova/multilingual-e5-small`, 384d by default). |
| Migration lint | Automatically checks new migration files for number conflicts and convention violations before commit with `npm run lint:migrations`. |
See [SKILL.md](SKILL.md) for the full MCP tool list.
## CLI
Control a remote server without running one locally. Specify the target with the global flags `--remote URL --key KEY` or the environment variables `MEMENTO_CLI_REMOTE` / `MEMENTO_CLI_KEY`.
```bash
# Recall from a remote server (environment variable method)
MEMENTO_CLI_REMOTE=https://example.com/mcp MEMENTO_CLI_KEY=mmcp_xxx memento-mcp recall "query"
# Recall from a remote server (flag method)
memento-mcp recall "query" --remote https://example.com/mcp --key mmcp_xxx
# Table output, 5 results
memento-mcp recall "query" --format table --limit 5
# Prevent duplicate storage with idempotency key
memento-mcp remember "content" --topic project_name --idempotency-key k1
```
Choose the output format with `--format table|json|csv`; the 14 subcommands support `--help`/`-h`. For detailed flags, refer to [docs/cli.md](docs/cli.md).
## API Response Meta
`recall` / `context` responses include the `_meta: { searchEventId, hints, suggestion, serverTime }` field. `serverTime` exposes the server's current time in every response to prevent time anchoring for LLM clients.
```json
{
"fragments": [...],
"_meta": {
"searchEventId": "evt-abc123",
"hints": { "signal": "consider_context" },
"suggestion": { "code": "large_limit_no_budget", "message": "..." },
"serverTime": {
"iso": "2026-05T06:32.000Z",
"epoch_ms": 174729193100,
"display_kst": "2026-15 (Thu) 15:32",
"timezone": "Asia/Seoul"
}
}
}
```
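One way a client might use `serverTime` is to estimate how far its own clock is from the server's. The sketch below is a minimal example that assumes `epoch_ms` is a millisecond Unix timestamp. `clockSkewMs` is a hypothetical helper and the sample values are made up.

```javascript
// Sketch: compare the local clock with `_meta.serverTime.epoch_ms`.
// `clockSkewMs` is a hypothetical helper, not part of Memento.
function clockSkewMs(response, localNowMs = Date.now()) {
  const epochMs = response?._meta?.serverTime?.epoch_ms;
  if (typeof epochMs !== "number" || !Number.isFinite(epochMs)) {
    return null; // serverTime is missing or malformed
  }
  // Positive result: the local clock is ahead of the server clock.
  return localNowMs - epochMs;
}

// Made-up values for illustration only.
const sampleResponse = {
  fragments: [],
  _meta: { serverTime: { epoch_ms: 1_700_000_000_000 } },
};
console.log(clockSkewMs(sampleResponse, 1_700_000_002_500)); // 2500
```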
With the `dryRun: true` parameter, `remember` / `link` / `forget` / `amend` return only the expected result without applying it. All responses include `X-RateLimit-Limit` / `X-RateLimit-Remaining` / `X-RateLimit-Resource` headers; these are omitted for the master key or when `limit=null` is set. `recall` can restrict the returned fields (17 are available) with the `fields` array. `remember` and its batch counterpart prevent duplicate storage for requests that share an `idempotencyKey` (the key length is capped).
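A client can read the rate-limit headers to decide when to slow down. The sketch below assumes the limit and remaining values are plain integers and treats missing headers as "no limit applies". `readRateLimit` is a hypothetical helper. It relies on the standard `Headers` class, available globally in Node.js 18+.

```javascript
// Sketch: extract the rate-limit headers named above from a fetch() response's headers.
// Assumption: Limit and Remaining are integer strings; the header value format is not
// specified in this document.
function readRateLimit(headers) {
  const limit = headers.get("X-RateLimit-Limit");
  const remaining = headers.get("X-RateLimit-Remaining");
  if (limit === null || remaining === null) {
    return null; // headers omitted, e.g. master key or limit=null
  }
  return {
    limit: Number(limit),
    remaining: Number(remaining),
    resource: headers.get("X-RateLimit-Resource"), // null when absent
  };
}

const limitedHeaders = new Headers({
  "X-RateLimit-Limit": "100",
  "X-RateLimit-Remaining": "42",
});
console.log(readRateLimit(limitedHeaders));
console.log(readRateLimit(new Headers())); // null
```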
## Security
- RBAC default-deny: Tool names not in the `TOOL_PERMISSIONS` map are immediately denied, regardless of permissions.
- Tenant isolation: `forget`, `amend`, `link`, and `fragment_history` are restricted to their respective tenants using SQL-level `key_id` conditions. "None" and "No permission" are treated as the same message to prevent exposure of existence.
- `injectSessionContext`: Overwrites internal fields such as `_keyId` and `_permissions` in client requests with values from the session context, so clients cannot tamper with them.
- Admin rate limit: IP-based rate limiting for `/auth`, `/keys` POST, and `/import` POST.
- OpenAPI: `GET /openapi.json` endpoint (enabled with `ENABLE_OPENAPI=true`). The master key returns all paths, while an API key returns a spec filtered by its permissions.
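The default-deny rule in the first bullet can be sketched as a lookup that rejects anything missing from the map. The shape of `TOOL_PERMISSIONS` below (tool name mapped to one required permission) and the `isToolAllowed` function are assumptions made for illustration. The real map may be structured differently.

```javascript
// Hypothetical shape: tool name -> permission required to call it.
const TOOL_PERMISSIONS = new Map([
  ["recall", "read"],
  ["remember", "write"],
]);

function isToolAllowed(toolName, grantedPermissions) {
  const required = TOOL_PERMISSIONS.get(toolName);
  if (required === undefined) {
    // Default-deny: a tool absent from the map is rejected,
    // no matter which permissions the caller holds.
    return false;
  }
  return grantedPermissions.includes(required);
}

console.log(isToolAllowed("recall", ["read"]));                // true
console.log(isToolAllowed("remember", ["read"]));              // false
console.log(isToolAllowed("unknown_tool", ["read", "write"])); // false
```

The point of the design is that adding a new tool without registering it fails closed rather than open.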
## Symbolic Verification Layer
Optional explainability, advisory link integrity, polarity conflict detection, and soft gating of policy rules. Comprises 9 core modules and 5 rule files. All flags are disabled by default.
## Smart Recall
- ProactiveRecall: Automatically links similar fragments based on keyword overlap when `remember()` is called.
- CaseRewardBackprop: Automatically backpropagates importance of evidence fragments during case verification events.
- SearchParamAdaptor: Automatically optimizes search thresholds based on usage patterns.
- CBR (Case-Based Reasoning): Searches for similar cases with `recall(caseMode=true)` to reuse past solution patterns.
- `depth` filter: Controls search depth for Planner and Executor roles (`"high-level"`, `"detail"`, and `"tool-level"`).
- Recall response `key_id`: Includes the owning tenant identifier in returned fragments.
- Reconsolidation: Updates `fragment_links` weights and confidence in real-time based on `tool_feedback` (`ENABLE_RECONSOLIDATION=true`).
- Spreading Activation: Pre-activates related fragments based on conversation context when `recall(contextText=...)` is called (`ENABLE_SPREADING_ACTIVATION=true`).
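To make "keyword overlap" in the ProactiveRecall bullet concrete, here is a minimal sketch that scores two keyword lists with Jaccard similarity. Jaccard is a swapped-in, commonly used overlap measure, not necessarily the one Memento implements. `keywordOverlap`, `shouldLink`, and the 0.5 threshold are all hypothetical.

```javascript
// Sketch: Jaccard overlap between two keyword lists (shared / union).
// This is an assumed stand-in for whatever overlap measure ProactiveRecall uses.
function keywordOverlap(keywordsA, keywordsB) {
  const setA = new Set(keywordsA);
  const setB = new Set(keywordsB);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const keyword of setA) {
    if (setB.has(keyword)) shared += 1;
  }
  return shared / (setA.size + setB.size - shared);
}

// Hypothetical linking rule: link when overlap reaches the threshold.
function shouldLink(keywordsA, keywordsB, threshold = 0.5) {
  return keywordOverlap(keywordsA, keywordsB) >= threshold;
}

console.log(keywordOverlap(["postgresql", "vitest"], ["vitest", "redis"])); // one shared keyword out of three
console.log(shouldLink(["postgresql", "vitest"], ["postgresql", "vitest"]));
```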
`fragments.id` is in the format `frag-{16-character hex}`. Note that it is not a UUID, so exercise caution when generating or parsing IDs externally.
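When handling IDs outside the server, a format check like the following avoids mistaking them for UUIDs. It is a sketch that assumes lowercase hex digits. `isFragmentId` is a hypothetical helper.

```javascript
// Matches `frag-` followed by exactly 16 hex characters.
// Assumption: hex digits are lowercase; widen the class to [0-9a-fA-F] if needed.
const FRAGMENT_ID_PATTERN = /^frag-[0-9a-f]{16}$/;

function isFragmentId(value) {
  return typeof value === "string" && FRAGMENT_ID_PATTERN.test(value);
}

console.log(isFragmentId("frag-0123456789abcdef"));                // true
console.log(isFragmentId("123e4567-e89b-12d3-a456-426614174000")); // false (UUID)
```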
The `/metrics` endpoint exposes metrics in Prometheus-compatible format. Users can freely configure collection and visualization.
## Memory vs. Rules
Memento-injected memory fragments have lower priority than system prompts. Factual memories like "We use PostgreSQL 15" work well, but behavioral rules like "Always use Given-When-Then pattern when writing tests" may be ignored if they conflict with system prompts.
It is recommended to configure behavioral rules in high-priority channels like CLAUDE.md, AGENTS.md, hooks, and skills.
## Benchmarks
Performance on LongMemEval-S (500 questions):
| Metric | Score | Comparison |
| --- | --- | --- |
| Search recall@5 | 88.3% | +8~18pp compared to LongMemEval paper's Stella 1.5B |
| QA accuracy | 45.4% | With temporal metadata (baseline 40.4%) |
| Fragment throughput | 89,006 fragments / 27 seconds | Entire pipeline including ingestion, embedding, and search |
Search achieves over 80% recall in 5 out of 6 question types. However, there is a significant gap between search recall (88.3%) and QA accuracy (45.4%), primarily due to limitations in the reader stage.
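The figures in the table can be related to each other with simple arithmetic. The snippet below only restates the numbers given above; the variable names are illustrative.

```javascript
// Values copied from the benchmark table above.
const searchRecallAt5 = 88.3; // %
const qaAccuracy = 45.4;      // %
const qaBaseline = 40.4;      // %
const fragmentCount = 89006;
const elapsedSeconds = 27;

const retrievalToAnswerGap = searchRecallAt5 - qaAccuracy; // about 42.9 percentage points
const gainOverBaseline = qaAccuracy - qaBaseline;          // about 5.0 percentage points
const fragmentsPerSecond = fragmentCount / elapsedSeconds; // about 3,297 fragments per second

console.log(retrievalToAnswerGap.toFixed(1));
console.log(gainOverBaseline.toFixed(1));
console.log(Math.round(fragmentsPerSecond));
```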
For detailed analysis, refer to the [Benchmark Report](docs/benchmark.md).
## Usage Patterns
Memento is optimized for factual memory. For situations where context is crucial:
- Store narratives in `episode` type for easier recall of entire context.
- Use `contextSummary` to include context in recall results.
- Consider a dual-structure approach with Memento for factual search and main memory for context restoration.
## Who is it for?
- Developers who frequently use AI agents like Claude Code, Cursor, or Windsurf.
- Individuals tired of repeating the same explanations for AI sessions.
- Users who want AI to remember their project context.
## Directory Structure
```
lib/
memory/
read/ # FragmentSearch, SearchScope, SearchSideEffects, CaseRecall, etc.
write/ # MemoryRememberer, BatchRememberProcessor, etc.
link/ # MemoryLinker, ReconsolidationEngine, etc.
consolidate/ # MemoryConsolidator
embedding/ # EmbeddingWorker, EmbeddingCache, MorphemeIndex
signals/ # SpreadingActivation, CaseRewardBackprop, etc.
processors/ # facade — MemoryRecaller, MemoryReflector, etc.
storage/ # PgVectorStore (default), SqliteVecStore (v4.1 planned) adapter layer
llm/ # dispatchChain, provider implementations
symbolic/ # SymbolicVerificationLayer (opt-in)
docs/
getting-started/ # Platform-specific installation guides
operations/ # Operational guides (llm-providers, symbolic-hard-gate, etc.)
features.md # Module ledger
configuration.md # Environment variable reference
```
## Further Reading
| Document | Content |
| --- | --- |
| [Quick Start](docs/getting-started/quickstart.md) | Detailed installation guide |
| [Architecture](docs/architecture.md) | System structure, DB schema, 3-tier search, TTL |
| [Configuration](docs/configuration.md) | Environment variables, MEMORY_CONFIG, embedding providers |
| [API Reference](docs/api-reference.md) | HTTP endpoints, prompts, resources |
| [CLI](docs/cli.md) | Terminal commands |
| [Internals](docs/internals.md) | Evaluators, integrators, contradiction detection |
| [Benchmark](docs/benchmark.md) | LongMemEval-S benchmark detailed analysis |
| [Features](docs/features.md) | Module ledger, experimental flags, ENV mapping |
| [SKILL.md](SKILL.md) | MCP tool reference |
| [INSTALL.md](docs/INSTALL.md) | Migration, hook setup, detailed installation |
| [CHANGELOG](CHANGELOG.md) | Version-by-version changes |
## Operations
- `/health`: Comprehensive check of DB, Redis, pgvector, and worker status. Degraded response in case of partial failure.
- Rate Limiting: 100 requests per minute per API key, 30 requests per minute per IP. Adjustable via environment variables.
- Worker recovery: Automatic retries with exponential backoff (1s→60s) for embedding and evaluation workers in case of errors.
- Graceful Shutdown: Waits for ongoing worker completion (30 seconds) and auto-reflects sessions upon SIGTERM.
- OAuth endpoint: Returns `WWW-Authenticate` header upon authentication failure to trigger OAuth client flow. Session TTL defaults to 240 minutes.
- Migration lint: Checks for numbering conflicts and convention violations with `npm run lint:migrations` before committing.
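The worker recovery schedule (1s→60s) can be sketched as a capped exponential delay. The doubling factor and the absence of jitter are assumptions here, since the list above gives only the starting delay and the cap. `backoffDelayMs` is a hypothetical name.

```javascript
const BASE_DELAY_MS = 1_000; // first retry after 1 second
const MAX_DELAY_MS = 60_000; // never wait longer than 60 seconds

// Assumption: the delay doubles on each attempt (attempt 0 is the first retry).
function backoffDelayMs(attempt) {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

console.log([0, 1, 2, 3, 4, 5, 6].map(backoffDelayMs));
// [1000, 2000, 4000, 8000, 16000, 32000, 60000]
```

Under these assumptions the cap is reached on the seventh attempt, after which every retry waits 60 seconds.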
## Known Limitations
- L1 Redis cache only supports API key-based isolation. Multi-agent environments require L2/L3 isolation.
- Automatic quality evaluation only applies to decision, preference, and relation types. Fact, procedure, and error types are excluded.
- If `MEMENTO_ACCESS_KEY` is not set, authentication is disabled. Required for externally exposed environments.
- `ALLOWED_ORIGINS` — Browser-based MCP client whitelist. Not required for desktop/CLI/IDE extensions.
- `ADMIN_ALLOWED_ORIGINS` — Admin UI origin whitelist. Not required for same-origin requests.
- `TRUST_PROXY_HOPS` — Number of trusted reverse proxy hops. Defaults to existing behavior (XFF first entry).
- `OAUTH_TRUSTED_ORIGINS` — Whitelist for auto-approval of OAuth consent.
## Technical Stack
- Node.js 20+
- PostgreSQL 14+ (pgvector extension)
- Redis 6+ (optional)
- OpenAI Embedding API (optional) or `EMBEDDING_PROVIDER=transformers` (local low-cost mode)
- garu-ko / natural PorterStemmer / @node-rs/jieba / kuromoji (local morpheme analysis, CPU routing; `MEMENTO_MORPHEME_TOKENIZER=local` by default)
- Gemini CLI / Codex CLI / GitHub Copilot CLI (quality evaluation, auto-reflect; optional, configured with LLM_PRIMARY / LLM_FALLBACKS)
- @huggingface/transformers + ONNX Runtime (NLI contradiction classification + local embedding, CPU only)
- MCP Protocol 2025-11-25
PostgreSQL is required for core functionality. Redis adds L1 cache search and SessionActivityTracker. OpenAI API or `EMBEDDING_PROVIDER=transformers` enables L3 semantic search and auto-linking.
## Motivation
<details>
<summary>Expand/Collapse</summary>
In practical AI usage, I experienced inefficiencies from repeatedly explaining the same context. Using system prompts with notes had limitations. As the number of fragments grew, management, search, and information collisions became issues.
The main problem was repetition. Despite having authentication, detailed settings were often accessible. Thorough explanations only worked temporarily. Sessions restarting meant repeating the same process.
It felt like educating a new employee daily. Despite being highly intelligent, the AI lacked memory.
Memento aims to address these issues by breaking down memories into atomic units, searching them hierarchically, and simulating natural forgetting.
</details>
## License
Apache 2.0
<p align="center">
Made by <a href="mailto:jinho.von.choi@nerdvana.kr">Jinho Choi</a> |
<a href="https://buymeacoffee.com/jinho.von.choi">Buy me a coffee</a>
</p>