
MemPalace


At a Glance

Local-first open-source AI memory system using a hierarchical palace metaphor (Wings/Rooms/Halls) over ChromaDB vector search and SQLite knowledge graph, with an MCP server exposing 19 tools; headline benchmarks primarily measure embedding quality rather than the palace architecture itself.

Type: open-source
Pricing: free (open-source)
License: MIT
Adoption fit: small teams
Top alternatives: see the Alternatives section below

What It Does

MemPalace is a Python library and MCP server that gives AI assistants persistent cross-session memory by storing conversation history verbatim in a locally-hosted ChromaDB vector database. The core design metaphor is the ancient “method of loci” mnemonic: conversations are organized into a hierarchy of Wings (per-person or per-project containers), Rooms (topic areas), Halls (memory type corridors: facts, events, discoveries, preferences, advice), Closets (summaries), and Drawers (verbatim files). Retrieval uses ChromaDB’s default all-MiniLM-L6-v2 embeddings with optional metadata filtering by wing and room to narrow search scope.
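The wing/room narrowing described above can be sketched as metadata filtering over a store, mimicking the semantics of a ChromaDB-style `where` clause. The names and data here are illustrative, not MemPalace's actual API:

```python
# Minimal sketch of hierarchical metadata filtering, in the spirit of the
# ChromaDB `where` filters MemPalace uses to narrow search scope.
# Store layout and field names are assumptions for illustration only.

memories = [
    {"text": "Prefers pytest over unittest", "wing": "alice", "room": "tooling"},
    {"text": "Deploy runs on Fridays",       "wing": "acme",  "room": "ops"},
    {"text": "Alice dislikes mocks",         "wing": "alice", "room": "testing"},
]

def query(store, where):
    """Return entries whose metadata matches every key/value in `where`."""
    return [m for m in store if all(m.get(k) == v for k, v in where.items())]

# Filtering to one wing shrinks the candidate set before any vector search runs.
hits = query(memories, {"wing": "alice"})
assert len(hits) == 2

# Wing + room narrows further, analogous to a compound ChromaDB filter.
hits = query(memories, {"wing": "alice", "room": "testing"})
assert hits[0]["text"] == "Alice dislikes mocks"
```

In the real system the same filter is passed alongside the embedding query, so semantic search only ranks documents inside the selected wing/room.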

A four-layer memory stack controls token budget: L0 identity (~50 tokens, always loaded), L1 critical facts (~120 tokens via AAAK compression, always loaded), L2 room recall (on-demand), and L3 deep semantic search (on-demand). A secondary SQLite-based knowledge graph stores temporal entity-relationship triples with validity windows. An MCP server exposes 19 tools compatible with Claude, ChatGPT, Cursor, and Gemini CLI. An experimental “AAAK dialect” applies lossy text abbreviation for compression, but degrades benchmark performance by 12.4 percentage points and is not recommended for production use.
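The layer budgets above imply a simple accounting model: always-on layers form the fixed wake-up context, and deeper layers are added only on request. A minimal sketch, assuming budgets for L2/L3 (the docs only specify L0 and L1):

```python
# Sketch of the four-layer progressive loading described above.
# L0/L1 budgets come from the description; L2/L3 budgets are assumed.

LAYERS = {
    "L0_identity": {"budget": 50,  "always": True},
    "L1_facts":    {"budget": 120, "always": True},
    "L2_room":     {"budget": 400, "always": False},  # budget assumed
    "L3_semantic": {"budget": 800, "always": False},  # budget assumed
}

def wake_up_context():
    """Only always-on layers load at session start: ~170 tokens total."""
    return sum(v["budget"] for v in LAYERS.values() if v["always"])

def load_on_demand(requested):
    """Deeper layers add to the budget only when explicitly requested."""
    return wake_up_context() + sum(
        LAYERS[name]["budget"] for name in requested if not LAYERS[name]["always"]
    )

assert wake_up_context() == 170  # matches the documented wake-up size
```

The design choice this illustrates: the agent's baseline context cost is constant and predictable, while expensive recall is opt-in per turn.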

Key Features

  • Verbatim storage with no LLM writes: Writes are fully offline, deterministic, and free — no API calls during ingestion
  • Hierarchical namespace filtering: Wing and room metadata filtering narrows ChromaDB search scope, improving retrieval precision on large collections
  • Four-layer progressive loading: Predictable 170-token wake-up context with deeper layers loaded on demand
  • Temporal knowledge graph: SQLite triples with start/end validity windows for point-in-time queries (partially implemented — contradiction detection not yet wired in)
  • 19 MCP tools: Search, memory management, agent operations, and knowledge graph queries via Model Context Protocol
  • Multi-mode mining: CLI commands for ingesting project files, conversation exports, or general auto-classified content
  • Session splitting: Handles large conversation exports by splitting on configurable thresholds
  • Cross-client compatibility: Works with Claude, ChatGPT, Cursor, Gemini CLI, and local models via MCP or Python API
  • Zero operational cost: No cloud dependency, no subscription; ChromaDB and SQLite run locally
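The temporal knowledge graph feature above, triples with validity windows queried at a point in time, can be sketched with stdlib `sqlite3`. The schema and column names are assumptions, not the project's actual `knowledge_graph.py` layout:

```python
import sqlite3

# Minimal sketch of temporal entity-relationship triples with validity
# windows, as described for the SQLite knowledge graph. Schema is assumed.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE triples (
        subject TEXT, predicate TEXT, object TEXT,
        valid_from TEXT, valid_to TEXT  -- NULL valid_to means still current
    )
""")
db.executemany(
    "INSERT INTO triples VALUES (?, ?, ?, ?, ?)",
    [
        ("alice", "works_at", "AcmeCo",  "2024-01-01", "2025-06-30"),
        ("alice", "works_at", "BetaInc", "2025-07-01", None),
    ],
)

def facts_at(conn, subject, predicate, as_of):
    """Point-in-time query: which fact held on a given date?"""
    rows = conn.execute(
        """SELECT object FROM triples
           WHERE subject = ? AND predicate = ?
             AND valid_from <= ?
             AND (valid_to IS NULL OR valid_to >= ?)""",
        (subject, predicate, as_of, as_of),
    ).fetchall()
    return [r[0] for r in rows]

assert facts_at(db, "alice", "works_at", "2025-03-01") == ["AcmeCo"]
assert facts_at(db, "alice", "works_at", "2026-01-01") == ["BetaInc"]
```

Note that nothing in this sketch (or, per the caveats below, in the current implementation) prevents two overlapping, contradictory windows from coexisting; only exact duplicates are blocked.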

Use Cases

  • Solo developer persistent context: A developer using Claude Code who wants decisions, errors, and preferences remembered across sessions without connecting to a managed cloud service
  • Local-first privacy requirement: Environments where sending conversation history to a third-party memory API (Mem0, Zep) is not acceptable for data residency or confidentiality reasons
  • Low-cost long-term memory experiment: Teams evaluating verbatim-storage approaches for AI memory before committing to a production memory infrastructure
  • MCP tool integration prototyping: Developers exploring how to expose agent memory as MCP tools for multi-client compatibility

Adoption Level Analysis

Small teams (<20 engineers): Potential fit for personal or small-team use cases where local-first and zero-cost are the primary requirements. The MCP integration and CLI setup are accessible. However, the project launched April 2026 with 170 commits, 4 test files for 21 modules, and multiple corrected benchmark claims — production reliability is unverified. Treat as early-stage experimental tooling.

Medium orgs (20–200 engineers): Does not fit. ChromaDB’s single-node architecture limits scale; there are no multi-user access controls, no role-based permissions, no audit logs, and no compliance certifications. The verbatim storage model also has no forgetting/decay mechanism — memories accumulate indefinitely. Better alternatives exist at this scale (Mem0 managed, Zep, Weaviate Engram).

Enterprise (200+ engineers): Does not fit. No enterprise features, no SLA, no data governance controls, no integration with enterprise identity providers. Not designed for this use case.

Alternatives

  • Hippo Memory: TypeScript, biologically-inspired decay, BM25+embedding hybrid. Prefer when you want TypeScript and memory that naturally expires unused entries.
  • Honcho: Dialectic user modeling, peer-entity architecture, cloud-optional. Prefer when you need user-centric relationship modeling beyond conversation storage.
  • Weaviate Engram: Managed cloud memory on Weaviate, MCP integration, preview. Prefer when you already use Weaviate and want managed memory infrastructure.
  • OpenViking: Filesystem paradigm, tiered context, AGPL, ByteDance. Prefer when you want filesystem-native context management with stronger typing.
  • Mem0: 19 vector store backends, graph memory, cloud + self-host, SOC 2. Prefer when you need production-ready memory with compliance and multi-backend support.
  • Zep / Graphiti: Neo4j temporal knowledge graph, managed or self-hosted. Prefer when you need strong temporal reasoning with entity relationship tracking.
  • CLAUDE.md / MEMORY.md: File-based, zero tooling, natively understood by Claude Code. Prefer when you want the simplest possible persistent context with zero external dependencies.
  • Mastra Observational Memory: No vector DB needed, text-only compression agents, 94.87% LongMemEval. Prefer when you want SOTA benchmark performance without managing a vector database.

Evidence & Sources

Notes & Caveats

  • Benchmark attribution is the central problem: The headline “96.6% LongMemEval” measures ChromaDB’s all-MiniLM-L6-v2 embeddings on verbatim text, not the palace architecture. Independent reproducers confirmed the benchmark runner never exercises wings, rooms, or any structural code. This is not a minor caveat — it invalidates the primary marketing claim.
  • AAAK compression is lossy and degrades performance: Despite initial “zero information loss” claims, AAAK uses sentence truncation and regex substitution. The decode() method cannot reconstruct original text. Performance drops 12.4 points vs. raw mode. The project corrected this post-launch. Use raw mode if recall quality matters.
  • Contradiction detection claimed but not implemented: knowledge_graph.py only blocks exact-duplicate triples. Conflicting facts accumulate silently. Any workflow that depends on contradiction detection (e.g., tracking fact updates over time) will produce incorrect results.
  • No decay or forgetting mechanism: Memories accumulate indefinitely. For long-running agents, storage will grow unbounded and retrieval signal may degrade over time as the collection grows.
  • ChromaDB single-node ceiling: ChromaDB targets single-node prototyping workloads up to roughly 10 million vectors. Large-scale production with many agents or heavy memory accumulation exceeds what the underlying storage is built for.
  • Celebrity-driven star inflation: 38k+ GitHub stars within days largely reflect Milla Jovovich’s media profile rather than technical community validation. Star count is not a proxy for production readiness here.
  • LoCoMo benchmark methodology flaw acknowledged: The LoCoMo dataset has 19–32 sessions per conversation. When MemPalace set top_k=50, it retrieved more sessions than exist, guaranteeing the ground-truth answer was always in the candidate pool. The corrected LoCoMo score without reranking is 88.9%, not the headline figure.
  • Early stage: Created April 5, 2026. 170 commits, 4 test files for 21 modules. No production case studies published. The rapid corrections post-launch indicate an honest team but also an immature release process.
  • No named individual with established track record: Ben Sigman (technical lead) does not have a publicly verifiable track record in AI memory research. The project lacks academic citations or peer-reviewed validation of its architectural claims.
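The top_k pitfall noted in the LoCoMo caveat can be seen with a toy recall calculation: once k meets or exceeds the number of retrievable sessions, the ground-truth session is in the candidate pool by construction. The function below is an illustration of that arithmetic, not the benchmark's actual scoring code:

```python
# Toy illustration of the LoCoMo methodology flaw: with top_k larger than
# the number of sessions in a conversation, the gold session is always
# among the retrieved candidates, regardless of retrieval quality.

def recall_at_k(n_sessions, k):
    """Chance the gold session is among k candidates drawn from n_sessions,
    assuming a worst-case (random) ranking."""
    return min(k, n_sessions) / n_sessions

# LoCoMo conversations have 19-32 sessions; top_k=50 covers all of them,
# so even a random retriever scores perfect session recall.
assert all(recall_at_k(n, 50) == 1.0 for n in range(19, 33))
```

This is why the corrected score reported without that inflation (88.9% on LoCoMo) is the meaningful figure.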
