A Living Memory Database for the Age of AI

Brain-inspired memory architecture. Connect AI agents to lifelike memory shaped by time and dreams, enabling natural long-term recall.

Abstract

Today’s LLMs lose context with every conversation reset, forcing users to repeat themselves endlessly. Cerememory is an LLM-agnostic memory database built on five specialized memory stores grounded in neuroscience research. Memories are not merely stored – they decay over time, reactivate when related memories fire, and have their retention modulated by emotional intensity. Every record carries a structured meta-memory plane that records why it exists – intent, rationale, evidence, alternatives, and a typed context graph. This is less a database than a living memory system. Its user-sovereign, local-first design guarantees users full ownership of their memory data.

Keywords: Memory Database · LLM · Neuroscience · Spreading Activation · Decay Model · Meta-Memory · Raw Journal · Rust · CMP Protocol
§ 1

The Problem: LLM Amnesia

Every LLM today suffers from a fundamental flaw. Each time a conversation resets, the context window flushes, and users are forced to re-explain themselves from scratch. Existing memory solutions are shallow, text-only, model-specific, and vendor-controlled.

Cerememory addresses this with three principles. Memory must be alive – not frozen at write-time, but evolving over time. It must be LLM-agnostic – a standardized protocol (CMP) allows any LLM to read and write to the memory layer. And it must be user-sovereign – local-first and fully exportable by design.

§ 2

Five-Store Architecture

Just as the human brain processes different types of memory in distinct regions, Cerememory distributes memories across five specialized stores.

Store | Brain Analog | Function
Episodic | Hippocampus | Temporal event logs. Records what happened, when, and where. Persistent via redb.
Semantic | Neocortex | Graph of facts, concepts, and relationships. Stores what things mean. Typed-edge graph structure.
Procedural | Basal Ganglia | Behavioral patterns, preferences, and skills. Learns how things are done.
Emotional | Amygdala | Cross-cutting affective metadata. Modulates decay rates and retrieval priority across all stores.
Working | Prefrontal Cortex | Volatile, capacity-limited, high-speed active context cache. LRU-evicted, in-memory.
Table 1. Five memory stores and their neuroscientific analogs
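The working store's eviction behavior can be illustrated with a minimal LRU cache. This is a sketch of the policy described in Table 1, not Cerememory's actual implementation; the `WorkingMemory` type and its methods are hypothetical names:

```rust
use std::collections::VecDeque;

/// Illustrative capacity-limited working store: recently touched
/// records stay, the least recently used one is evicted when full.
/// (`WorkingMemory` is a hypothetical name, not Cerememory's API.)
struct WorkingMemory {
    capacity: usize,
    items: VecDeque<(String, String)>, // (key, value), front = most recent
}

impl WorkingMemory {
    fn new(capacity: usize) -> Self {
        Self { capacity, items: VecDeque::new() }
    }

    /// Insert or refresh a record; evict the LRU entry when over capacity.
    fn put(&mut self, key: &str, value: &str) {
        self.items.retain(|(k, _)| k.as_str() != key);
        self.items.push_front((key.to_string(), value.to_string()));
        if self.items.len() > self.capacity {
            self.items.pop_back(); // LRU eviction
        }
    }

    /// Read a record and mark it as most recently used.
    fn get(&mut self, key: &str) -> Option<String> {
        let pos = self.items.iter().position(|(k, _)| k.as_str() == key)?;
        let entry = self.items.remove(pos)?;
        let value = entry.1.clone();
        self.items.push_front(entry);
        Some(value)
    }
}
```

Reading a record counts as a touch, so frequently recalled context survives while stale context is evicted first.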
LLM Layer · Clients & Adapters: Claude · GPT · Gemini
↓ CMP Protocol
Transport Bindings: HTTP / REST (Axum) · gRPC (Tonic) · MCP (rmcp)
Cerememory Engine (orchestrator): Hippocampal Coordinator · cross-store indexes (Tantivy full-text · redb cosine vector index · association graph)
Storage Plane: five curated stores (Episodic · Semantic · Procedural · Emotional · Working) + raw journal (verbatim + Tantivy)
Supporting Engines: Decay · Association · Evolution – run alongside the core engine and operate on every store.

Fig. 1. Cerememory system architecture · meta-memory is a cross-cutting plane attached to every record
§ 3

Living Memory Dynamics

In a traditional database, data is static. In Cerememory, memories breathe, decay, and reactivate.

Decay – The Forgetting Curve

Memory fidelity decreases over time following a modified power-law curve. This is not "forget everything" – it is gradual, realistic degradation. Emotional intensity modulates decay rates, and repeated access increases stability.

F(t) = F0 · (1 + t / S)^(−d) · E_mod
F0: initial fidelity · S: stability parameter · d: decay exponent · E_mod: emotional modulation factor
Eq. 1 – Power-Law Decay
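Eq. 1 translates directly into code. The sketch below is an illustrative rendering of the formula (parameter names follow the legend), not the engine's internal decay routine:

```rust
/// Power-law decay from Eq. 1: F(t) = F0 · (1 + t/S)^(−d) · E_mod.
/// Illustrative only; parameter names follow the equation's legend.
fn decayed_fidelity(f0: f64, t: f64, stability: f64, d: f64, e_mod: f64) -> f64 {
    f0 * (1.0 + t / stability).powf(-d) * e_mod
}
```

At t = 0 the record sits at its initial fidelity (times any emotional boost); fidelity then degrades gradually, and a larger stability S or E_mod slows the curve.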

Noise – Interference Accumulation

Similar memories blur each other’s details over time. This reproduces the interference phenomenon observed in human memory research.

N(t) = N0 + λ · √t · (1 − F(t))
N0: baseline noise · λ: interference rate · noise increases as fidelity degrades
Eq. 2 – Noise Accumulation
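Eq. 2 can be sketched the same way; `fidelity` stands for F(t) from Eq. 1, and the function name is ours:

```rust
/// Noise accumulation from Eq. 2: N(t) = N0 + λ · √t · (1 − F(t)).
/// `fidelity` is the current F(t) from Eq. 1. Illustrative sketch only.
fn accumulated_noise(n0: f64, lambda: f64, t: f64, fidelity: f64) -> f64 {
    n0 + lambda * t.sqrt() * (1.0 - fidelity)
}
```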

Emotional Modulation

An 8-dimensional emotion vector is attached to every memory, influencing decay rates, retrieval priority, and association strength. Emotionally intense memories are retained longer.

joy · trust · fear · surprise · sadness · disgust · anger · anticipation
Fig. 2. 8-dimensional emotion vector based on Plutchik’s model of emotions
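How the emotion vector feeds E_mod in Eq. 1 is not specified here, so the mapping below (one plus a weighted maximum component) is purely an illustrative assumption, consistent only with the claim that emotionally intense memories are retained longer:

```rust
/// Plutchik-style 8-dimensional emotion vector, components in [0, 1]:
/// joy, trust, fear, surprise, sadness, disgust, anger, anticipation.
struct EmotionVector([f64; 8]);

impl EmotionVector {
    /// Hypothetical emotional modulation factor for Eq. 1: intense
    /// emotions yield E_mod > 1, which keeps fidelity higher for longer.
    /// This particular mapping is an illustrative assumption.
    fn modulation(&self, weight: f64) -> f64 {
        let intensity = self.0.iter().cloned().fold(0.0_f64, f64::max);
        1.0 + weight * intensity
    }
}
```

A neutral vector leaves decay untouched (E_mod = 1), while a strongly fearful or joyful memory decays more slowly.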

Dream Processing – Sleep-Like Compression

During sleep, the brain replays the day and consolidates ephemeral experience into lasting knowledge. Cerememory mirrors that pipeline. The raw journal preserves verbatim conversation, tool I/O, and scratchpad content. The dream_tick lifecycle groups those raw entries by topic (time gaps and lexical shift), summarizes each group into episodic memory, and conditionally promotes factual content to the semantic store — always with backlinks to the verbatim source.

Dream pipeline: raw journal (verbatim) → topic grouping (time + lexical shift) → dream_tick (secrecy-aware summary) → episodic + semantic (with backlinks)

α · Secrecy & visibility aware – Only `Normal` visibility records feed the summary. Records with `secret` secrecy are fully excluded, and `sealed` / `private_scratch` visibility entries are counted in the stats but never summarized.
β · Topic grouping – Records sharing an explicit `topic_id` are grouped first. Otherwise the engine splits per session by time gap (>45 min hard split, or >10 min with <8% token overlap) and infers a topic hint from the top tokens.
γ · Conditional semantic promotion – A summary is promoted to the semantic store when `promote_semantic=true`, the group has at least two `Normal` records, and there is a topic signal (an explicit `topic_id` or an inferred hint).
δ · Backlinks both ways – Every raw record gets `derived_memory_ids` pointing to the dream summary (and to the semantic record if promoted). The summary keeps an Episodic→Semantic association so forensic recall is one hop away.
ε · Background or on-demand – Runs autonomously on `dream.background_interval_secs` (default 86400 s = 24 h) and on demand via the `lifecycle.dream_tick` CMP operation, the MCP tool, or the `cerememory dream-tick` CLI.
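The grouping rule in β can be sketched as a split decision. The 45-minute and 10-minute/8% thresholds come from the text; the Jaccard token-overlap metric is our assumption, since the exact lexical-shift measure is not specified:

```rust
use std::collections::HashSet;

/// Jaccard-style token overlap between two raw entries.
/// (The exact lexical-shift metric is not documented; this is an assumption.)
fn token_overlap(a: &str, b: &str) -> f64 {
    let ta: HashSet<&str> = a.split_whitespace().collect();
    let tb: HashSet<&str> = b.split_whitespace().collect();
    if ta.is_empty() || tb.is_empty() {
        return 0.0;
    }
    let inter = ta.intersection(&tb).count() as f64;
    let union = ta.union(&tb).count() as f64;
    inter / union
}

/// Split decision from the dream pipeline's grouping rule: hard split
/// after 45 min of silence, or a softer split after 10 min when lexical
/// overlap drops below 8%. Illustrative sketch only.
fn starts_new_topic(gap_minutes: f64, overlap: f64) -> bool {
    gap_minutes > 45.0 || (gap_minutes > 10.0 && overlap < 0.08)
}
```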
i · Reactivation – Firing of related memories temporarily restores decayed ones, following the spreading activation model.
ii · Reconsolidation – Recalled memories are subtly modified and reintegrated with the current context.
iii · Consolidation – As in sleep, episodic memories are periodically integrated and migrated into semantic storage.
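Spreading activation, the model behind reactivation, is described elsewhere in this document as a weighted breadth-first traversal with a configurable decay factor, threshold, and depth. A minimal sketch under those assumptions (the function and graph shapes are ours, not the engine's API):

```rust
use std::collections::{HashMap, VecDeque};

/// Weighted breadth-first spreading activation (illustrative sketch).
/// `graph` maps a memory id to (neighbor id, edge weight) pairs.
/// Activation attenuates by `decay` per hop; spreads below `threshold`
/// or beyond `max_depth` hops are cut off.
fn spread_activation(
    graph: &HashMap<u32, Vec<(u32, f64)>>,
    source: u32,
    decay: f64,
    threshold: f64,
    max_depth: u32,
) -> HashMap<u32, f64> {
    let mut activation: HashMap<u32, f64> = HashMap::new();
    activation.insert(source, 1.0);
    let mut queue = VecDeque::from([(source, 1.0_f64, 0u32)]);
    while let Some((node, energy, depth)) = queue.pop_front() {
        if depth >= max_depth {
            continue;
        }
        for &(next, weight) in graph.get(&node).into_iter().flatten() {
            let spread = energy * weight * decay;
            if spread < threshold {
                continue; // too faint to reactivate anything
            }
            let entry = activation.entry(next).or_insert(0.0);
            if spread > *entry {
                *entry = spread;
                queue.push_back((next, spread, depth + 1));
            }
        }
    }
    activation
}
```

The returned activation map is what would temporarily boost the fidelity of decayed neighbors when a related memory fires.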
§ 4

Cerememory Protocol (CMP)

CMP is the single, transport-agnostic protocol spoken by Cerememory. HTTP, gRPC, and MCP are three transport bindings that carry the same CMP messages – not competing APIs. HTTP and gRPC expose the full protocol surface; MCP is a curated 15-tool binding tailored for LLM agents.

LLM Clients: Claude · GPT · Gemini · Agents
↓
HTTP / REST – full surface · browsers, services
gRPC – full surface · streaming, low-latency
MCP – curated 15-tool subset · Claude Code, Codex CLI, Cursor, any MCP client
↓
CMP – Cerememory Protocol · unified message format: encode · recall · lifecycle · introspect
↓
Cerememory Engine · five memory stores, decay, association, evolution

Fig. 3. CMP vs. MCP – MCP is one transport for CMP, not a separate protocol
Encode · Write Memories
  • encode.store – Store a single record
  • encode.batch – Batch store with auto-association
  • encode.update – Update an existing record
  • encode.store_raw – Store verbatim journal entry
  • encode.batch_raw – Batch store journal entries
Recall · Retrieve Memories
  • recall.query – Multimodal retrieval
  • recall.associate – Get associations
  • recall.timeline – Time-series retrieval
  • recall.graph – Subgraph retrieval
  • recall.raw_query – Forensic journal recall
Lifecycle · Memory Lifecycle
  • lifecycle.consolidate – Trigger consolidation
  • lifecycle.decay_tick – Run decay engine
  • lifecycle.dream_tick – Summarize journal to memory
  • lifecycle.forget – Permanently delete
  • lifecycle.set_mode – Human / Perfect mode
Introspect · Observe State
  • introspect.stats – System-wide statistics
  • introspect.record – Inspect decay state
  • introspect.decay_forecast – Fidelity prediction
  • introspect.evolution – Evolution engine metrics
* Recall has two modes: Human (realistic recall with fidelity-weighted noise) and Perfect (complete retrieval of original data). Spreading activation depth is configurable. Every transport returns query metadata, x-request-id correlation, and retry hints to make production debugging straightforward.
§ 5

Quick Start

Build one binary from source, start the shared HTTP server, then point every MCP client at it.

Shell
# Build the binary from source
git clone https://github.com/co-r-e/cerememory.git
cd cerememory
cargo build -p cerememory-cli --release

# Start the one long-lived server that owns the data directory
target/release/cerememory serve --data-dir ~/.cerememory/data

# Point every MCP client (Claude Code, Codex CLI, Cursor, ...) at that shared server
target/release/cerememory mcp --server-url http://127.0.0.1:8420
TOML · Codex CLI config
# ~/.codex/config.toml — point every MCP client at one shared server
[mcp_servers.cerememory]
command = "/absolute/path/to/target/release/cerememory"
args = ["mcp", "--server-url", "http://127.0.0.1:8420"]

# Claude Code uses the same shape in ~/.claude/claude_desktop_config.json
# {
#   "mcpServers": {
#     "cerememory": {
#       "command": "/absolute/path/to/target/release/cerememory",
#       "args": ["mcp", "--server-url", "http://127.0.0.1:8420"]
#     }
#   }
# }
§ 6

Integration Points

Transports

MCP – Recommended path. `cerememory mcp --server-url` proxies to the shared HTTP server. Works with Claude Code, Codex CLI, Cursor, Cline, Windsurf, Zed, Continue, and any other MCP-compatible client.
HTTP / REST – Full CMP surface for browsers, services, and any HTTP client.
gRPC – Streaming, low-latency, TLS-enforced transport for production.

LLM Adapters

Anthropic Claude
OpenAI GPT
Google Gemini

Capabilities

Multimodal – Text, image, audio, and structured blocks are supported today, with provider-backed image/audio recall and auto-embedding.
Secure Defaults – Localhost-first HTTP, Bearer auth, trusted-proxy-aware rate limiting, optional at-rest store encryption (ChaCha20-Poly1305), a tamper-evident JSONL audit log, and enforced gRPC TLS on exposed deployments.
Observability – Opt-in protected Prometheus metrics, /health and /readiness probes, plus x-request-id correlation for production debugging.
Vector Search – Deterministic redb-backed exact cosine scan, paired with Tantivy full-text search for hybrid retrieval. Backend and record counts surface in `introspect.stats`.
Spreading Activation – Weighted breadth-first traversal for associative recall, with configurable decay factor, threshold, and depth.
Workflow Stability – Persisted inferred associations, safe CMA export/import flows, and coordinators rebuilt before stateful CLI operations.
Meta-Memory Plane – Every record carries structured intent, rationale, evidence, alternatives, decisions, and a typed context graph. `recall.query` indexes the why plane so agents can search reasoning, not just content.
Raw Journal & Dream Tick – Verbatim conversation, tool I/O, and scratchpad content are captured in a separate forensic plane. `dream_tick` groups raw entries by topic and summarizes them into curated episodic and semantic memory.
§ 7

Technical Foundation

Cerememory’s core engine is implemented in Rust, chosen for its memory safety, zero-cost concurrency, and predictable performance. Tokio handles async I/O while Rayon powers CPU-intensive operations such as decay computation and spreading activation, with thread pools separated by workload characteristics.

redb
Embedded key-value store with ACID transactions. Zero-copy reads for maximum throughput.
Tantivy
Rust-native Lucene equivalent. High-performance full-text search index.
redb cosine vector index
Deterministic exact cosine similarity over a redb-backed embedding store. No approximation, no rebuilt graph, identical results across replicas.
MessagePack
Compact binary serialization. Fast internal data transfer with minimal overhead.
Axum + Tonic + rmcp
HTTP/REST, gRPC, and MCP stdio transports all served from the same engine.
JSON Lines CMA
Inspectable single-file archive bundle with optional ChaCha20-Poly1305 + Argon2id encryption. Full data portability.
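The deterministic vector index boils down to an exact cosine scan over every stored embedding. A minimal sketch (function names are ours; the real store reads embeddings from redb):

```rust
/// Exact cosine similarity, as used by the deterministic vector scan.
/// Illustrative sketch; the engine reads embeddings from redb.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Brute-force top-1 scan over stored (id, embedding) pairs:
/// no approximation, no index to rebuild, identical results across replicas.
fn best_match(query: &[f64], store: &[(u32, Vec<f64>)]) -> Option<(u32, f64)> {
    store
        .iter()
        .map(|(id, emb)| (*id, cosine(query, emb)))
        .max_by(|x, y| x.1.partial_cmp(&y.1).unwrap())
}
```

The trade-off is linear scan cost per query in exchange for exactness and replica-identical results, which is what distinguishes it from approximate nearest-neighbor indexes.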
§ 8

Design Philosophy

Memory is the foundation of identity. A system that stores, evolves, and retrieves the accumulated context of a person’s interaction with AI is too important to be controlled by any single entity.

– Cerememory Whitepaper
I · User Sovereignty – Local-first. Fully exportable. Ownership of memory data always belongs to the user.
II · No Vendor Lock-in – LLM-agnostic protocol. Claude, GPT, Gemini – any model can share the same memory layer.
III · Living Data – Memories breathe. Decay, interference, reactivation, consolidation – not static storage, but a dynamic memory system.
IV · Brain-Inspired – Grounded in neuroscience research. Faithfully models the five subsystems of human memory.

Give Your AI a Memory

Cerememory is open source. Get started now and give your AI systems persistent, evolving memory.

$ git clone https://github.com/co-r-e/cerememory.git
$ cargo build -p cerememory-cli --release