Memory Layers

the-brain organizes memory across three layers, each operating on a different timescale.

⚡ Layer 1: INSTANT (Working Memory)

Plugin: plugin-graph-memory

Intercepts prompts and injects real-time biases before they reach the LLM.

Detection Pipeline

The AFTER_RESPONSE handler runs a 6-stage pipeline using language-agnostic structural heuristics (no English regex):

1. Correction detection: structural heuristics with dynamic confidence weights
   - Short prompt + long explanatory response (ratio > 2.5:1) → likely correction
   - Very short prompt (< 50 chars) + substantial response (> 100 chars)
   - High token novelty (response introduces new vocabulary not in prompt)
2. Preference detection: cross-interaction cluster tracking
   - Short declarative statements (15-150 chars) → preference candidates
   - Token overlap with previously detected preferences → reinforced
   - Repeated vocabulary across 2+ interactions → emerging preference
3. Pattern detection: keywords appearing ≥3 times in recent interactions
4. Concept node creation: new keywords not matching any existing node
5. Node interconnection: bidirectional links between all new nodes
6. Weight decay: periodic decay for nodes unmatched for >24h
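As a rough illustration of the stage 1 heuristics, the sketch below scores a single prompt/response pair. The function names, the 0.5 novelty cutoff, and the mapping from fired signals onto the 0.5-0.85 range are assumptions made for this example; they are not taken from plugin-graph-memory.

```javascript
// Hypothetical sketch, not the plugin-graph-memory implementation.
// Tokenize on non-letter boundaries so the heuristic stays script-agnostic.
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}]+/u).filter((t) => t.length > 0);
}

// Returns a confidence in [0.5, 0.85], or null when no signal fires.
function correctionConfidence(prompt, response) {
  const promptTokens = new Set(tokenize(prompt));
  const responseTokens = tokenize(response);

  // Signal 1: the response is more than 2.5x longer than the prompt.
  const ratioSignal = prompt.length > 0 && response.length / prompt.length > 2.5;
  // Signal 2: very short prompt with a substantial response.
  const shortPromptSignal = prompt.length < 50 && response.length > 100;
  // Signal 3: most response tokens are absent from the prompt (0.5 cutoff is assumed).
  const novel = responseTokens.filter((t) => !promptTokens.has(t)).length;
  const noveltySignal = responseTokens.length > 0 && novel / responseTokens.length > 0.5;

  const fired = [ratioSignal, shortPromptSignal, noveltySignal].filter(Boolean).length;
  if (fired === 0) return null;
  // Assumed mapping: one signal -> 0.5, two -> 0.675, three -> 0.85.
  return 0.5 + ((fired - 1) / 2) * 0.35;
}
```

Because every signal is based on lengths and token overlap, the same function applies unchanged to prompts in any script.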

Quality Filters

- Unicode-aware tokenizer: splits on non-letter boundaries and works for all scripts (Latin, Cyrillic, CJK, Arabic, etc.), so no English stop words are needed
- Weight decay: every ~10 interactions, nodes >24h old lose 2% weight (floor: 0.05)
- Weight boost: matched nodes gain +0.05 per use
- Correction weights: 0.5-0.85 based on structural heuristic confidence
- Preference weight: always 0.7
- Concept weight: always 0.4
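The decay and boost rules can be sketched as two small functions. The constants come from the list above; the node shape (`weight`, `lastMatchedAt`), the function names, and the 1.0 cap on boosted weights are assumptions for illustration only.

```javascript
// Hypothetical node shape: { weight: number, lastMatchedAt: number } (ms since epoch).
const DAY_MS = 24 * 60 * 60 * 1000;
const DECAY_FACTOR = 0.98; // lose 2% per decay pass
const WEIGHT_FLOOR = 0.05;
const BOOST = 0.05;

// Decay every node that has gone unmatched for more than 24 hours.
function decayNodes(nodes, now) {
  for (const node of nodes) {
    if (now - node.lastMatchedAt > DAY_MS) {
      node.weight = Math.max(WEIGHT_FLOOR, node.weight * DECAY_FACTOR);
    }
  }
}

// Boost a node that matched the current prompt (the 1.0 cap is an assumption).
function boostNode(node, now) {
  node.weight = Math.min(1.0, node.weight + BOOST);
  node.lastMatchedAt = now;
}
```

Under these rules a concept node starting at 0.4 that is never matched again shrinks slowly, and the floor keeps it from falling below 0.05.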

Context Injection

BEFORE_PROMPT extracts keywords → searches graph nodes → filters by minWeight → boosts matched nodes → fetches connected nodes → injects formatted context.

Edge cases: empty keywords → no injection. Connected nodes also respect minWeight. Metadata from BEFORE_PROMPT (matchedNodeIds, promptKeywords) flows to AFTER_RESPONSE.
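The flow above can be sketched against a small in-memory graph. Everything here is hypothetical: the `Map`-based graph, the node shape, the `buildContext` name, the default `minWeight` of 0.3, the three-letter keyword cutoff, and the output format are assumptions, not the plugin's API.

```javascript
// Hypothetical graph: Map<id, { id, label, weight, links: string[] }>.
function extractKeywords(prompt) {
  const tokens = prompt.toLowerCase().split(/[^\p{L}]+/u);
  return [...new Set(tokens.filter((t) => t.length > 2))]; // length cutoff is assumed
}

function buildContext(graph, prompt, minWeight = 0.3) {
  const promptKeywords = extractKeywords(prompt);
  if (promptKeywords.length === 0) return null; // empty keywords -> no injection

  // Search nodes by keyword and filter by minWeight.
  const matched = [...graph.values()].filter(
    (n) => n.weight >= minWeight && promptKeywords.includes(n.label)
  );
  // Boost matched nodes by +0.05 (the cap at 1 is an assumption).
  for (const n of matched) n.weight = Math.min(1, n.weight + 0.05);

  // Fetch connected nodes; they must also pass minWeight.
  const matchedIds = new Set(matched.map((n) => n.id));
  const connected = matched
    .flatMap((n) => n.links.map((id) => graph.get(id)))
    .filter((n) => n && n.weight >= minWeight && !matchedIds.has(n.id));

  const lines = [...matched, ...new Set(connected)].map(
    (n) => `- ${n.label} (${n.weight.toFixed(2)})`
  );
  return {
    context: lines.join("\n"),
    // Metadata handed on to the AFTER_RESPONSE handler.
    metadata: { matchedNodeIds: [...matchedIds], promptKeywords },
  };
}
```

Returning the metadata alongside the context string mirrors the hand-off to AFTER_RESPONSE described above.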

⚖️ Layer 2: SELECTION (The Gatekeeper)

Plugins: plugin-data-curator (quality gate) + plugin-spm-curator (surprise gate)

The Selection Layer has two gates that run in sequence:

1. Data Curator (plugin-data-curator): quality gate, runs first
   - Rejects system noise, empty responses, and off-topic content via heuristics
   - Scores remaining interactions 1-10 via LLM Judge (local Ollama model)
   - Only quality-passing interactions reach the surprise gate
2. SPM Curator (plugin-spm-curator): surprise gate, runs second
   - Composite surprise score (TF-IDF or EMA-Gaussian)
   - Promotes surprising interactions to Deep Memory
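The ordering of the two gates can be sketched as a single function. This is a hypothetical orchestration: `runSelection`, the injected `judge` and `surprise` callbacks, the `minQuality` cutoff of 6, and the default threshold of 0.5 are assumptions, and only the empty-response heuristic is shown.

```javascript
// Hypothetical sketch. The judge and surprise scorers are injected so the
// example stays independent of Ollama and of the real plugin APIs.
async function runSelection(interaction, { judge, surprise, minQuality = 6, threshold = 0.5 }) {
  // Gate 1a: cheap heuristics reject obvious noise before any LLM call.
  if (!interaction.response || interaction.response.trim() === "") {
    return { promoted: false, reason: "empty-response" };
  }
  // Gate 1b: the LLM judge scores the interaction from 1 to 10.
  const quality = await judge(interaction);
  if (quality < minQuality) {
    return { promoted: false, reason: "low-quality", quality };
  }
  // Gate 2: only quality-passing interactions are scored for surprise.
  const score = await surprise(interaction);
  return score >= threshold
    ? { promoted: true, quality, score }
    : { promoted: false, reason: "not-surprising", quality, score };
}
```

Running the quality gate first means the surprise scorer never sees noise, which would otherwise look novel simply because it is unusual.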

Dual-Mode Architecture

- TF-IDF mode (default, useTfidf: true): wider score spread, +93% better discrimination
  - Builds vocabulary from production memories at daemon startup
  - initTfidfFromTexts(texts) → finalizeTfidf() lifecycle
  - Falls back to EMA-Gaussian before vocabulary is locked
- EMA-Gaussian mode (fallback): running mean/variance per feature
  - Uses 6 scalar features: promptLen, responseLen, totalLen, lexicalDiversity, hourOfDay, dayOfWeek
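To make the EMA-Gaussian idea concrete, the sketch below tracks an exponentially weighted mean and variance per feature and turns the deviation of a new value into a score. The class and function names, the smoothing factor of 0.05, the definition of lexicalDiversity as unique tokens over total tokens, and the averaging of z-scores into one value are all assumptions rather than the plugin's actual code.

```javascript
// Hypothetical EMA-Gaussian tracker for one scalar feature.
class EmaGaussian {
  constructor(alpha = 0.05) {
    this.alpha = alpha; // smoothing factor (assumed value)
    this.mean = 0;
    this.variance = 1;
    this.count = 0;
  }

  // Absolute z-score under the current estimate, clamped to [0, 5].
  zScore(x) {
    if (this.count === 0) return 0; // nothing learned yet
    const std = Math.sqrt(this.variance) || 1e-9;
    return Math.min(5, Math.abs(x - this.mean) / std);
  }

  // Exponentially weighted update of mean and variance.
  update(x) {
    if (this.count === 0) {
      this.mean = x;
    } else {
      const delta = x - this.mean;
      this.mean += this.alpha * delta;
      this.variance = (1 - this.alpha) * (this.variance + this.alpha * delta * delta);
    }
    this.count += 1;
  }
}

// The six scalar features named above (lexicalDiversity definition is assumed).
function extractFeatures(prompt, response, date = new Date()) {
  const tokens = `${prompt} ${response}`.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
  return {
    promptLen: prompt.length,
    responseLen: response.length,
    totalLen: prompt.length + response.length,
    lexicalDiversity: tokens.length ? new Set(tokens).size / tokens.length : 0,
    hourOfDay: date.getHours(),
    dayOfWeek: date.getDay(),
  };
}

// Score first, then learn: average of clamped z-scores, scaled to [0, 1].
function scalarScore(trackers, features) {
  const names = Object.keys(features);
  let sum = 0;
  for (const name of names) {
    const tracker = trackers[name] ?? (trackers[name] = new EmaGaussian());
    sum += tracker.zScore(features[name]) / 5;
    tracker.update(features[name]);
  }
  return sum / names.length;
}
```

Scoring before updating keeps an interaction from lowering its own surprise value.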

Composite Score Formula

composite = scalarWeight × scalarScore + embeddingWeight × embScore + noveltyWeight × noveltyScore

All sub-scores are normalized to [0, 1]. Default weights: scalar=0.35, embedding=0.40, novelty=0.25.
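A minimal sketch of the formula with the default weights follows. The function and parameter names are illustrative, and the defensive clamping of each sub-score is an assumption.

```javascript
// Default weights from the text above; they sum to 1.
const DEFAULT_WEIGHTS = { scalar: 0.35, embedding: 0.40, novelty: 0.25 };

const clamp01 = (x) => Math.min(1, Math.max(0, x));

// Weighted sum of the three sub-scores (clamping is a defensive assumption).
function compositeScore({ scalarScore, embScore, noveltyScore }, weights = DEFAULT_WEIGHTS) {
  return (
    weights.scalar * clamp01(scalarScore) +
    weights.embedding * clamp01(embScore) +
    weights.novelty * clamp01(noveltyScore)
  );
}
```

For example, sub-scores of 0.2 (scalar), 0.8 (embedding), and 0.6 (novelty) give 0.35 × 0.2 + 0.40 × 0.8 + 0.25 × 0.6 = 0.07 + 0.32 + 0.15, or about 0.54. Because the weights sum to 1, the composite also stays within [0, 1].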

Quality Features

- Duplicate detection: djb2 hash of first 200 chars; Set of 5,000 recent hashes; duplicates get score 0
- N-gram novelty: character n-grams (default n=4) compared against a 50,000-entry FIFO cache
- Z-score clamping: capped at [0, 5] to prevent outlier dominance
- TF-IDF seed: finalizeTfidf(seedTexts) primes the centroid to avoid all scores being 0.5 at startup
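Duplicate detection and n-gram novelty can be sketched with a bounded set that evicts its oldest entry. The `BoundedSet` helper, the function names, FIFO eviction for the hash set, and the definition of novelty as the fraction of unseen n-grams are assumptions; the sizes and the djb2 hash come from the list above.

```javascript
// djb2 over the first 200 characters, kept as an unsigned 32-bit integer.
function djb2(text) {
  let hash = 5381;
  const limit = Math.min(text.length, 200);
  for (let i = 0; i < limit; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) >>> 0; // hash * 33 + c
  }
  return hash;
}

// Hypothetical helper: a Set that evicts its oldest entry once full.
class BoundedSet {
  constructor(limit) {
    this.limit = limit;
    this.items = new Set(); // JS Sets iterate in insertion order
  }
  has(x) {
    return this.items.has(x);
  }
  add(x) {
    if (this.items.has(x)) return;
    this.items.add(x);
    if (this.items.size > this.limit) {
      this.items.delete(this.items.values().next().value);
    }
  }
}

const seenHashes = new BoundedSet(5000);
const seenNgrams = new BoundedSet(50000);

// True when the same leading 200 characters were seen recently.
function isDuplicate(text) {
  const h = djb2(text);
  if (seenHashes.has(h)) return true;
  seenHashes.add(h);
  return false;
}

// Fraction of distinct character n-grams not yet in the cache (assumed definition).
function ngramNovelty(text, n = 4) {
  const grams = new Set();
  for (let i = 0; i + n <= text.length; i++) grams.add(text.slice(i, i + n));
  if (grams.size === 0) return 0;
  let unseen = 0;
  for (const g of grams) {
    if (!seenNgrams.has(g)) unseen++;
    seenNgrams.add(g);
  }
  return unseen / grams.size;
}
```

Hashing only the first 200 characters keeps the check cheap, at the cost of treating two long texts with an identical opening as duplicates.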

Runtime Introspection

// Access via hook
hooks.callHook("spm-curator:getInstance", (instance) => {
  instance.setThreshold(0.5);   // Dynamic threshold
  instance.getStats();          // gaussians, centroidDim, promoteRate, etc.
});

🌌 Layer 3: DEEP (Long-Term)

Plugins: trainer-local-mlx + plugin-auto-wiki

Two complementary outputs from consolidated memories:

| Plugin | What it does | Output |
| --- | --- | --- |
| `trainer-local-mlx` | LoRA fine-tunes a base model on your patterns | `adapter.safetensors` (~2-5 MB) |
| `plugin-auto-wiki` | Generates an interlinked Markdown knowledge graph | `~/wiki/` (Karpathy-style wiki) |

Important: LoRA adapters are model-bound. An adapter trained on mlx-community/Llama-3.2-1B-Instruct-4bit works only with that exact base model; it is not a standalone model.

How they work together

After overnight consolidation (2 AM cron), two things happen in parallel:

- MLX LoRA: Promoted DEEP memories → training data → adapter.safetensors. Reloading this adapter in Ollama/LM Studio makes the model consistently follow your preferences.
- Auto-Wiki: Promoted DEEP memories → extracted concepts → interlinked .md pages with entity nodes, cross-references, and a searchable index.

Promotion Flow

Layer Configuration

All three layers are pluggable — swap built-in plugins with custom implementations.

See Plugin Contracts for the interfaces each layer plugin must implement.
