Memory Layers
How INSTANT, SELECTION, and DEEP layers work — internals and configuration
the-brain organizes memory across three layers, each processing information at different timescales.
⚡ Layer 1: INSTANT (Working Memory)
Plugin: plugin-graph-memory
Intercepts prompts and injects real-time biases before they reach the LLM.
Detection Pipeline
The AFTER_RESPONSE handler runs a 6-stage pipeline using language-agnostic structural heuristics (no English regex):
- Correction detection: structural heuristics with dynamic confidence weights
  - Short prompt + long explanatory response (ratio > 2.5:1) → likely correction
  - Very short prompt (< 50 chars) + substantial response (> 100 chars)
  - High token novelty (response introduces new vocabulary not in prompt)
- Preference detection: cross-interaction cluster tracking
  - Short declarative statements (15-150 chars) → preference candidates
  - Token overlap with previously detected preferences → reinforced
  - Repeated vocabulary across 2+ interactions → emerging preference
- Pattern detection: keywords appearing ≥3 times in recent interactions
- Concept node creation: new keywords not matching any existing node
- Node interconnection: bidirectional links between all new nodes
- Weight decay: periodic decay for >24h unmatched nodes
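The correction heuristics in the first stage can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the function names, the 0.6 novelty cutoff, and the linear mapping from the number of matching heuristics to a confidence value are all assumptions. Only the 2.5:1 ratio, the 50/100 character limits, and the 0.5-0.85 weight range come from this page.

```typescript
// Hypothetical sketch of structural correction detection.
interface CorrectionSignal {
  isCorrection: boolean;
  confidence: number; // 0 when nothing fires, otherwise within 0.5..0.85
}

// Unicode-aware tokenizer: split on anything that is not a letter or digit.
function tokenizeWords(text: string): string[] {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter((t) => t.length > 0);
}

// Fraction of response tokens that never occur in the prompt.
function tokenNovelty(prompt: string, response: string): number {
  const promptTokens = new Set(tokenizeWords(prompt));
  const responseTokens = tokenizeWords(response);
  if (responseTokens.length === 0) return 0;
  const novel = responseTokens.filter((t) => !promptTokens.has(t)).length;
  return novel / responseTokens.length;
}

function detectCorrection(prompt: string, response: string): CorrectionSignal {
  const p = prompt.trim().length;
  const r = response.trim().length;
  let hits = 0;
  if (p > 0 && r / p > 2.5) hits++; // short prompt, long explanation
  if (p > 0 && p < 50 && r > 100) hits++; // very short prompt, substantial response
  if (tokenNovelty(prompt, response) > 0.6) hits++; // assumed novelty cutoff
  if (hits === 0) return { isCorrection: false, confidence: 0 };
  // Assumed mapping: 1 hit -> 0.5, 2 hits -> 0.675, 3 hits -> 0.85.
  const confidence = 0.5 + ((hits - 1) / 2) * 0.35;
  return { isCorrection: true, confidence };
}
```

Because every check works on lengths and token sets rather than on specific words, the same logic applies to any script the tokenizer can split.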
Quality Filters
- Unicode-aware tokenizer — splits on non-letter boundaries, works for all scripts (Latin, Cyrillic, CJK, Arabic, etc.) — no English stop words needed
- Weight decay: every ~10 interactions, nodes >24h old lose 2% weight (floor: 0.05)
- Weight boost: matched nodes gain +0.05 per use
- Correction weights: 0.5-0.85 based on structural heuristic confidence
- Preference weight: always 0.7
- Concept weight: always 0.4
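The decay and boost rules above amount to a small amount of arithmetic per node. The sketch below shows one way to express them; the `GraphNode` shape, the function names, and the upper cap of 1.0 are assumptions made for illustration, while the 2% decay, the 0.05 floor, the +0.05 boost, and the 24h cutoff are the values listed above.

```typescript
// Hypothetical node shape; the plugin's real schema may differ.
interface GraphNode {
  id: string;
  weight: number;
  lastMatchedAt: number; // epoch milliseconds
}

const DECAY_FACTOR = 0.98; // lose 2% per decay pass
const WEIGHT_FLOOR = 0.05;
const BOOST_STEP = 0.05;
const MAX_WEIGHT = 1.0; // assumed cap, not stated in the docs
const STALE_MS = 24 * 60 * 60 * 1000;

// Intended to be called roughly every 10 interactions.
function decayStaleNodes(nodes: GraphNode[], now: number): void {
  for (const node of nodes) {
    if (now - node.lastMatchedAt > STALE_MS) {
      node.weight = Math.max(WEIGHT_FLOOR, node.weight * DECAY_FACTOR);
    }
  }
}

// Called when a node matches the current prompt.
function boostNode(node: GraphNode, now: number): void {
  node.weight = Math.min(MAX_WEIGHT, node.weight + BOOST_STEP);
  node.lastMatchedAt = now;
}
```

With these numbers a concept node starting at 0.4 that is never matched again needs roughly a hundred decay passes to reach the floor, so stale nodes fade slowly rather than vanish.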
Context Injection
BEFORE_PROMPT extracts keywords → searches graph nodes → filters by minWeight →
boosts matched nodes → fetches connected nodes → injects formatted context.
Edge cases: empty keywords → no injection. Connected nodes also respect minWeight.
Metadata from BEFORE_PROMPT (matchedNodeIds, promptKeywords) flows to AFTER_RESPONSE.
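The injection flow described above can be sketched as a single function. Everything here is illustrative: `MemoryNode`, `buildInjection`, the keyword rule (tokens longer than two characters), and the output format are assumptions, not the plugin's API. The sketch does follow the documented order of steps and both edge cases.

```typescript
// Hypothetical shapes for illustration only.
interface MemoryNode {
  id: string;
  label: string;
  weight: number;
  links: string[]; // ids of connected nodes
}

interface InjectionResult {
  context: string | null; // null means nothing is injected
  matchedNodeIds: string[]; // metadata handed on to AFTER_RESPONSE
  promptKeywords: string[];
}

function keywordsOf(text: string): string[] {
  const tokens = text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter((t) => t.length > 2);
  return Array.from(new Set(tokens));
}

function buildInjection(prompt: string, graph: Map<string, MemoryNode>, minWeight: number): InjectionResult {
  const promptKeywords = keywordsOf(prompt);
  const empty: InjectionResult = { context: null, matchedNodeIds: [], promptKeywords };
  if (promptKeywords.length === 0) return empty; // edge case: no keywords, no injection

  // Search nodes by keyword and filter by minWeight.
  const wanted = new Set(promptKeywords);
  const matched = Array.from(graph.values()).filter(
    (n) => n.weight >= minWeight && keywordsOf(n.label).some((k) => wanted.has(k))
  );
  if (matched.length === 0) return empty;

  // Boost matched nodes (+0.05 per use, capped at an assumed 1.0).
  for (const n of matched) n.weight = Math.min(1, n.weight + 0.05);

  // Fetch connected nodes; they must also respect minWeight.
  const seen = new Set(matched.map((n) => n.id));
  const connected: MemoryNode[] = [];
  for (const n of matched) {
    for (const id of n.links) {
      const linked = graph.get(id);
      if (linked !== undefined && !seen.has(id) && linked.weight >= minWeight) {
        seen.add(id);
        connected.push(linked);
      }
    }
  }

  const lines = matched.concat(connected).map((n) => `- ${n.label} (weight ${n.weight.toFixed(2)})`);
  return {
    context: "Relevant memory:\n" + lines.join("\n"),
    matchedNodeIds: matched.map((n) => n.id),
    promptKeywords,
  };
}
```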
⚖️ Layer 2: SELECTION (The Gatekeeper)
Plugin: plugin-spm-curator (surprise gate) + plugin-data-curator (quality gate)
The Selection Layer has two gates that run in sequence:
- Data Curator (plugin-data-curator): quality gate, runs first
  - Rejects system noise, empty responses, off-topic content via heuristics
  - Scores remaining interactions 1-10 via LLM Judge (local Ollama model)
  - Only quality-passing interactions reach the surprise gate
- SPM Curator (plugin-spm-curator): surprise gate, runs second
  - Composite surprise score (TF-IDF or EMA-Gaussian)
  - Promotes surprising interactions to Deep Memory
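The ordering of the two gates can be illustrated with a small sketch. The gate functions are passed in as plain callbacks so the example stays self-contained; in practice the judge would be an asynchronous call to a local model. The `SelectionGates` interface, the verdict names, and both default thresholds (quality 6, surprise 0.5) are assumptions for illustration.

```typescript
interface Interaction {
  prompt: string;
  response: string;
}

type Verdict = "rejected_noise" | "rejected_quality" | "not_surprising" | "promoted";

// Hypothetical stand-ins for the two plugins' internals.
interface SelectionGates {
  isNoise: (i: Interaction) => boolean; // heuristic pre-filter
  judgeQuality: (i: Interaction) => number; // 1..10, stand-in for the LLM judge
  surpriseScore: (i: Interaction) => number; // composite score in [0, 1]
}

function runSelection(
  i: Interaction,
  gates: SelectionGates,
  minQuality: number = 6, // assumed cutoff
  surpriseThreshold: number = 0.5 // assumed cutoff
): Verdict {
  // Quality gate first: cheap heuristics, then the judge.
  if (gates.isNoise(i)) return "rejected_noise";
  if (gates.judgeQuality(i) < minQuality) return "rejected_quality";
  // Surprise gate second: only reached by quality-passing interactions.
  return gates.surpriseScore(i) >= surpriseThreshold ? "promoted" : "not_surprising";
}
```

Running the cheaper checks first means the surprise score is never computed for interactions that would be discarded anyway.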
Dual-Mode Architecture
- TF-IDF mode (default, useTfidf: true): Wider spread, +93% better discrimination
  - Builds vocabulary from production memories at daemon startup
  - initTfidfFromTexts(texts) → finalizeTfidf() lifecycle
  - Falls back to EMA-Gaussian before vocabulary is locked
- EMA-Gaussian mode (fallback): Running mean/variance per feature
  - Uses 6 scalar features: promptLen, responseLen, totalLen, lexicalDiversity, hourOfDay, dayOfWeek
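The EMA-Gaussian fallback can be sketched as one exponentially weighted mean/variance estimator per feature, with surprise measured as a clamped z-score. This is an illustrative sketch only: the class names, the smoothing factor `alpha = 0.05`, the initial variance of 1, and averaging the six z-scores into one number are assumptions. The six feature names and the [0, 5] clamp come from this page.

```typescript
const FEATURES = ["promptLen", "responseLen", "totalLen", "lexicalDiversity", "hourOfDay", "dayOfWeek"] as const;
type FeatureName = (typeof FEATURES)[number];
type FeatureVector = Record<FeatureName, number>;

// Exponentially weighted running mean and variance for one feature.
class EmaGaussian {
  private mean = 0;
  private variance = 1;
  private initialized = false;
  private alpha: number;

  constructor(alpha: number = 0.05) {
    this.alpha = alpha;
  }

  // Absolute z-score under the current estimate, clamped to [0, 5].
  zScore(x: number): number {
    if (!this.initialized) return 0;
    const std = Math.sqrt(Math.max(this.variance, 1e-9));
    return Math.min(5, Math.abs(x - this.mean) / std);
  }

  update(x: number): void {
    if (!this.initialized) {
      this.mean = x;
      this.initialized = true;
      return;
    }
    const delta = x - this.mean;
    this.mean += this.alpha * delta;
    this.variance = (1 - this.alpha) * (this.variance + this.alpha * delta * delta);
  }
}

class ScalarSurprise {
  private models = new Map<FeatureName, EmaGaussian>();

  // Scores the vector against past observations, then learns from it.
  // Returns a value in [0, 1]: mean clamped z-score divided by 5.
  score(v: FeatureVector): number {
    let total = 0;
    for (const f of FEATURES) {
      let model = this.models.get(f);
      if (model === undefined) {
        model = new EmaGaussian();
        this.models.set(f, model);
      }
      total += model.zScore(v[f]);
      model.update(v[f]);
    }
    return total / FEATURES.length / 5;
  }
}
```

Scoring before updating matters: if the estimator learned from an interaction first, that interaction would always look less surprising than it was.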
Composite Score Formula
composite = scalarWeight × scalarScore + embeddingWeight × embScore + noveltyWeight × noveltyScore

All sub-scores are normalized to [0, 1]. Default weights: scalar=0.35, embedding=0.40, novelty=0.25.
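As a worked example of the formula, the sketch below combines three sub-scores with the default weights. The function and type names are hypothetical, and clamping each input to [0, 1] is a defensive assumption rather than documented behavior.

```typescript
interface CompositeWeights {
  scalar: number;
  embedding: number;
  novelty: number;
}

// Default weights as documented above.
const DEFAULT_WEIGHTS: CompositeWeights = { scalar: 0.35, embedding: 0.4, novelty: 0.25 };

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

function compositeScore(
  scalarScore: number,
  embScore: number,
  noveltyScore: number,
  w: CompositeWeights = DEFAULT_WEIGHTS
): number {
  return (
    w.scalar * clamp01(scalarScore) +
    w.embedding * clamp01(embScore) +
    w.novelty * clamp01(noveltyScore)
  );
}

// 0.35 * 0.2 + 0.40 * 0.9 + 0.25 * 0.5 = 0.07 + 0.36 + 0.125 = 0.555
const exampleScore = compositeScore(0.2, 0.9, 0.5);
```

Since the default weights sum to 1 and each sub-score lies in [0, 1], the composite also stays in [0, 1], which keeps a single promotion threshold meaningful.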
Quality Features
- Duplicate detection: djb2 hash of first 200 chars; Set of 5,000 recent hashes; duplicates get score 0
- N-gram novelty: character n-grams (default n=4) compared against 50,000-cache FIFO
- Z-score clamping: capped at [0, 5] to prevent outlier dominance
- TF-IDF seed: finalizeTfidf(seedTexts) primes the centroid to avoid all scores being 0.5 at startup
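Duplicate detection and n-gram novelty can both be built on a bounded, insertion-ordered set. The sketch below is one possible shape, not the plugin's code: `BoundedSet`, `DuplicateFilter`, and `NgramNovelty` are hypothetical names, and defining novelty as the fraction of previously unseen n-grams is an assumption. The djb2 hash, the 200-character prefix, the capacities, and n=4 follow the list above.

```typescript
// djb2 string hash (hash * 33 + charCode), kept in unsigned 32-bit range.
function djb2(text: string): number {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// FIFO-bounded set: JavaScript Sets iterate in insertion order,
// so the first value is always the oldest entry.
class BoundedSet<T> {
  private items = new Set<T>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  has(x: T): boolean {
    return this.items.has(x);
  }

  add(x: T): void {
    if (this.items.has(x)) return;
    if (this.items.size >= this.capacity) {
      const oldest = this.items.values().next().value as T;
      this.items.delete(oldest);
    }
    this.items.add(x);
  }
}

class DuplicateFilter {
  private recent = new BoundedSet<number>(5000);

  // True when the first 200 chars hash to a recently seen value.
  isDuplicate(text: string): boolean {
    const h = djb2(text.slice(0, 200));
    if (this.recent.has(h)) return true;
    this.recent.add(h);
    return false;
  }
}

class NgramNovelty {
  private cache = new BoundedSet<string>(50000);

  // Fraction of character n-grams not seen before, in [0, 1].
  score(text: string, n: number = 4): number {
    const grams: string[] = [];
    for (let i = 0; i + n <= text.length; i++) grams.push(text.slice(i, i + n));
    if (grams.length === 0) return 0;
    const unseen = grams.filter((g) => !this.cache.has(g)).length;
    for (const g of grams) this.cache.add(g);
    return unseen / grams.length;
  }
}
```

Hashing only a prefix keeps the check cheap, at the cost of treating two long texts with an identical first 200 characters as duplicates.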
Runtime Introspection
// Access via hook
hooks.callHook("spm-curator:getInstance", (instance) => {
instance.setThreshold(0.5); // Dynamic threshold
instance.getStats(); // gaussians, centroidDim, promoteRate, etc.
});
🌌 Layer 3: DEEP (Long-Term)
Plugins: trainer-local-mlx + plugin-auto-wiki
Two complementary outputs from consolidated memories:
| Plugin | What it does | Output |
|---|---|---|
| trainer-local-mlx | LoRA fine-tunes a base model on your patterns | adapter.safetensors (~2-5 MB) |
| plugin-auto-wiki | Generates interlinked Markdown knowledge graph | ~/wiki/ (Karpathy-style wiki) |
Important: LoRA adapters are model-bound. An adapter trained on mlx-community/Llama-3.2-1B-Instruct-4bit only works with that exact base model. It is not a standalone model.
How they work together
After overnight consolidation (2 AM cron), two things happen in parallel:
- MLX LoRA: Promoted DEEP memories → training data → adapter.safetensors. Reloading this adapter in Ollama/LM Studio makes the model consistently follow your preferences.
- Auto-Wiki: Promoted DEEP memories → extracted concepts → interlinked .md pages with entity nodes, cross-references, and a searchable index.
Promotion Flow
Layer Configuration
All three layers are pluggable — swap built-in plugins with custom implementations.
See Plugin Contracts for the interfaces each layer plugin must implement.