memory-pipeline
Complete agent memory + performance system.
Installation
npx clawhub@latest install memory-pipeline
View the full skill documentation and source below.
Documentation
Memory Pipeline
Give your AI agent a memory that actually works.
AI agents wake up blank every session. Memory Pipeline fixes that — it extracts what matters from past conversations, connects the dots, and generates a daily briefing so your agent starts each session primed instead of clueless.
What It Does
| Component | When it runs | What it does |
| --- | --- | --- |
| Extract | Between sessions | Pulls structured facts (decisions, preferences, learnings) from daily notes and transcripts |
| Link | Between sessions | Builds a knowledge graph — connects related facts, flags contradictions |
| Brief | Between sessions | Generates a compact BRIEFING.md loaded at session start |
| Ingest | On demand | Imports external knowledge (ChatGPT exports, etc.) into searchable memory |
| Performance Hooks | During sessions | Pre-game briefing injection, tool discipline, output compression, after-action review |
Why This Is Different
Most "memory" solutions are just vector search over chat logs. This is a cognitive architecture — inspired by how human memory actually works:
- Extraction over accumulation — Instead of dumping everything into a database, it identifies what's worth remembering: decisions, preferences, learnings, commitments. The rest is noise.
- Knowledge graph, not just embeddings — Facts get linked to each other with bidirectional relationships. Your agent doesn't just find similar text — it understands that a decision about your tech stack relates to a project deadline relates to a preference you stated three weeks ago.
- Briefing over retrieval — Rather than hoping the right context gets retrieved at query time, your agent starts every session with a curated cheat sheet. Active projects, recent decisions, personality reminders. Zero cold-start lag.
- No mid-swing coaching — Borrowed from performance psychology. Corrections happen between sessions, not during. The after-action review feeds into the next briefing. The loop is closed — just not mid-execution.
Quick Start
Install
clawhub install memory-pipeline
Setup
bash skills/memory-pipeline/scripts/setup.sh
The setup script will detect your workspace, check dependencies (Python 3 + any LLM API key), create the memory/ directory, and run the full pipeline.
Requirements
- Python 3
- At least one LLM API key (auto-detected):
- OpenAI (OPENAI_API_KEY or ~/.config/openai/api_key)
- Anthropic (ANTHROPIC_API_KEY or ~/.config/anthropic/api_key)
- Gemini (GEMINI_API_KEY or ~/.config/gemini/api_key)
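For reference, detection checks the environment variable first and then the per-provider key file. A minimal sketch of that lookup order (illustrative only; the real logic lives in the setup script):

```python
import os
from pathlib import Path

# Illustrative only: env var first, then the per-provider config file.
PROVIDERS = {
    "openai": ("OPENAI_API_KEY", "~/.config/openai/api_key"),
    "anthropic": ("ANTHROPIC_API_KEY", "~/.config/anthropic/api_key"),
    "gemini": ("GEMINI_API_KEY", "~/.config/gemini/api_key"),
}

def detect_api_key():
    """Return (provider, key) for the first provider with a usable credential."""
    for provider, (env_var, key_file) in PROVIDERS.items():
        key = os.environ.get(env_var)
        if not key:
            path = Path(key_file).expanduser()
            if path.is_file():
                key = path.read_text().strip()
        if key:
            return provider, key
    raise RuntimeError("No LLM API key found; set one of the variables above.")
```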
Run Manually
# Full pipeline
python3 skills/memory-pipeline/scripts/memory-extract.py
python3 skills/memory-pipeline/scripts/memory-link.py
python3 skills/memory-pipeline/scripts/memory-briefing.py
Automate via Heartbeat
Add to your HEARTBEAT.md for daily automatic runs:
### Daily Memory Pipeline
- **Frequency:** Once per day (morning)
- **Action:** Run the memory pipeline:
1. `python3 skills/memory-pipeline/scripts/memory-extract.py`
2. `python3 skills/memory-pipeline/scripts/memory-link.py`
3. `python3 skills/memory-pipeline/scripts/memory-briefing.py`
Import External Knowledge
Already have years of conversations in ChatGPT? Import them so your agent knows what you know.
ChatGPT Export
# 1. Export from ChatGPT: Settings → Data Controls → Export Data
# 2. Drop the zip in your workspace
# 3. Run:
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip
# Preview first (recommended):
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip --dry-run
What it does:
- Parses ChatGPT's conversation tree format
- Filters out throwaway conversations (configurable: `--min-turns`, `--min-length`)
- Supports topic exclusion (edit `EXCLUDE_PATTERNS` to skip unwanted topics; see the sketch below)
- Outputs clean, dated markdown files to `memory/knowledge/chatgpt/`
- Files are automatically indexed by OpenClaw's semantic search
Options:
- `--dry-run` — Preview without writing files
- `--keep-all` — Skip all filtering
- `--min-turns N` — Minimum user messages to keep (default: 2)
- `--min-length N` — Minimum total characters (default: 200)
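The exclusion mechanism is only named here, not specified. As a rough sketch, assuming `EXCLUDE_PATTERNS` is a list of case-insensitive regular expressions matched against a conversation's title and text:

```python
import re

# Hypothetical illustration: assuming EXCLUDE_PATTERNS holds regexes that are
# matched case-insensitively against a conversation's title and body.
EXCLUDE_PATTERNS = [
    r"\bgrocery list\b",
    r"\brecipe\b",
]

def is_excluded(title: str, text: str) -> bool:
    """Skip conversations whose title or body matches any exclusion pattern."""
    haystack = f"{title}\n{text}"
    return any(re.search(p, haystack, re.IGNORECASE) for p in EXCLUDE_PATTERNS)
```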
Adding Other Sources
The pattern is extensible. Create `ingest-<source>.py`, parse the format, write markdown to `memory/knowledge/<source>/`. The indexer handles the rest.
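A skeleton of what a new ingest script could look like. The source name (`notion`), the field names, and the file-naming scheme below are hypothetical; only the output convention (dated markdown under `memory/knowledge/<source>/`) comes from this doc:

```python
#!/usr/bin/env python3
"""Hypothetical ingest-notion.py: skeleton for a new knowledge source."""
import json
import sys
from datetime import date
from pathlib import Path

OUTPUT_DIR = Path("memory/knowledge/notion")  # one subdirectory per source

def parse_export(export_path: Path) -> list[dict]:
    """Parse the source's native format into {title, date, body} records (fields assumed)."""
    raw = json.loads(export_path.read_text())
    return [{"title": p["title"], "date": p["created"], "body": p["text"]} for p in raw]

def write_markdown(records: list[dict]) -> None:
    """Write one dated markdown file per record; the indexer picks these up."""
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    for rec in records:
        day = rec.get("date") or date.today().isoformat()
        slug = rec["title"][:40].replace(" ", "-")
        (OUTPUT_DIR / f"{day}-{slug}.md").write_text(f"# {rec['title']}\n\n{rec['body']}\n")

if __name__ == "__main__":
    write_markdown(parse_export(Path(sys.argv[1])))
```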
How the Pipeline Works
Stage 1: Extract
Script: memory-extract.py
Reads daily notes (memory/YYYY-MM-DD.md) and session transcripts, then uses an LLM to extract structured facts:
{"type": "decision", "content": "Use Rust for the backend", "subject": "Project Architecture", "confidence": 0.9}
{"type": "preference", "content": "Prefers Google Drive over Notion", "subject": "Tools", "confidence": 0.95}
Output: memory/extracted.jsonl
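A condensed sketch of what this stage does, not the shipped script; the prompt wording and the `llm_complete` callable are placeholders for whichever provider is detected:

```python
# Sketch: read a daily note, ask an LLM for JSON facts (one object per line),
# and append them to memory/extracted.jsonl. Prompt text is illustrative.
import json
from pathlib import Path

EXTRACTION_PROMPT = (
    "Extract decisions, preferences, learnings, and commitments from the note below. "
    "Return one JSON object per line with keys: type, content, subject, confidence.\n\n{note}"
)

def extract_facts(note_path: Path, llm_complete) -> list[dict]:
    """llm_complete is any callable that takes a prompt string and returns text."""
    response = llm_complete(EXTRACTION_PROMPT.format(note=note_path.read_text()))
    facts = []
    for line in response.splitlines():
        line = line.strip()
        if line.startswith("{"):
            facts.append(json.loads(line))
    return facts

def append_facts(facts: list[dict], out: Path = Path("memory/extracted.jsonl")) -> None:
    """Append-only log, matching the extracted.jsonl format shown above."""
    with out.open("a") as fh:
        for fact in facts:
            fh.write(json.dumps(fact) + "\n")
```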
Stage 2: Link
Script: memory-link.py
Takes extracted facts and builds a knowledge graph:
- Generates embeddings for semantic similarity
- Creates bidirectional links between related facts
- Detects contradictions and marks superseded facts
- Auto-generates domain tags
Output: memory/knowledge-graph.json + memory/knowledge-summary.md
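The core linking idea, sketched with a plain cosine similarity and the documented 0.3 default threshold; `embed` stands in for whatever embedding call `memory-link.py` actually makes:

```python
# Sketch of the linking step: embed each fact, then link pairs whose cosine
# similarity clears the threshold (default 0.3 per this doc).
import itertools
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_links(facts: list[dict], embed, threshold: float = 0.3) -> list[dict]:
    """embed() is a placeholder for the script's embedding call."""
    vectors = [embed(f["content"]) for f in facts]
    links = []
    for (i, _), (j, _) in itertools.combinations(enumerate(facts), 2):
        score = cosine(vectors[i], vectors[j])
        if score >= threshold:
            # Bidirectional link: stored once, treated as symmetric on traversal.
            links.append({"from": i, "to": j, "similarity": round(score, 3)})
    return links
```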
Stage 3: Briefing
Script: memory-briefing.py
Generates a compact daily briefing (< 2000 chars) combining:
- Personality traits (from `SOUL.md`)
- User context (from `USER.md`)
- Active projects and recent decisions
- Open todos
Output: BRIEFING.md (workspace root)
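A stripped-down sketch of the assembly step, assuming the sources listed above are concatenated and truncated to the 2000-character budget; the section titles and truncation strategy are illustrative, not the shipped script:

```python
# Sketch: stitch the briefing sources into one compact BRIEFING.md
# and keep it under the ~2000-character budget.
from pathlib import Path

MAX_CHARS = 2000

def read_if_exists(path: str) -> str:
    p = Path(path)
    return p.read_text() if p.exists() else ""

def build_briefing(sections: dict[str, str]) -> str:
    parts = [f"## {title}\n{body.strip()}" for title, body in sections.items() if body.strip()]
    briefing = "# Daily Briefing\n\n" + "\n\n".join(parts)
    return briefing[:MAX_CHARS]  # hard cap so the briefing stays cheap to load

if __name__ == "__main__":
    sections = {
        "Personality": read_if_exists("SOUL.md"),
        "User": read_if_exists("USER.md"),
        "Knowledge": read_if_exists("memory/knowledge-summary.md"),
    }
    Path("BRIEFING.md").write_text(build_briefing(sections))
```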
Performance Hooks (Optional)
Four lifecycle hooks that enforce execution discipline during sessions. Based on a principle from performance psychology: separate preparation from execution.
User Message → Agent Loop
├── before_agent_start → Briefing packet (memory + checklist)
├── before_tool_call → Policy enforcement (deny list)
├── tool_result_persist → Output compression (prevent context bloat)
└── agent_end → After-action review (durable notes)
Configuration
{
"enabled": true,
"briefing": {
"maxChars": 6000,
"checklist": [
"Restate the task in one sentence.",
"List constraints and success criteria.",
"Retrieve only the minimum relevant memory.",
"Prefer tools over guessing when facts matter."
],
"memoryFiles": ["memory/IDENTITY.md", "memory/PROJECTS.md"]
},
"tools": {
"deny": ["dangerous_tool"],
"maxToolResultChars": 12000
},
"afterAction": {
"writeMemoryFile": "memory/AFTER_ACTION.md",
"maxBullets": 8
}
}
Hook Details
| Hook | What it does |
| --- | --- |
| before_agent_start | Loads memory files, builds bounded briefing packet, injects into system prompt |
| before_tool_call | Checks tool against deny list, prevents unsafe calls |
| tool_result_persist | Head (60%) + tail (30%) compression of large results |
| agent_end | Appends session summary to memory file with tools used and outcomes |
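The compression rule for `tool_result_persist` can be pictured like this (Python used for illustration; the hook itself runs inside the agent runtime, and the omission marker is an assumption):

```python
# Sketch of head + tail compression: keep the first 60% and last 30% of the
# budget and drop the middle. 12000 mirrors maxToolResultChars above.
def compress_tool_result(text: str, max_chars: int = 12000) -> str:
    if len(text) <= max_chars:
        return text
    head = int(max_chars * 0.6)
    tail = int(max_chars * 0.3)
    omitted = len(text) - head - tail
    return f"{text[:head]}\n[... {omitted} chars omitted ...]\n{text[-tail:]}"
```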
Output Files
| File | Location | Purpose |
| --- | --- | --- |
| BRIEFING.md | Workspace root | Daily context cheat sheet |
| extracted.jsonl | memory/ | All extracted facts (append-only) |
| knowledge-graph.json | memory/ | Full graph with embeddings and links |
| knowledge-summary.md | memory/ | Human-readable graph summary |
| knowledge/chatgpt/*.md | memory/ | Ingested ChatGPT conversations |
Customization
- Change LLM models — Edit model names in each script (supports OpenAI, Anthropic, Gemini)
- Adjust extraction — Modify the extraction prompt in `memory-extract.py` to focus on different fact types
- Tune link sensitivity — Change the similarity threshold in `memory-link.py` (default: 0.3)
- Filter ingestion — Edit `EXCLUDE_PATTERNS` in `ingest-chatgpt.py` for topic exclusion
Troubleshooting
| Problem | Fix |
| --- | --- |
| No facts extracted | Check that daily notes or transcripts exist; verify API key |
| Low-quality links | Add OpenAI key for embedding-based similarity; adjust threshold |
| Briefing too long | Reduce facts in template or let LLM generation handle it (auto-constrained to 2000 chars) |
See Also
- Setup Guide — Detailed installation and configuration