Self-Growing Tag-Dictionary RAG for Any Device
Give your AI long-term memory that grows smarter over time.
No GPU. No vector database. No external dependencies.
Just pip install and go.
from sandclaw_memory import BrainMemory brain = BrainMemory(tag_extractor=my_ai_func) brain.save("User loves Python and React") context = brain.recall("what does the user like?") # -> Returns relevant memories as Markdown, ready for LLM injection
| Feature | sandclaw-memory | Vector DB (Pinecone, Weaviate) | mem0 |
|---|---|---|---|
| Setup | pip install (done) |
Docker + API keys + config | pip install + API key |
| GPU required | No | Often yes | No |
| Cost over time | Decreases (self-growing) | Constant | Constant |
| Search | FTS5 + BM25 (proven) | Cosine similarity | Embedding-based |
| Privacy | 100% local | Cloud required | Cloud optional |
| Dependencies | 0 (stdlib only) | Many | Several |
| Size | ~50KB | 100MB+ Docker | ~5MB |
Why tags instead of vectors? Vector embeddings are powerful but opaque. When a wrong result appears, you can't explain why. sandclaw-memory uses AI-powered tag extraction, so the AI understands synonyms and context (e.g., "vehicle" and "car" get the same tag), while every result is traceable to human-readable tags. Same semantic understanding, zero infrastructure, and you can debug it.
Traditional RAG calls AI for every search. sandclaw-memory learns:
Day 1: "Built a page with React" → AI extracts: [react, frontend, page]
keyword_map registers: "React" → react, "page" → page
Day 30: "React Native app" → keyword_map instant match! No AI needed.
Cost: 0ドル.00
Day 90: 80% of saves match keyword_map → AI calls reduced by 80%
The more you use it, the cheaper it gets.
pip install sandclaw-memory
from sandclaw_memory import BrainMemory with BrainMemory(tag_extractor=my_tag_func) as brain: brain.save("Important: migrate to FastAPI by Q2") print(brain.recall("what's the migration plan?"))
import json, openai from sandclaw_memory import BrainMemory def tag_extractor(content: str) -> list[str]: resp = openai.chat.completions.create( model="gpt-4o-mini", messages=[{ "role": "system", "content": "Extract 3-7 keyword tags. Return JSON array only." }, {"role": "user", "content": content}], temperature=0, ) return json.loads(resp.choices[0].message.content) with BrainMemory( db_path="./my_memory", tag_extractor=tag_extractor, ) as brain: brain.start_polling() # Background tag extraction every 15s brain.save("User prefers Python and TypeScript") brain.save("Decided to use PostgreSQL", source="archive") context = brain.recall("what tech stack?") # Inject 'context' into your LLM's system prompt
import json, anthropic from sandclaw_memory import BrainMemory client = anthropic.Anthropic() def tag_extractor(content: str) -> list[str]: resp = client.messages.create( model="claude-haiku-4-5-20251001", max_tokens=200, messages=[{"role": "user", "content": f"Extract 3-7 tags as JSON array:\n{content}"}], ) return json.loads(resp.content[0].text) brain = BrainMemory(tag_extractor=tag_extractor)
See
examples/for LangChain, custom callbacks, and scheduling patterns.
┌─────────────────────────────────────────────────┐
│ BrainMemory │
│ save() → recall() → start_polling() │
├─────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌──────────┐ ┌────────────────┐ │
│ │ L1 │ │ L2 │ │ L3 │ │
│ │ Session │ │ Summary │ │ Archive │ │
│ │ │ │ │ │ │ │
│ │ 3-day │ │ 30-day │ │ Permanent │ │
│ │ rolling │ │ AI/text │ │ SQLite + FTS5 │ │
│ │ Markdown│ │ summary │ │ + self-growing │ │
│ │ logs │ │ │ │ tag dict │ │
│ └────┬────┘ └────┬─────┘ └───────┬────────┘ │
│ │ │ │ │
│ ┌────┴────────────┴────────────────┴────────┐ │
│ │ Intent Dispatcher │ │
│ │ CASUAL → L1 │ STANDARD → L1+L2 │ │
│ │ │ DEEP → L1+L2+L3 │ │
│ └───────────────┴───────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ Self-Growing Tag Pipeline │ │
│ │ Stage 1: keyword_map (instant, free) │ │
│ │ Stage 2: AI callback (async, queue) │ │
│ │ → keyword_map grows → Stage 2 shrinks │ │
│ └──────────────────────────────────────────┘ │
└─────────────────────────────────────────────────┘
| Layer | What | Retention | Storage | Speed |
|---|---|---|---|---|
| L1 Session | Conversation logs | 3 days (rolling) | Markdown files | Instant |
| L2 Summary | Period summaries | 30 days | In-memory | Instant |
| L3 Archive | Important memories | Forever | SQLite + FTS5 | ~1ms |
The dispatcher auto-detects how deep to search:
brain.recall("what time is it?") # → CASUAL (L1 only) brain.recall("summarize this month") # → STANDARD (L1 + L2) brain.recall("why did we pick React?") # → DEEP (L1 + L2 + L3) # Or override manually: brain.recall("anything", depth="deep")
BrainMemory( db_path="./memory", # Where to store files tag_extractor=my_func, # REQUIRED: (str) -> list[str] promote_checker=None, # Optional: (str) -> bool depth_detector=None, # Optional: (str) -> str duplicate_checker=None, # Optional: (str, str) -> bool conflict_resolver=None, # Optional: (str, str) -> str polling_interval=15, # Seconds between maintenance cycles rolling_days=3, # L1 retention period summary_days=30, # L2 summary period max_context_chars=15_000, # Context budget for LLM injection encryption_key=None, # Optional SQLCipher encryption )
| Method | Description |
|---|---|
save(content, source="chat", tags=None) |
Save to memory. source="archive" forces L3. |
recall(query, depth=None) |
Recall relevant memories as Markdown. |
search(query, limit=20) |
Search L3 archive, returns list[MemoryEntry]. |
promote(content, tags=None) |
Manually save to L3. Returns memory ID. |
summarize(llm_callback=None) |
Generate a 30-day summary. |
start_polling() |
Start background maintenance loop. |
stop_polling() |
Stop background maintenance loop. |
run_maintenance() |
Run one maintenance cycle manually. |
get_stats() |
Get stats from all layers. |
get_tag_stats() |
Get tag usage counts. |
export_json(path=None) |
Export all data as JSON. |
on(event, callback) |
Register an event hook. |
close() |
Stop polling and close DB. |
| Callback | Signature | Default |
|---|---|---|
tag_extractor |
(str) -> list[str] |
REQUIRED |
promote_checker |
(str) -> bool |
len(content) > 200 |
depth_detector |
(str) -> str |
Keyword matching |
duplicate_checker |
(str, str) -> bool |
SequenceMatcher > 0.85 |
conflict_resolver |
(str, str) -> str |
Keep newer text |
brain.on("before_save", lambda content, source: ...) brain.on("after_save", lambda content, source: ...) brain.on("before_promote", lambda content: ...) brain.on("after_promote", lambda content, mem_id: ...) brain.on("after_recall", lambda query, depth, result: ...) brain.on("after_cycle", lambda stats: ...)
| File | Description |
|---|---|
basic_usage.py |
Simplest setup with OpenAI |
with_openai.py |
All 5 callbacks with OpenAI |
with_anthropic.py |
Claude API integration |
with_langchain.py |
LangChain agent + memory |
custom_callbacks.py |
Custom callbacks + hooks |
scheduling.py |
Polling, FastAPI, Celery |
- Python: 3.9, 3.10, 3.11, 3.12, 3.13
- OS: Windows, macOS, Linux
- Dependencies: None (Python stdlib only)
- SQLite: Uses built-in
sqlite3module (FTS5 optional, graceful fallback)
See CONTRIBUTING.md for guidelines.
MIT License. See LICENSE for details.
This library was born from the multi-layer RAG engine built for SandClaw — an AI-powered trading IDE. We extracted and open-sourced the memory system so any developer can use it.