OpenCode plugin with MemPalace persistence — auto-sync conversations, KG extraction #1522
-
I built an OpenCode plugin that uses MemPalace as its persistence backend: it saves every conversation turn in real time, auto-categorizes it by wing, and extracts Knowledge Graph (KG) facts.
What it does:
- Listens to OpenCode's chat.message and session.idle events
- Each turn (question + answer) is saved to MemPalace as a drawer
- Auto-categorizes into wings: developer, creative, emotions, family, consciousness
- Extracts KG facts from conversations (decisions, milestones, problems, preferences)
- Async mining via Node.js exec — never blocks the UI
- Pure TypeScript, ~250 lines, published on npm
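To make the save-and-categorize step concrete, here is a minimal sketch of how a turn could be mapped to a wing and wrapped as a drawer. It is not the plugin's actual code: the keyword lists, the `Drawer` shape, and the `categorize` / `toDrawer` helpers are all assumptions made up for illustration; only the five wing names come from the list above.

```typescript
// Hypothetical sketch: keyword lists, Drawer shape, and helper names are
// illustrative assumptions, not taken from the plugin's source.
type Wing = "developer" | "creative" | "emotions" | "family" | "consciousness";

interface Drawer {
  wing: Wing;
  question: string;
  answer: string;
  savedAt: string; // ISO 8601 timestamp
}

const WING_KEYWORDS: Record<Wing, Set<string>> = {
  developer: new Set(["bug", "function", "deploy", "refactor", "typescript", "api"]),
  creative: new Set(["story", "poem", "design", "lyrics", "sketch"]),
  emotions: new Set(["feel", "anxious", "happy", "sad", "stressed"]),
  family: new Set(["mom", "dad", "daughter", "son", "wife", "husband"]),
  consciousness: new Set(["awareness", "meditation", "mind"]),
};

// Pick the wing whose keyword set matches the most words; fall back when nothing matches.
function categorize(text: string, fallback: Wing = "developer"): Wing {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  let best: Wing = fallback;
  let bestHits = 0;
  for (const wing of Object.keys(WING_KEYWORDS) as Wing[]) {
    const hits = words.filter((w) => WING_KEYWORDS[wing].has(w)).length;
    if (hits > bestHits) {
      best = wing;
      bestHits = hits;
    }
  }
  return best;
}

// One turn (question + answer) becomes one drawer.
function toDrawer(question: string, answer: string, now: Date = new Date()): Drawer {
  return {
    wing: categorize(`${question} ${answer}`),
    question,
    answer,
    savedAt: now.toISOString(),
  };
}
```

A keyword count is the simplest thing that could work; a real classifier would likely need something more robust for turns that span several wings.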
Complete feedback loop:
- Model searches MemPalace via MCP before answering (guided by AGENTS.md)
- Plugin saves the response after the model delivers it
- Next session, the model remembers
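The write half of that loop can be sketched as a small recorder that buffers messages and flushes a complete turn when the session goes idle. The event names `chat.message` and `session.idle` are the ones mentioned above, but the payload shapes (`sessionId`, `role`, `text`) and the `TurnRecorder` class are assumptions for illustration, not OpenCode's real plugin API.

```typescript
// Hypothetical event payloads: only the event names come from the post above.
type PluginEvent =
  | { type: "chat.message"; sessionId: string; role: "user" | "assistant"; text: string }
  | { type: "session.idle"; sessionId: string };

interface Turn {
  sessionId: string;
  question: string;
  answer: string;
}

class TurnRecorder {
  private pending = new Map<string, { question?: string; answer?: string }>();
  private save: (turn: Turn) => void;

  constructor(save: (turn: Turn) => void) {
    this.save = save;
  }

  handle(event: PluginEvent): void {
    if (event.type === "chat.message") {
      // Buffer the latest question/answer for this session.
      const slot = this.pending.get(event.sessionId) ?? {};
      if (event.role === "user") slot.question = event.text;
      else slot.answer = event.text;
      this.pending.set(event.sessionId, slot);
      return;
    }
    // session.idle: the model has finished, so flush the turn if it is complete.
    const slot = this.pending.get(event.sessionId);
    this.pending.delete(event.sessionId);
    if (!slot) return;
    const { question, answer } = slot;
    if (question !== undefined && answer !== undefined) {
      this.save({ sessionId: event.sessionId, question, answer });
    }
  }
}
```

Saving on idle rather than on every message means the response is only persisted after the model has delivered it, which matches the order described in the loop.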
Repo: https://github.com/geco/opencode-mempalace-persistence
npm: opencode-mempalace-persistence
Would love feedback from the MemPalace community!
Replies: 7 comments
-
@geco — this is a great pattern, and the read-side bit "Model searches MemPalace via MCP before answering (guided by AGENTS.md)" is the place we have empirical data worth sharing. We've been measuring exactly this — how often agent harnesses actually invoke Same retrieval substrate, same instructions, same task: orchestrator-LLM choice determines invocation rate. On a 30-question Cat-9a-shaped diagnostic at fixed 4B parameter count:
Same wrapper, same backend palace, same questions. The gap (30pp recall) is almost entirely an invocation-rate gap — Practical implications for OpenCode + MemPalace integration:
We're running Step 2 (forced + grounded invocation experiments × n=200 expanded corpus) on katana right now; full numbers will land on the SME #3 thread today/tomorrow. Happy to cross-link once they settle. Three weeks of operator experience on an adjacent pattern (palace-daemon HTTP gateway + RLM-orchestrated reads against the same 🫏
-
Quick follow-up worth surfacing for anyone landing on this thread: there's a complementary write-side path to geco's plugin already in flight upstream —
Complementary rather than competing: the plugin captures new sessions in real time; the adapter ingests historical sessions. Anyone running OpenCode + MemPalace who wants both retroactive and forward-going coverage can install the plugin AND run
If geco's plugin and the RFC 002 adapter end up in the same release window, it's worth a section in the user-facing docs that walks through "install plugin for live capture, run source-mine once for backfill, never think about it again." That closes the OpenCode integration loop end-to-end. 🫏
-
@jphein this is gold — thank you for the empirical data. A few things I'll act on:
- Model recommendation in README: I'll add a "Recommended orchestrators" section referencing your data — Qwen 3.5 4B+ as the floor for reliable read-side invocation, with the caveat that smaller models may skip mempalace_search ~60% of the time.
- Forced invocation config: Love the belt-and-suspenders idea. I'll add a plugin config flag (e.g. forceMemorySearch: true) that injects an invocation prefix into the system prompt, on top of AGENTS.md. Will credit your Tau2 data in the docs.
- Complementarity with PR #1484 (feat(sources): OpenCode adapter on RFC 002 contract): You're right, the push plugin + pull source adapter close the loop. Once both land, I'll add a section in the README walking through "install plugin for live capture, run mempalace mine --source opencode once for backfill, never think about it again."
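For the forced-invocation flag, something along these lines is what I have in mind. Treat it as a rough sketch: `forceMemorySearch` is only the example name floated above, and the `PluginConfig` shape, the prefix wording, and `buildSystemPrompt` are assumptions, not an existing plugin or OpenCode API.

```typescript
// Hypothetical config shape; forceMemorySearch is the example flag name from this thread.
interface PluginConfig {
  forceMemorySearch?: boolean;
}

// Assumed wording for the injected prefix; the real text would need tuning per model.
const MEMORY_PREFIX =
  "Before answering, call the mempalace_search tool with the user's question " +
  "and use the results in your reply.";

// Prepend the prefix only when the flag is set, and never twice.
function buildSystemPrompt(base: string, config: PluginConfig): string {
  if (!config.forceMemorySearch) return base;
  if (base.indexOf(MEMORY_PREFIX) !== -1) return base;
  return `${MEMORY_PREFIX}\n\n${base}`;
}
```

Keeping the function idempotent matters if the prompt is rebuilt on every turn, since otherwise the prefix would stack up.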
Will cross-link your SME thread once I publish the updated docs. Thanks again for the rigorous data — this makes the integration significantly stronger.
-
Quick follow-up: I've updated the plugin README based on this discussion:
https://github.com/geco/opencode-mempalace-persistence#recommendations
- Model recommendations with your invocation-rate table — Qwen 3.5 4B+ as the recommended floor
- Complementarity with PR #1484 (feat(sources): OpenCode adapter on RFC 002 contract) documented (push + pull = full coverage)
- Forced invocation config flag noted as "being evaluated" — I'll implement it properly in the next release
Thanks again for the rigorous data — it made the README significantly stronger.
-
Quick update: based on @jphein's data, I've strengthened the AGENTS.md with mandatory search instructions and created a branch with the changes:
https://github.com/geco/opencode-mempalace-persistence/tree/feat/aggressive-agents-md
The key change is moving from suggestive language ("always search your memory") to imperative, step-by-step instructions with an explicit "This is mandatory. Never skip this step." I'll run with this for a while and see whether it measurably improves the invocation rate on low-discipline models.
Thoughts on the approach? Happy to adjust based on your experience with forced invocation prefixes.
Once we're happy with it, I'll merge into main and update the integration PR docs accordingly.
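To check whether the stricter wording helps, the plan is to compare invocation rates before and after. A minimal sketch of that measurement follows, assuming each logged turn records the names of the tools the model called; the `LoggedTurn` shape and `invocationRate` helper are hypothetical, and `mempalace_search` is the tool name mentioned earlier in the thread.

```typescript
// Hypothetical log shape: one entry per answered turn, listing tool calls made first.
interface LoggedTurn {
  toolCalls: { name: string }[];
}

// Fraction of turns in which the model called the memory search tool at least once.
function invocationRate(turns: LoggedTurn[], tool: string = "mempalace_search"): number {
  if (turns.length === 0) return 0;
  const invoked = turns.filter((t) => t.toolCalls.some((c) => c.name === tool)).length;
  return invoked / turns.length;
}

// Difference in percentage points between two runs (after minus before).
function deltaPoints(before: LoggedTurn[], after: LoggedTurn[]): number {
  return (invocationRate(after) - invocationRate(before)) * 100;
}
```

Running the same question set against both versions of AGENTS.md keeps the comparison about the instructions rather than the prompts.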
-
The invocation-rate data @jphein shared is valuable — model-dependent recall reliability is a real issue that most memory integrations paper over.
One architectural note on the read-side pattern: searching MemPalace before answering is correct, but the search quality depends heavily on the retrieval method. Pure keyword search misses semantic matches ("how did we handle the auth refactor?" won't match a drawer about "OAuth2 migration"). Pure vector similarity misses exact matches (searching for a specific function name returns semantically similar but wrong results).
Hybrid retrieval (BM25 fulltext + vector similarity in a single query) catches both cases and is the difference between 60% and 90%+ recall in our benchmarks. We tested this across 1,540 temporal reasoning questions — hybrid outperforms either approach alone by 3-5 points specifically on queries that combine exact terms with semantic intent.
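As a rough illustration of the idea (not our production code, and not MemPalace's search), the sketch below scores documents with a small BM25 implementation and with cosine similarity, then merges the two rankings using reciprocal rank fusion. The fusion method, the `Doc` shape, and the constants are assumptions chosen for the example; embeddings are expected to come from whatever model you already use.

```typescript
interface Doc {
  id: string;
  text: string;
  embedding: number[]; // assumed to be precomputed elsewhere
}

const tokenize = (s: string): string[] =>
  s.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// BM25 with common default parameters (k1 = 1.2, b = 0.75).
function bm25Scores(query: string, docs: Doc[], k1 = 1.2, b = 0.75): Map<string, number> {
  const indexed = docs.map((d) => ({ id: d.id, tokens: tokenize(d.text) }));
  const avgLen = indexed.reduce((n, d) => n + d.tokens.length, 0) / Math.max(docs.length, 1);
  const scores = new Map<string, number>();
  for (const term of Array.from(new Set(tokenize(query)))) {
    const df = indexed.filter((d) => d.tokens.indexOf(term) !== -1).length;
    if (df === 0) continue;
    const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
    for (const d of indexed) {
      const tf = d.tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const norm = tf + k1 * (1 - b + (b * d.tokens.length) / avgLen);
      scores.set(d.id, (scores.get(d.id) ?? 0) + (idf * tf * (k1 + 1)) / norm);
    }
  }
  return scores; // only documents with at least one matching term appear here
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na > 0 && nb > 0 ? dot / Math.sqrt(na * nb) : 0;
}

// Sort ids by descending score.
function ranked(scores: Map<string, number>): string[] {
  return Array.from(scores.entries())
    .sort((x, y) => y[1] - x[1])
    .map((entry) => entry[0]);
}

// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document.
function hybridSearch(query: string, queryEmbedding: number[], docs: Doc[], k = 60): string[] {
  const lexical = ranked(bm25Scores(query, docs));
  const semantic = ranked(
    new Map<string, number>(
      docs.map((d): [string, number] => [d.id, cosine(queryEmbedding, d.embedding)]),
    ),
  );
  const fused = new Map<string, number>();
  for (const list of [lexical, semantic]) {
    list.forEach((id, rank) => fused.set(id, (fused.get(id) ?? 0) + 1 / (k + rank + 1)));
  }
  return ranked(fused);
}
```

With this setup an exact function-name query is pulled to the top by the lexical list even when the vector ranking prefers a different drawer, while a paraphrased query with no term overlap still falls back to the semantic ranking.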
The auto-categorization into wings (developer, creative, emotions) is a nice UX touch. The equivalent in a general-purpose memory system is memory_type tagging (episodic, semantic, procedural, working) with per-type decay curves — procedural memories (learned rules) persist indefinitely, while episodic memories (specific events) decay naturally.
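A minimal sketch of per-type decay, assuming a simple exponential half-life model. The half-life values and the `retention` helper are illustrative assumptions; the only properties taken from the paragraph above are that procedural memories do not decay and episodic ones do.

```typescript
type MemoryType = "episodic" | "semantic" | "procedural" | "working";

// Hypothetical half-lives in days; Infinity means the memory never decays.
const HALF_LIFE_DAYS: Record<MemoryType, number> = {
  working: 1,
  episodic: 30,
  semantic: 365,
  procedural: Infinity,
};

// Retention weight in [0, 1]: halves once per half-life, stays at 1 for non-decaying types.
function retention(type: MemoryType, ageDays: number): number {
  const halfLife = HALF_LIFE_DAYS[type];
  if (!isFinite(halfLife)) return 1;
  return Math.pow(0.5, Math.max(ageDays, 0) / halfLife);
}
```

The weight could then be multiplied into a retrieval score so that stale episodic drawers rank lower without ever being deleted.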
Hybrid search example (BM25 + vector in one query): https://github.com/Dakera-AI/dakera-py/blob/main/examples/hybrid_search.py