OpenCode plugin with MemPalace persistence — auto-sync conversations, KG extraction · MemPalace/mempalace · Discussion #1522

geco
May 15, 2026

I built an OpenCode plugin that uses MemPalace as its persistence backend — it saves every conversation turn in real-time, auto-categorizes by wing type and extracts Knowledge Graph facts.

What it does:

Listens to OpenCode's chat.message and session.idle events
Each turn (question + answer) is saved to MemPalace as a drawer
Auto-categorizes into wings: developer, creative, emotions, family, consciousness
Extracts KG facts from conversations (decisions, milestones, problems, preferences)
Async mining via Node.js exec — never blocks the UI
Pure TypeScript, ~250 lines, published on npm
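
To illustrate the auto-categorization step, here is a minimal sketch of a keyword-scoring wing classifier. The thread does not show the plugin's actual rules, so the keyword lists, the `classifyWing` name, and the fallback wing are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the keyword lists below are illustrative assumptions,
// not the rules used by opencode-mempalace-persistence.
type Wing = "developer" | "creative" | "emotions" | "family" | "consciousness";

const WING_KEYWORDS: Record<Wing, string[]> = {
  developer: ["bug", "code", "deploy", "function", "typescript", "refactor"],
  creative: ["story", "poem", "design", "draw", "music"],
  emotions: ["feel", "anxious", "happy", "sad", "stressed"],
  family: ["mother", "father", "kids", "sister", "brother"],
  consciousness: ["awareness", "meditation", "mind", "self"],
};

// Count keyword hits per wing and return the wing with the most hits.
// Falls back to `fallback` when no keyword matches at all.
function classifyWing(text: string, fallback: Wing = "developer"): Wing {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  let best: Wing = fallback;
  let bestScore = 0;
  for (const wing of Object.keys(WING_KEYWORDS) as Wing[]) {
    const keywords = new Set(WING_KEYWORDS[wing]);
    const score = tokens.filter((t) => keywords.has(t)).length;
    if (score > bestScore) {
      best = wing;
      bestScore = score;
    }
  }
  return best;
}
```

A real classifier would likely need stemming or an LLM call for anything beyond obvious cases; the point here is only the shape of the mapping from a turn's text to one wing.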

Complete feedback loop:

Model searches MemPalace via MCP before answering (guided by AGENTS.md)
Plugin saves the response after the model delivers it
Next session, the model remembers
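
The "save after the model delivers" step can be sketched as a small buffer that pairs each question with its final answer and releases the pair when the session goes idle. The event shapes (`ChatMessage`, `sessionId`, `role`) and the `TurnBuffer` class are hypothetical; OpenCode's real `chat.message` and `session.idle` payloads may look different.

```typescript
// Hypothetical event shape; not taken from OpenCode's plugin API.
interface ChatMessage {
  sessionId: string;
  role: "user" | "assistant";
  text: string;
}

interface Turn {
  sessionId: string;
  question: string;
  answer: string;
}

type Slot = { question?: string; answer?: string };

class TurnBuffer {
  private pending = new Map<string, Slot>();

  // Call for every chat.message-style event.
  onMessage(msg: ChatMessage): void {
    const slot: Slot = this.pending.get(msg.sessionId) ?? {};
    if (msg.role === "user") {
      // A new question starts a new turn and discards any stale answer.
      slot.question = msg.text;
      delete slot.answer;
    } else {
      // Repeated assistant updates overwrite each other, so the last one wins.
      slot.answer = msg.text;
    }
    this.pending.set(msg.sessionId, slot);
  }

  // Call on a session.idle-style event. Returns the completed turn,
  // or null if the question or the answer is still missing.
  onIdle(sessionId: string): Turn | null {
    const slot = this.pending.get(sessionId);
    if (!slot || slot.question === undefined || slot.answer === undefined) {
      return null;
    }
    this.pending.delete(sessionId);
    return { sessionId, question: slot.question, answer: slot.answer };
  }
}
```

Waiting for idle before saving means a drawer holds one complete question and answer pair rather than partial output.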

Repo: https://github.com/geco/opencode-mempalace-persistence
npm: opencode-mempalace-persistence

Would love feedback from the MemPalace community!

Replies: 7 comments

geco
May 15, 2026
Author

https://dev.to/gecojs/give-your-ai-persistent-memory-opencode-mempalace-in-10-minute-dl7

0 replies

jphein
May 16, 2026
Collaborator

@geco, this is a great pattern, and the read-side step ("Model searches MemPalace via MCP before answering (guided by AGENTS.md)") is where we have empirical data worth sharing.

We've been measuring exactly this — how often agent harnesses actually invoke mempalace_search when instructed via system-prompt/AGENTS.md-style guidance — on the SME framework's Cat 9a thread (M0nkeyFl0wer/multipass-structural-memory-eval#3). One finding bears directly on your plugin's effectiveness:

Same retrieval substrate, same instructions, same task: orchestrator-LLM choice determines invocation rate. On a 30-question Cat-9a-shaped diagnostic at fixed 4B parameter count:

| Model | Zero-call rate | Mean recall |
| --- | --- | --- |
| `gemma4:e4b` (4B, agentic-tuned) | 18/30 = 60% | 0.417 |
| `qwen3.5:4b` (4B, Tau2-tuned for tool use) | 4/30 = 13% | 0.717 |

Same wrapper, same backend palace, same questions. The gap (30pp recall) is almost entirely an invocation-rate gap — gemma4 answers from prior knowledge on 60% of questions even when instructed to use the memory tool; qwen3.5 invokes 87% of the time. This matches the published Tau2 tool-use benchmark gap (37.7 pts in qwen's favor) almost exactly on an independent corpus.
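
For anyone who wants to re-derive the percentages, the arithmetic can be sketched as follows. The counts and recall values are copied from the table; the `Run` type and helper names are illustrative only.

```typescript
interface Run {
  model: string;
  zeroCalls: number; // questions answered without calling mempalace_search
  questions: number;
  meanRecall: number;
}

const gemma: Run = { model: "gemma4:e4b", zeroCalls: 18, questions: 30, meanRecall: 0.417 };
const qwen: Run = { model: "qwen3.5:4b", zeroCalls: 4, questions: 30, meanRecall: 0.717 };

// Fraction of questions where the memory tool was never invoked.
function zeroCallRate(run: Run): number {
  return run.zeroCalls / run.questions;
}

// Fraction of questions where the memory tool was invoked at least once.
function invocationRate(run: Run): number {
  return 1 - zeroCallRate(run);
}

// Difference between two fractions, in percentage points (one decimal place).
function percentagePoints(a: number, b: number): number {
  return Math.round((a - b) * 1000) / 10;
}

console.log(Math.round(zeroCallRate(gemma) * 100));   // 60
console.log(Math.round(invocationRate(qwen) * 100));  // 87
console.log(percentagePoints(qwen.meanRecall, gemma.meanRecall)); // 30
```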

Practical implications for OpenCode + MemPalace integration:

Document a "minimum recommended orchestrator" in the plugin's README. Users running the plugin with low-tool-use-discipline local models will see most questions answered from priors regardless of how AGENTS.md is structured. Qwen 3.5 4B or above is the floor for reliable read-side invocation in our measurements; smaller / older models hit ~60% zero-call.
System-prompt augmentation can recover some of the gap. We tested prepending a mandatory mempalace_search directive at the system-prompt layer (one constructor kwarg in our RlmAdapter: invocation_mode="forced"). On gemma4:e4b the directive lifted n=5 recall from 0.417 → 0.567 (+15pp). Worth considering: a plugin-level config flag that injects an invocation-forcing prefix into the model's system prompt, on top of AGENTS.md guidance. Belt-and-suspenders.
The KG-extraction write side is more model-tolerant than the search read side. Our gemma4 numbers above are read-side only; on write-side classification tasks (wings, hall keywords) most 4B models converge to similar accuracy. So the plugin's write loop is sturdier across model choices than the read loop.
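
A plugin-level flag of the kind suggested above could look roughly like this. This is a sketch under stated assumptions: the `forceMemorySearch` option, the directive wording, and `buildSystemPrompt` are hypothetical, and how a plugin actually gets access to the system prompt depends on the host's API.

```typescript
// Hypothetical config shape; not an existing option of any released plugin.
interface PluginConfig {
  forceMemorySearch: boolean;
}

// Illustrative directive text; the exact wording is an assumption.
const FORCED_SEARCH_PREFIX =
  "Before answering, you MUST call the mempalace_search tool with the user's question. " +
  "This is mandatory. Never skip this step.";

// Prepend the directive to the base system prompt when the flag is on.
// Idempotent: applying it twice does not duplicate the prefix.
function buildSystemPrompt(base: string, config: PluginConfig): string {
  if (!config.forceMemorySearch) return base;
  if (base.startsWith(FORCED_SEARCH_PREFIX)) return base;
  return `${FORCED_SEARCH_PREFIX}\n\n${base}`;
}
```

Keeping the prefix behind a flag lets users with models that already invoke the tool reliably skip the extra tokens.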

We're running Step 2 (forced + grounded invocation experiments × n=200 expanded corpus) on katana right now; full numbers will land on the SME #3 thread today/tomorrow. Happy to cross-link once they settle.

Three weeks of operator experience on an adjacent pattern (palace-daemon HTTP gateway + RLM-orchestrated reads against the same techempower-org/mempalace backend you're targeting — not your plugin, but the same read-side question) says the model-choice axis is the single biggest variable for read-side memory effectiveness — bigger than embedding model, bigger than rerank, bigger than wing/room structure. AGENTS.md gets you most of the way; the rest is base-model tool-use training.

🫏

0 replies

jphein
May 16, 2026
Collaborator

Quick follow-up worth surfacing for anyone landing on this thread: there's a complementary write-side path to geco's plugin already in flight upstream, MemPalace/mempalace#1484 (feat(sources): OpenCode adapter on RFC 002 contract), reviewed by @igorls on 2026-05-13 and awaiting final merge. The two solve the same OpenCode→MemPalace problem space from opposite directions:

| Component | Architecture | Direction | Captures |
| --- | --- | --- | --- |
| geco's plugin (`opencode-mempalace-persistence`) | OpenCode plugin listening to `chat.message` / `session.idle` | Push | Live conversation turns as they happen |
| `mempalace#1484` (RFC 002 source adapter) | `mempalace sources/opencode` adapter on the RFC 002 contract | Pull | Retrospective ingest of existing OpenCode session files |

Complementary rather than competing — the plugin captures new sessions in real-time; the adapter ingests historical sessions. Anyone running OpenCode + MemPalace who wants both retroactive and forward-going coverage can install the plugin AND run mempalace mine --source opencode once #1484 lands.

If geco's plugin and the RFC 002 adapter end up in the same release window, worth a section in the user-facing docs that walks through "install plugin for live capture, run source-mine once for backfill, never think about it again." Closes the OpenCode integration loop end-to-end.

🫏

0 replies

geco
May 19, 2026
Author

@jphein this is gold — thank you for the empirical data. A few things I'll act on:

Model recommendation in README: I'll add a "Recommended orchestrators" section referencing your data — Qwen 3.5 4B+ as the floor for reliable read-side invocation, with the caveat that smaller models may skip mempalace_search ~60% of the time.
Forced invocation config: Love the belt-and-suspenders idea. I'll add a plugin config flag (e.g. forceMemorySearch: true) that injects an invocation prefix into the system prompt, on top of AGENTS.md. Will credit your Tau2 data in the docs.
Complementarity with PR #1484 (feat(sources): OpenCode adapter on RFC 002 contract): You're right, the push plugin + pull source adapter close the loop. Once both land, I'll add a section in the README walking through "install plugin for live capture, run mempalace mine --source opencode once for backfill, never think about it again."
Will cross-link your SME thread once I publish the updated docs. Thanks again for the rigorous data — this makes the integration significantly stronger.

0 replies

geco
May 19, 2026
Author

Quick follow-up: I've updated the plugin README based on this discussion:
https://github.com/geco/opencode-mempalace-persistence#recommendations

Model recommendations with your invocation-rate table — Qwen 3.5 4B+ as the recommended floor
Complementarity with PR #1484 (feat(sources): OpenCode adapter on RFC 002 contract) documented (push + pull = full coverage)
Forced invocation config flag noted as "being evaluated" — I'll implement it properly in the next release
Thanks again for the rigorous data — it made the README significantly stronger.

0 replies

geco
May 19, 2026
Author

Quick update: based on @jphein's data, I've strengthened the AGENTS.md with mandatory search instructions and created a branch with the changes:
https://github.com/geco/opencode-mempalace-persistence/tree/feat/aggressive-agents-md
The key change is moving from suggestive language ("always search your memory") to imperative, step-by-step instructions with an explicit "This is mandatory. Never skip this step." I'll run with this for a while and see if it measurably improves invocation rate on low-discipline models.
Thoughts on the approach? Happy to adjust based on your experience with forced invocation prefixes.
Once we're happy with it, I'll merge into main and update the integration PR docs accordingly.

0 replies

ferhimedamine
Jun 13, 2026

The invocation-rate data @jphein shared is valuable — model-dependent recall reliability is a real issue that most memory integrations paper over.

One architectural note on the read-side pattern: searching MemPalace before answering is correct, but the search quality depends heavily on the retrieval method. Pure keyword search misses semantic matches ("how did we handle the auth refactor?" won't match a drawer about "OAuth2 migration"). Pure vector similarity misses exact matches (searching for a specific function name returns semantically similar but wrong results).

Hybrid retrieval (BM25 fulltext + vector similarity in a single query) catches both cases and is the difference between 60% and 90%+ recall in our benchmarks. We tested this across 1,540 temporal reasoning questions — hybrid outperforms either approach alone by 3-5 points specifically on queries that combine exact terms with semantic intent.
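
One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The comment above does not say which fusion method was benchmarked, so RRF is swapped in here purely as an illustration, and the drawer ids are made up.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// with rank starting at 1. k = 60 is the conventional default.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists, best match first.
const keywordHits = ["d3", "d1", "d4"]; // e.g. from a BM25 full-text query
const vectorHits = ["d1", "d2", "d3"];  // e.g. from an embedding similarity query

console.log(reciprocalRankFusion([keywordHits, vectorHits]));
// [ 'd1', 'd3', 'd2', 'd4' ]
```

Items that appear in both lists (`d1`, `d3`) rise above items found by only one method, which is the behaviour the hybrid argument relies on.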

The auto-categorization into wings (developer, creative, emotions) is a nice UX touch. The equivalent in a general-purpose memory system is memory_type tagging (episodic, semantic, procedural, working) with per-type decay curves — procedural memories (learned rules) persist indefinitely, while episodic memories (specific events) decay naturally.
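
Per-type decay can be sketched as an exponential half-life applied to a memory's relevance score. The exponential form and the half-life values below are assumptions chosen for illustration; only "procedural persists, episodic decays" comes from the description above.

```typescript
type MemoryType = "episodic" | "semantic" | "procedural" | "working";

// Hypothetical half-lives in days. Infinity means the memory never decays.
const HALF_LIFE_DAYS: Record<MemoryType, number> = {
  working: 1,
  episodic: 30,
  semantic: 365,
  procedural: Infinity,
};

// Scale a base relevance score by 0.5 for every half-life that has elapsed.
function decayedScore(base: number, type: MemoryType, ageDays: number): number {
  const halfLife = HALF_LIFE_DAYS[type];
  if (!Number.isFinite(halfLife)) return base; // procedural: no decay
  return base * Math.pow(0.5, ageDays / halfLife);
}

console.log(decayedScore(1, "episodic", 30));     // 0.5
console.log(decayedScore(1, "procedural", 1000)); // 1
```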

Hybrid search example (BM25 + vector in one query): https://github.com/Dakera-AI/dakera-py/blob/main/examples/hybrid_search.py

0 replies
