Releases: matevip/mateclaw

MateClaw 1.5.0

05 Jun 15:41
@matevip matevip

v1.5.0

Stable · 2026-06-04 · Previous stable: v1.4.0

Three big things this release

Let me say it straight.

In v1.4.0 we made the employee more autonomous: you set a goal, it locks on, self-checks, keeps itself going. But that self-check was fuzzy. The evaluator gave a 0–1 score, and you couldn't see what was missing or how many sub-tasks were left.

This release is about three things: making autonomy verifiable, making the knowledge base self-maintaining, and making the whole system genuinely multi-user.

First, goals grew a checklist. No more fuzzy score. The employee breaks a goal into a few independently verifiable criteria and the evaluator ticks them off one by one, every turn. Done means every box ticked, with no "95% is close enough." The ring around the avatar, on hover, is now a checklist you can read box by box.

Second, the Wiki learned to maintain itself. Pages interlink with [[wikilinks]]; renames and deletes cascade-fix the links; a one-click lint finds broken links. Knowledge splits into a fact layer and an experience layer: change a fact page and the experience pages that depend on it auto-flag as "needs review." Page types (pageType) carry schemas and permissions, so you control which employee can read or write which kind of page. You can attach processing pipelines that fire when a page hits some condition. And a local directory can be mounted as a knowledge source with scheduled incremental sync.

Third, memory knows who's who. Before, an employee's memory was one big pot: whoever chatted, it all piled into the same MEMORY.md. Now every memory carries an owner_key and a visibility scope (personal / team / global). One employee serving a group keeps each person's private memory separate; third-party APIs can even pass through an end-user identity to isolate memory per end user.

Plus two medium things: each employee can bind one primary knowledge base, and model selection actually honors your preferred provider. And a pile of polish.

That's it.


1. Goals grew a checklist: from "a score" to "ticked boxes"

In v1.4.0 the evaluator returned a completion score (0–1) and a one-line "what's missing" each turn. The problem: what does 0.8 mean? You couldn't see which parts were done, which weren't, or how many steps remained, and the employee was deciding whether to continue off a fuzzy number too.

This release replaces that with a checklist.

A goal = a set of independently verifiable criteria. You say "deploy the blog to fly.io" and the employee (or evaluator) breaks it into concrete criteria: DNS resolves correctly, SSL cert valid, health check passes, smoke tests green. Each one is a sentence a human can read and an LLM can judge.

The evaluator has two modes:

| Mode | When | What it does |
| --- | --- | --- |
| bootstrap | No criteria yet | Decomposes the goal into a checklist; each starts "not passed" |
| verdict | Criteria exist | Judges each one: satisfied or not, with evidence |

Both modes use structured output: the evaluator must return a typed object (criterion id + passed + evidence), not free text we have to parse.

Completion is now deterministic. There used to be a fuzzy threshold. Now: completion only when every criterion passes. 19 of 20 passed (a 0.95 score) is still "continue," not "done." Miss one, and one is missing.
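To make the "every box ticked" rule concrete, here is a minimal Python sketch. The Criterion class and the helper names are hypothetical, chosen for illustration; only the fields (criterion id, passed, evidence) and the all-or-nothing rule come from the notes above.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    # Hypothetical shape mirroring the typed verdict: id + passed + evidence.
    id: str
    text: str
    passed: bool = False
    evidence: str = ""


def is_complete(criteria):
    """Done only when the checklist is non-empty and every criterion passed."""
    return bool(criteria) and all(c.passed for c in criteria)


def progress(criteria):
    """Render X/Y progress, as on the hover card."""
    done = sum(1 for c in criteria if c.passed)
    return f"{done}/{len(criteria)}"


# 19 of 20 passed: still "continue", not "done".
checklist = [Criterion(str(i), f"criterion {i}", passed=(i != 20)) for i in range(1, 21)]
```

With this checklist, is_complete stays false until the twentieth criterion is ticked; there is no threshold to tune.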

Auto-followup targets the remaining criteria. With autoFollowup on, if the employee finishes a turn without ticking everything, the injected follow-up prompt lists the criteria still open ("5/8 done, remaining: 1 ... 2 ..., take the next step on these") instead of a vague "continue."
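A rough Python sketch of how such a follow-up prompt could be assembled. The function name and the exact wording of the prompt are assumptions; the real prompt text is not given in these notes.

```python
def build_followup(criteria):
    """criteria is a list of (text, passed) pairs; returns None when nothing is open."""
    open_items = [text for text, passed in criteria if not passed]
    if not open_items:
        return None  # every box ticked, nothing to inject
    done = len(criteria) - len(open_items)
    remaining = "; ".join(f"{i}) {text}" for i, text in enumerate(open_items, start=1))
    return f"{done}/{len(criteria)} done, remaining: {remaining}. Take the next step on these."


example = [
    ("DNS resolves correctly", True),
    ("SSL cert valid", False),
    ("health check passes", True),
    ("smoke tests green", False),
]
```

The point of the design is that the injected message names the open criteria, so the employee's next turn has a concrete target.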

The ring, on hover, is a checklist card. With no checklist it's a one-line tooltip (title + what's missing); with a checklist it's a card: title + X/Y progress, then each criterion prefixed by ○ (open) or ✓ (green, done, struck through). While evaluating, a sand-gold breathing halo surrounds the avatar.

Evaluator SPI. The evaluation logic implements Spring AI's Evaluator interface, so it produces goal-specific checklist verdicts and can also be reused as a generic evaluator. Failed evaluator calls still count against the LLM budget (no free rides), so the budget accounting stays honest.

One new tool + one new API endpoint:

  • Agent tool addGoalCriterion: append a criterion to a live goal without restarting it
  • REST POST /api/v1/goals/{id}/criteria: append a criterion programmatically
  • Goal creation can carry an initial checklist directly: criteria: ["...", "..."] (skips the bootstrap round)
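For scripts, the two endpoints might be called roughly as below. This is a sketch under loud assumptions: the base URL, the "title" field and the "text" field are hypothetical; only the paths, the POST method and the criteria array come from these notes. The requests are built but not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical host and port


def _post(path, payload):
    """Build (but do not send) a JSON POST request."""
    return Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_goal_request(title, criteria):
    # "title" is an assumed field name; "criteria" is the array named in the notes.
    return _post("/api/v1/goals", {"title": title, "criteria": list(criteria)})


def add_criterion_request(goal_id, text):
    # "text" is an assumed field name for the new criterion.
    return _post(f"/api/v1/goals/{goal_id}/criteria", {"text": text})
```

Passing the request to urllib.request.urlopen would send it; authentication is left out because the notes don't describe it.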

New config keys (mateclaw.goal.*): default-auto-followup (create-time default for auto-followup), allow-auto-followup (runtime master switch; off means no goal injects a follow-up), max-followups-per-run (hard cap on auto-followups within a single graph run, default 8).

Full details in Persistent Goals.

The v1.4.0 goal was "the employee remembers what it's doing." The v1.5.0 goal is "the employee knows exactly which boxes are still open." From a score to a checklist you can tick.


2. The Wiki learned to maintain itself

This is the heaviest chunk of the release. The Wiki grew from "a searchable knowledge base" into "a knowledge engine that maintains its own consistency, layers itself, and runs its own pipelines."

Pages can interlink: [[wikilinks]]

Write [[target-slug]] or [[slug|display text]] in a page body to link to another page.

  • Slug-first resolution: links match by slug exactly (case-insensitive), no fuzzy guessing. [[...]] inside fenced (```) or inline code is left alone.
  • Rename / delete cascade: rename a page's slug and in the same transaction every wikilink across the KB pointing at the old slug is rewritten, alias text preserved ([[oldslug|x]] → [[newslug|x]]). Cascade delete cleans up references too.
  • Broken-link lint: POST .../lint/broken-links starts an async scan job; results are persisted onto the page rows, so they survive a restart. GET .../lint/broken-links returns the aggregate (how many pages have broken links, total broken refs).
  • Clickable wikilinks in chat: wikilinks rendered in a chat answer are clickable; a cross-KB lookup navigates to the target page.
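The rename cascade can be pictured with a small Python sketch. The regexes and the function name are illustrative assumptions, not MateClaw's implementation; the behavior shown (case-insensitive slug match, alias preserved, code spans left alone) follows the bullets above.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example fence-safe
# Fenced blocks first, then inline code; both are skipped.
CODE = re.compile(FENCE + r".*?" + FENCE + r"|`[^`\n]*`", re.DOTALL)
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def rename_slug(body, old, new):
    """Rewrite [[old]] and [[old|alias]] to the new slug, outside code only."""

    def fix_links(text):
        def repl(match):
            slug, alias = match.group(1).strip(), match.group(2)
            if slug.lower() != old.lower():
                return match.group(0)  # some other page, leave untouched
            return f"[[{new}|{alias}]]" if alias is not None else f"[[{new}]]"

        return LINK.sub(repl, text)

    out, pos = [], 0
    for code in CODE.finditer(body):
        out.append(fix_links(body[pos:code.start()]))
        out.append(code.group(0))  # code spans are copied verbatim
        pos = code.end()
    out.append(fix_links(body[pos:]))
    return "".join(out)
```

A real cascade would run this over every page in the KB inside one transaction, as the notes describe.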

Knowledge is layered: fact vs experience

Each page can carry a knowledge layer:

  • fact ("what is"): foundational fact pages (unlabeled defaults to fact)
  • experience ("what it means"): synthesis, analysis, insight, which depends on a set of fact pages

Staleness propagates. An experience page declares which fact pages it depends on (edges stored by page id, so renames don't break them). When a fact page is updated during ingest, every experience page depending on it is auto-marked stale (needs review) + a reason. The wiki_stale_pages tool lists everything currently flagged.
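As a sketch of how staleness might propagate, assuming an in-memory dependency map (the class and method names are hypothetical; the real edges are stored by page id in the database):

```python
from collections import defaultdict


class KnowledgeGraph:
    def __init__(self):
        self.dependents = defaultdict(set)  # fact page id -> experience page ids
        self.stale = {}                     # experience page id -> reason

    def depends_on(self, experience_id, fact_id):
        """Declare that an experience page depends on a fact page."""
        self.dependents[fact_id].add(experience_id)

    def fact_updated(self, fact_id):
        """Flag every experience page that depends on the updated fact page."""
        for experience_id in self.dependents.get(fact_id, ()):
            self.stale[experience_id] = f"fact page {fact_id} changed"

    def stale_pages(self):
        """Roughly what a tool like wiki_stale_pages would list."""
        return sorted(self.stale)
```

Because edges key on page ids rather than slugs, renaming a fact page leaves the dependency intact.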

Search can filter by knowledge layer (facts only / experience only / all).

Page types now have profiles and permissions

pageType profile (KB-scoped): defines which page types a KB has (e.g. "concept / tutorial / decision record"), each carrying a structured-field schema, route/create/merge-stage prompts, and a Markdown template. New pages get their metadata schema-validated on save, with the validation status recorded. At most one enabled profile per KB; unconfigured KBs use a built-in default.

pageType permissions (per-agent): for "this agent + this KB + this page type" you can set read/create/update/delete flags plus a write policy (allow immediate / approval_required / deny). page_type='*' is the KB-wide default; exact matches beat the wildcard. Read and write fall back differently: an unmatched read falls back to the KB-level default read policy (allow_all by default, so KBs stay fully readable after upgrade); write is opt-in tightened. With no rules, write is allowed, but once any rule exists the KB is "locked down" and a page type with no matching rule resolves to deny (fail-safe).
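The resolution order reads more easily as code. This Python sketch assumes rules are plain dicts with page_type, read and write_policy keys, and uses "allow" as the label for the permissive policy; those shapes are illustrative, not the stored schema.

```python
def resolve_write(rules, page_type):
    """Exact match beats '*'; no rules at all means allow; otherwise fail-safe deny."""
    if not rules:
        return "allow"  # KB not locked down yet
    by_type = {rule["page_type"]: rule["write_policy"] for rule in rules}
    if page_type in by_type:
        return by_type[page_type]
    if "*" in by_type:
        return by_type["*"]
    return "deny"  # locked-down KB, no matching rule


def resolve_read(rules, page_type, kb_default="allow_all"):
    """An unmatched read falls back to the KB-level default read policy."""
    by_type = {rule["page_type"]: rule["read"] for rule in rules}
    if page_type in by_type:
        return by_type[page_type]
    if "*" in by_type:
        return by_type["*"]
    return kb_default == "allow_all"


rules = [{"page_type": "decision", "read": True, "write_policy": "approval_required"}]
```

Note the asymmetry: with the single rule above, an unrelated page type is still readable but no longer writable.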

Knowledge bases can run pipelines

Wiki Pipeline lets you define a processing flow for a KB, fired automatically by page events:

  • Triggers: page_type_count (a page-type count crosses a threshold), page_created (a page of a given type is created), stale_marked (pages get flagged stale)
  • Step executors: llm (run input through the model, output becomes the step result), skill (run a skill from a restricted set, as the owner agent)
  • Definitions are written in YAML or JSON, with CRUD + validate endpoints; every run and every step is persisted and queryable (.../pipelines/{id}/runs, .../pipeline-runs/{runId}), deduplicated by (definition, trigger, subject, bucket) for idempotency.

A local directory can be a knowledge source: pluggable + scheduled incremental

Ingest-Source SPI: knowledge sources are a pluggable interface (WikiIngestSourceProvider) with a built-in filesystem provider. Give a KB a source_directory and files in it get ingested.

  • Scheduled incremental sync: a background scheduler (@SchedulerLock so only one node runs per cycle) scans periodically, detects changes by content hash, and re-ingests only new/modified files (text and binary).
  • Security is fail-closed: paths are normalized then toRealPath()-resolved to follow symlinks (closing TOCTOU), and validated against an allowed-roots allowlist; under the production profile an empty allowlist rejects everything by default.
  • Status + manual trigger: GET .../source-watcher shows watcher status, POST .../source-watcher/scan runs a scan immediately.
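A minimal Python sketch of the two ideas above, change detection by content hash and a fail-closed allowlist. It uses pathlib's resolve() as a stand-in for Java's toRealPath(); the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def is_allowed(path, allowed_roots):
    """Resolve symlinks first, then require the real path to sit under an allowed root.
    An empty allowlist rejects everything (fail-closed)."""
    real = Path(path).resolve()
    return any(real.is_relative_to(Path(root).resolve()) for root in allowed_roots)


def scan(directory, seen, allowed_roots):
    """Return names of new or modified files; `seen` maps real path -> last content hash."""
    changed = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file() or not is_allowed(path, allowed_roots):
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        key = str(path.resolve())
        if seen.get(key) != digest:  # unseen file, or content changed since last scan
            seen[key] = digest
            changed.append(path.name)
    return changed
```

Hashing bytes rather than comparing timestamps means a touched but unchanged file is not re-ingested, and binary files need no special handling.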

New Wiki tools

  • wiki_update_page: in-place edit of a page (keeps the slug), gated by the pageType "update" permission
  • wiki_stale_pages: list every page currently flagged for review

All wiki write tools (create / update / archive / delete) now pass through pageT...


MateClaw 1.4.0

25 May 02:06
@matevip matevip

v1.4.0

Stable · 2026-05-23 · Previous stable: v1.3.0

Five things

Let me cut to it.

In v1.3.0 we assembled employees into business processes: workflows and triggers, a group of employees collaborating on a procedure. But the procedure was fixed by you, and each employee still stopped after one turn.

This release we put the focus back on the employee: make a single employee more autonomous, able to build and lead a sub-team, with a toolset that scales with the task. Then make the whole system multi-user and turn Feishu into a first-class citizen.

One: Persistent Goals are in. State a goal once; the employee locks it, self-evaluates every turn, keeps itself going until done or out of budget.
Two: Subagent delegation became a tree. Employees delegate to employees, three levels deep; plus async delegation and a "digital-employee builder" that spins up a whole team from one sentence.
Three: Progressive tool/skill disclosure. Only core tools are advertised by default; the employee calls enable_tool / load_skill when it needs more. However many tools you install, the context doesn't blow up.
Four: Workspace RBAC. Owner / Admin / Member / Viewer, four roles plus capability gating; menus and endpoints close down by role. The first time MateClaw is usable by a team.
Five: Feishu is a first-class citizen. Interactive cards, approval cards, streaming cards, voice transcription, file/audio/video both ways, channel-native tools. Anything you can do in Feishu, the employee can do.

That's it.


1. Persistent Goals: the employee follows through, you don't nag every turn

You said "deploy this blog to fly.io," the employee answered one turn and stopped. Next turn you had to ask again: "Is DNS set up? The cert? Did the tests run?" You were keeping the goal for it.

This release we flipped it. You say it once, the employee locks the goal, and self-checks every turn: what's still missing? Should I take another step myself?

It's not a new button in the chat. It's a state of the employee: a ring of light around the assistant avatar, and how full it is shows how close it is to done. Done, the ring vanishes. Hover the avatar for the full tooltip (title + what's missing); don't hover, it doesn't nag you.

Three ways to set a goal:

  • Let the employee set it: when first describing the task, signal it's a long one and ask explicitly to "lock it with setGoal, turnBudget=8, autoFollowup on." The employee recognizes the signals and creates the goal.
  • Command the tool directly: tell it to call setGoal, with which params, and "don't ask for pre-confirmation."
  • Create it via API: POST /api/v1/goals, for automation and external scripts.

Four built-in tools every employee has by default (agent-wide, system-level, no manual binding): setGoal (create), addGoalCriterion (append a criterion), completeGoal (mark done), getGoalStatus (check progress).

Auto-followup is the key. With autoFollowupEnabled on, after the employee answers a turn a lightweight evaluator scores completion (0–1) and "what's missing"; if it judges "continue," it injects a follow-up at the end of the conversation and the employee takes the next step on its own. What you feel: it answers a bit → pauses half a beat → keeps going, like a person finishing a step, thinking, and continuing.

A few deliberate constraints:

  • Evaluation happens after the answer streams: it never blocks you from reading the reply; the ring updates a moment after the answer appears
  • One goal per conversation: at most one active goal at a time, kept concurrency-safe by a generated column + unique index
  • Subagents can't see the goal tools: the goal is the parent conversation's state; children are stateless executors (see section 2)
  • Budget exhausted means stop: turnsUsed >= turnBudget or LLM-call budget spent flips the state to exhausted, the ring turns red-orange, you decide whether to add budget or let go
  • Terminal states don't revive: completed / exhausted / abandoned are the end; to continue, open a new goal, avoiding the budget-accounting mess of "restart"
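The budget and terminal-state rules amount to a small state machine. Here is a Python sketch; the class, the "active" state name and the record_turn method are hypothetical, while the three terminal states and the turnsUsed >= turnBudget check come from the list above (the LLM-call budget is omitted for brevity).

```python
from dataclasses import dataclass

TERMINAL = {"completed", "exhausted", "abandoned"}


@dataclass
class Goal:
    turn_budget: int
    turns_used: int = 0
    state: str = "active"  # assumed name for the non-terminal state

    def record_turn(self, done=False):
        """Account for one turn; terminal states never revive."""
        if self.state in TERMINAL:
            raise ValueError(f"goal is {self.state}; open a new goal instead")
        self.turns_used += 1
        if done:
            self.state = "completed"
        elif self.turns_used >= self.turn_budget:
            self.state = "exhausted"
        return self.state
```

Refusing to leave a terminal state keeps the accounting simple: every goal has exactly one budget, spent once.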

Point the evaluator at a cheap small model (mateclaw.goal.evaluator-model); on completion the employee syncs the goal summary into long-term memory. Full guide: Persistent Goals.

A Goal isn't a feature added to the employee. It changes the employee's state. The old employee "forgot when it answered." Now it remembers one thing across many turns: what it's doing, what's missing, when it's done.


2. Subagent delegation became a tree: employees build and lead teams

Since v1.1.0 an employee could delegate to another employee. But that was "single-level, synchronous, one at a time."

This release we made it a tree.

Recursive delegation, up to 3 levels deep. A parent delegates to a child, and the child can delegate further: a "project manager" employee can spin up "frontend / backend / QA" employees, each delegating again. Every subagent has a stable subagentId + parentSubagentId + depth, with events relayed to the root conversation in real time.

Three delegation tools:

| Tool | Behavior |
| --- | --- |
| delegateToAgent | Synchronously delegate one child, return after its final result; optional inheritParentContext carries recent parent context over |
| delegateParallel | Fan out several children at once, return after all their results are collected |
| delegateAsync | Delegate in the background, get a task_id immediately, fetch later with taskOutput; long tasks don't block the parent conversation |

Async delegation has an attribution gate: taskOutput only lets the same conversation and same user fetch the result of a task they spawned; another conversation can't peek.

Children deny a default set of tools: delegateToAgent / delegateParallel (recursion guard), the setGoal family and the remember family (goal and memory ownership stay with the parent), create_employee (no recursive team-building). Tunable via mateclaw.delegation.child-denied-tools.

You can see the whole tree in the UI. A nested subagent timeline in the chat stream + an always-on plan panel show delegation start, each child's name / depth / task excerpt, and completion badges on finish (success / timeout / error / duration / content length). The multi-level shape reads at a glance.

The "digital-employee builder" skill: spin up a team from one sentence. Describe a need, and this built-in skill clarifies the requirement → designs 2–6 roles → creates each as a real employee via create_employee → chains them into a workflow draft for you to review. Its companion list_capability_catalog lets it look up which skills/tools are bindable before it acts. Employees are enabled on creation; binding mirrors template apply.

Long tasks no longer lose context to "prompt too long." Structured compaction runs a four-stage strategy (soft trim → hard clear → pre-prune → LLM structured summary), always preserving the prefix (system prompt + goal anchor), injecting the summary as a UserMessage. Delegation tool results are never compacted (child execution isn't reproducible). A 10-minute cooldown after a failed summary prevents cascades.

One employee working alone is a tool. One employee working with a team it built itself is an organization.


3. Progressive tool/skill disclosure: however many tools, the context doesn't blow up

The old approach: take every tool an employee can use, with every tool's description, and dump it all into the system prompt. With many tools, just "what can I use" eats thousands of tokens. Before the model does any work, half the context is gone.

That was the engineer's shortcut.

This release we switched to two-tier disclosure:

  • Core tier (CORE): always visible to the model, callable out of the box
  • Extension tier (EXTENSION): by default the system prompt lists only a compressed directory (name + source + one-line description), without the full schema. The employee opens it when needed.

Two new built-in tools are the switches:

  • enable_tool(toolName): activate an extension-tier tool for the rest of the conversation. It validates the tool is in this employee's effective set; once active it's callable on the next reasoning turn of the current ReAct loop (ReasoningNode recomputes the toolset each turn, so it takes effect immediately)
  • load_skill(skillName, filePath?): load a skill's SKILL.md on demand. The content is injected through message history (not the system prompt), which keeps the prompt cache stable and pins loaded skills to the top of later turns so it doesn't reload
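Two-tier disclosure can be sketched in a few lines of Python. The ToolSession class and its method names are hypothetical; what it models (core tools always advertised, extension tools listed as one-liners until enabled, toolset recomputed each turn) follows the description above.

```python
class ToolSession:
    def __init__(self, core, extension):
        self.core = dict(core)            # name -> one-line description
        self.extension = dict(extension)  # name -> one-line description
        self.enabled = set()              # extension tools activated in this conversation

    def enable_tool(self, name):
        """Activate an extension-tier tool; reject names outside the effective set."""
        if name not in self.extension:
            raise KeyError(f"{name} is not an extension tool of this employee")
        self.enabled.add(name)

    def advertised(self):
        """Tools whose full schema is sent this turn (recomputed every turn)."""
        return sorted(set(self.core) | self.enabled)

    def directory(self):
        """Compressed one-line listing of extension tools not yet enabled."""
        return [f"{name}: {desc}" for name, desc in sorted(self.extension.items())
                if name not in self.enabled]


session = ToolSession(
    core={"wiki_search": "search the KB"},
    extension={"image_generate": "generate an image", "browser_use": "drive a browser"},
)
```

The prompt cost then depends on what the task has enabled, not on how many tools are installed.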

Default tiering: generative tools (image_generate / music_generate / video_generate / model3d_generate) and browser_use default to extension; everything else defaults to core. The Tools page has tier UI: built-in and channel tools get a per-row tier toggle so admins can move tools between core / extension (MCP / ACP sources are locked).

An escape hatch for conservative deployments: mateclaw.tools.disclosure.mode=legacy turns tiering off and advertises all tools again; mateclaw.skill.disclosure.load-skill-tool.enabled=false falls back to the old readSkillFile.

The system prompt should scale with the task, not with the total tool count. Install 50 tools on an employee and it shouldn't burn thousands of extra tokens every turn for the privilege.


4. Workspace RBAC: the first time MateClaw is usable by a team

MateClaw used to be a single-person system: one admin who saw and changed everything. Want to bring a colleague in? There was no "read-only," no "manage only your own area."

This release we laid the multi-user foundation. Four roles, capability gating.

Role What it can do
*Viewer...

MateClaw 1.3.0

14 May 02:23
@matevip matevip

v1.3.0

Stable · 2026-05-13 · Previous stable: v1.2.0

Five things

Let me cut to it.

In v1.2.0 we renamed agents to "digital employees." But an employee who works alone is just a starting point; real work needs orchestration.

This release we filled in the missing piece.

One: Workflow is in. Compose multiple employees plus system actions into a linear business process.
Two: Triggers are in. Let things that happen in the system start a workflow or talk to an employee, automatically.
Three: Wiki is no longer just a search index. It's a processing pipeline. Template-driven transformations turn every raw material and every page into a structured product.
Four: Each employee binds MCP tools independently; multimodal traffic gets a sidecar. No more "one MCP server, everybody sees it."
Five: Office files come straight out of the chat. Docx / Xlsx / Pptx / PDF: four document generation tools, no subprocess, no npm, no Office install.

That's it.


1. Workflow: MateClaw graduates from chatbot to business-process OS

You wanted to say: "let the data analyst enrich a customer record, then let the enterprise-sales employee run VIP onboarding, then fan out to Feishu and email at the same time, then write the result to memory once both channels acknowledge."

Before this release, that meant: stitch the prompt yourself, write the cron yourself, handle the approval yourself.

This release we did the wiring for you.

Open the Workflow menu. You see a linear array of steps plus a mode field. That's it. Not a 30-node Dify-style canvas. Not a drag-and-drop if/else maze. Intentionally minimal.

Seven step modes cover 90% of business flows:

  • sequential: run in order; previous step's output flows to the next as {{input}}
  • fan_out / collect: run a group of steps in parallel, then collect
  • conditional: a Pebble expression decides whether to run
  • await_approval: pause the run, request approval, resume after sign-off. Persisted across server restarts.
  • dispatch_channel: fan out the previous output to multiple channels
  • write_memory: write the result into the employee's MEMORY.md (four merge strategies: append / replace_section / upsert_kv / overwrite)
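To show how {{input}} threads through sequential steps, here is a toy Python runner. The step fields (mode, employee, prompt) and the execute callback are assumptions for illustration; the real DSL is graph_json with Pebble templates, and this sketch only does plain string substitution.

```python
def run_sequential(steps, initial_input, execute):
    """Run steps in order, feeding each step's output to the next as {{input}}."""
    value = initial_input
    for step in steps:
        if step["mode"] != "sequential":
            raise ValueError(f"this sketch only handles sequential steps, got {step['mode']}")
        prompt = step["prompt"].replace("{{input}}", value)
        value = execute(step["employee"], prompt)  # stand-in for calling an employee
    return value


steps = [
    {"mode": "sequential", "employee": "analyst", "prompt": "Enrich: {{input}}"},
    {"mode": "sequential", "employee": "sales", "prompt": "Onboard: {{input}}"},
]


def fake_execute(employee, prompt):
    """Echo executor so the data flow is visible without a model."""
    return f"[{employee}] {prompt}"
```

Running the two steps on "ACME" with the echo executor shows the first step's output embedded in the second step's prompt.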

Two ways to edit:

  • JSON-first: Monaco + JSON schema + Pebble static checks + template dropdowns. For people who can write DSL. Red squiggles while typing, compile diagnostics before publish.
  • @vue-flow/core canvas: for people who want to see the shape. Read-only linear render in this release; drag-to-edit ships in the next.

Don't know the DSL? Open natural-language draft generation (POST /workflows/draft/generate). Describe the flow in one sentence, an agent generates graph_json plus compile diagnostics. It does not auto-publish. You review, you hit publish.

A few engineering details, invisible but load-bearing:

  • Integer revisions: publish writes an immutable new row; drafts and published versions are decoupled. Your in-flight runs won't suddenly change semantics because someone edited the draft.
  • Inline payload storage: large inputs/outputs spill to a payload:// URI, so the database doesn't bloat
  • Cross-workspace ACL: publish-time validation that every agent / channel / employeeId reference belongs to this workspace
  • Run history: every step's input / output / duration / token count / failure chain is recorded; you can open any past run and replay it
  • Async dispatch + GC schedulers: long-running workflows don't pin a request thread

Workflow is not a replacement for ReAct or Plan-and-Execute. Single-employee multi-turn reasoning still runs on those engines. Workflow is for assembling employees into a business process, promoting "a task" into "a procedure."


2. Triggers: events drive workflows now

OK, the workflow is written. Who starts it?

Before, the answer was: you do. Manually. Or you write a cron yourself. Or you ship a webhook endpoint yourself.

This release we unified that. Triggers wire "an event happening somewhere" to "an action that should run."

Six pattern types. They cover every triggering scenario you can think of:

| Pattern | Fires when |
| --- | --- |
| cron | A cron expression matches; reuses the existing cron module's ShedLock + Spring TaskScheduler |
| webhook | A generic event passes through POST /api/v1/triggers/events |
| channel_message | A channel receives a message; filterable by channelType + senderEquals |
| agent_lifecycle | Employee lifecycle event: spawned / terminated / crashed |
| content_match | Substring match (case-insensitive) on the event content |
| workflow_completion | An upstream workflow enters a terminal state |

Two action targets: start a workflow, or send a message directly to an employee.

Safe-by-default governance (this is the part that matters):

  • Event dedup: events with a dedup_key already seen within the default 60s window get dropped
  • Per-trigger rate limit: default cap of 10/min keeps one chatty trigger from drowning the queue
  • Bot self-msg filter: Feishu / DingTalk / WeCom echo bot messages back as events? The default-bound SPI lets the channel adapter recognize and discard them
  • Recursion guard: workflow_completion → workflow → another workflow_completion... dispatch chains past depth 5 get cut + alert
  • Unknown pattern types fail closed: a typo or a future-added pattern won't silently fire every trigger in the workspace
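The first two guards, dedup and rate limiting, could look like this in Python. It is a sketch with one gate per trigger and caller-supplied timestamps; the TriggerGate class, its return labels and the sliding-window approach are assumptions, while the 60s window and 10/min cap are the defaults named above.

```python
class TriggerGate:
    def __init__(self, dedup_window=60.0, rate_limit=10, rate_window=60.0):
        self.dedup_window = dedup_window
        self.rate_limit = rate_limit
        self.rate_window = rate_window
        self.seen = {}   # dedup_key -> time the key was last admitted
        self.fired = []  # timestamps of recent dispatches

    def admit(self, dedup_key, now):
        """Return 'duplicate', 'rate_limited' or 'dispatch' for one event."""
        last = self.seen.get(dedup_key)
        if last is not None and now - last < self.dedup_window:
            return "duplicate"
        self.seen[dedup_key] = now
        # Keep only dispatches still inside the sliding rate window.
        self.fired = [t for t in self.fired if now - t < self.rate_window]
        if len(self.fired) >= self.rate_limit:
            return "rate_limited"
        self.fired.append(now)
        return "dispatch"
```

Checking dedup before the rate limit mirrors the webhook pipeline order described in this section, so duplicates never consume rate budget.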

Cross-instance consistency: in a multi-instance deployment, the pattern_version self-cancel mechanism plus periodic syncFromDatabase keeps every node converged. cron triggers grab ShedLock through CronDelegationPort so they fire exactly once.

Webhook ACK is fire-and-forget by design: receive → envelope wrap → dedup check → bot-self check → rate-limit check → ACK 200 → async dispatch. Upstream gateways see 200 and stop retrying.

The UI has a structured form per pattern. No hand-writing patternJson. Pick cron → cron expression input + timezone dropdown + next-fire preview. Pick content_match → substring input. Pick agent_lifecycle → agent dropdown + phase dropdown.


3. Wiki is no longer just a search index. It's a processing pipeline.

The old Wiki worked one direction. You threw a document in, it got chunked, embedded, and could be retrieved by semantic search. One way. Raw in, recall out.

This release Wiki learned to process.

The transformations engine: attach a "template" to a raw material or a page, run it through an LLM, save the structured output back to the Wiki.

Concretely:

  • User-defined templates: write a prompt, pick a model, pick an output format (markdown / json), decide whether to auto-save as a synthesis page
  • Per-template model picker: analyze contracts with Claude Opus, do summaries with a cheap Flash. No more "one LLM does everything."
  • Run templates against pages, not just raw materials. Feed an existing synthesis page back through a template, produce a new one.
  • Cross-material aggregator (map-reduce): run a template against every raw material in a KB, then map-reduce the runs into one KB page. That's real synthesis.
  • Reverse-citation extractor: a synthesis page is bound to the exact source chunks it cited. Click the page, see where every claim came from.
  • Structured JSON output + optional JSON Schema: downstream code can consume the output directly without parsing markdown
  • Cancel a running transformation + re-run any past run: long task halfway done and you spot a prompt bug? Cancel, edit prompt, rerun.
  • Token usage recorded per run: you can see exactly how much each template is costing you
  • Side-by-side compare modal: tweaked the prompt, want to see the difference? Open compare.

Seven seeded templates aligned with enterprise scenarios: contract clause extraction, account intel, risk summary, KPI distillation, meeting-notes structuring, knowledge-page structuring, Q&A pair generation. Install once, ready to run.

Synthesis pages themselves get embedded at page level. So now search hits don't just return raw chunks; they also return products that you (or an agent) already processed.

The Wiki UI was rebuilt too: library home + workspace split. The library home is the entry into every KB in your workspace. Inside a KB there are four tabs: raws / pages / templates / runs.

Wiki went from "passive retrieval" to "active processing." Raw materials don't just sit there waiting to be recalled. They get transformed into more useful artifacts, and those artifacts get recalled too.


4. MCP per-agent tool binding + multimodal sidecar

Before, connecting an MCP server meant every employee in the workspace saw all of its tools. You installed GitHub MCP for your executive assistant, and customer support also thought it could open PRs.

That was the engineer's shortcut.

This release we fixed it.

Each employee binds MCP tools independently:

  • The tool picker groups by server and tags by status: connected / stale / unavailable / orphan
  • Validation on save: do the tools you're binding actually exist on the current MCP server? If not, save is rejected, so you can't publish an agent that's going to crash
  • Namespace collisions auto-prefix: two servers both have a search tool? It becomes server-a:search / server-b:search; calls are unambiguous
  • Rename an MCP server, bindings follow automatically
  • Stable prefixed callback names + per-server tool cache persisted to disk, so there is no re-probe on restart
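The collision rule is simple enough to sketch in Python. The function name and the input shape (server name mapped to a list of tool names) are assumptions; the server:tool prefix format is the one shown in the bullet above.

```python
from collections import Counter


def qualify_tool_names(servers):
    """Prefix a tool with its server name only when another server exposes the same name."""
    counts = Counter(tool for tools in servers.values() for tool in tools)
    return {
        server: [f"{server}:{tool}" if counts[tool] > 1 else tool for tool in tools]
        for server, tools in servers.items()
    }


servers = {"server-a": ["search", "fetch"], "server-b": ["search"]}
```

Here only search collides, so fetch keeps its bare name while both search tools get a server prefix.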

MCP-derived skills and tools flow through the same picker endpoints that built-in tools use. The "tool catalog" and "what this employee can use" are the same view.

Multimodal sidecar routing (issue #87)

Before, you sent an image to ...


MateClaw 1.2.0

05 May 12:27
@matevip matevip

v1.2.0

Stable · 2026-05-05 · Previous stable: v1.1.137

Four things

Let me cut to it.

Not 60 new features. Not an architecture rewrite. Four things.

One: they're called "digital employees" now.
Two: skills aren't an alias for tools anymore. They're the skeleton.
Three: Claude Code, Codex, and other top-tier coding agents now show up as employees.
Four: for the first time, you can actually see what each employee is doing.

That's it.


1. They're called digital employees now

We used to call them "agents."

That's a name an engineer picked.

An engineer hears "agent" and nods: "yep, agent, got it." A regular person hears it and frowns. What's an agent? Why do I care? What do I do with it?

When you hire a person, you don't tell them "you are an agent." You tell them: your name, what you do, what your task is today, who to come to when something breaks.

So this release we changed the name. Every "智能体 / agent" in the back office is now digital employee.

But this is not a vocabulary cleanup. It's a worldview change.

A digital employee has:

  • Role: "I'm the product researcher." "I'm customer support." "I'm the legal assistant." One sentence.
  • Goal: "I help you see how the market is moving." "I catch every customer question." Plain language.
  • Backstory: where they came from, why they exist, what they care about.

These aren't decoration. They get spliced directly into the system prompt and shape every answer.

We also shipped five career templates. Open one, it works:

  • Product Researcher
  • Customer Support
  • Knowledge Curator
  • Data Analyst
  • Executive Assistant

Each comes with a role, goal, backstory, the right tools, a pixel-art avatar, and a color that belongs to that role.

You don't start from a blank page. You hire a coworker who already knows how to work.


2. Skills aren't an alias for tools anymore

The old "skill" was, honestly, "a list of tools plus a prompt."

A shortcut.

This release we rebuilt it.

A skill is now a full manifest. It has a name, a version, a feature matrix, the tools it needs, the external dependencies it requires, its prompt, its scripts, and a thing called LESSONS.md: what the skill learned during runs that it should remember next time.

Skills can grow. That's what LESSONS.md is for: the more a skill gets used, the better it knows when to step in and when to stay out. It writes a line: "last time the user didn't like that, don't do it again." It writes. It reads. It evolves.

Skills have a marketplace. Eight starter templates. Before installing, a preflight check runs automatically: which API key is missing, which CLI you need to install, which feature flag you need to enable. Told upfront. No more "install, hit error, debug yourself."
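
A preflight check of this kind can be sketched in a few lines. Everything named here (`SkillRequirements`, `preflight`, the example key and CLI names) is hypothetical; the sketch only shows the shape of the idea: collect every missing prerequisite and report them all before anything is installed.

```python
import os
import shutil
from dataclasses import dataclass, field


@dataclass
class SkillRequirements:
    """Hypothetical slice of a skill manifest: what must exist before install."""
    env_keys: list = field(default_factory=list)
    cli_tools: list = field(default_factory=list)
    feature_flags: list = field(default_factory=list)


def preflight(req, env=None, enabled_flags=frozenset(), which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    for key in req.env_keys:
        if not env.get(key):
            problems.append(f"missing API key: {key}")
    for tool in req.cli_tools:
        if which(tool) is None:
            problems.append(f"CLI not installed: {tool}")
    for flag in req.feature_flags:
        if flag not in enabled_flags:
            problems.append(f"feature flag disabled: {flag}")
    return problems


req = SkillRequirements(
    env_keys=["EXAMPLE_API_KEY"],
    cli_tools=["example-cli"],
    feature_flags=["browser"],
)
# The lookups are injected so the example does not depend on this machine.
report = preflight(req, env={}, enabled_flags={"browser"}, which=lambda name: None)
```

Returning the whole list, rather than failing on the first problem, is what lets the UI tell you everything upfront.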

Skills have a creation wizard. Don't know how to write a SKILL.md? Open the wizard, click through a few steps, and you get a complete bundle: multi-file packaging, secret store, starter library, all in one.

Skills wire up to MCP. Tools declared on an MCP server automatically show up as virtual skill cards on the Skills page. Same-name skills get deduplicated: install a real one, it shadows the virtual one. That's the behavior you want.
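
The shadowing rule is easy to picture as a merge keyed by name. This is a sketch under assumptions: `merge_skill_cards` and the card dictionaries are invented for the example, and the real Skills page may carry far more metadata.

```python
def merge_skill_cards(installed, mcp_tools):
    """Installed skills win; MCP tools fill in as virtual cards for unused names."""
    cards = {name: {"name": name, "virtual": False} for name in installed}
    for name in mcp_tools:
        # setdefault leaves an installed card untouched, so it shadows the virtual one.
        cards.setdefault(name, {"name": name, "virtual": True})
    return sorted(cards.values(), key=lambda card: card["name"])


cards = merge_skill_cards(
    installed=["web_search"],
    mcp_tools=["web_search", "github_issues"],
)
```

Here `web_search` appears once, as the installed skill, and `github_issues` appears as a virtual card.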

Details all live in one drawer. Tools, features, memory, activity, lessons: five tabs in one place. The card got slimmer: six fields and one status pill. Clear beats comprehensive.

Skills are the difference between an AI that uses tools and one that develops a craft. Give it tools, it uses tools. Give it skills, it has playbooks. Give it skills with LESSONS, it has experience.


3. Claude Code is now one of your employees

This is the biggest pivot in this release.

I've said it before: Apple controls the whole widget. From silicon to software to OS to services, all us. I still believe that.

But some jobs, somebody else in the world is already doing better than us.

Claude Code is very, very good at writing code. So is Codex. In their own arena, they are the best.

So do we go fight them? Or do we hire them?

This release we picked the second one. We integrated ACP (Agent Client Protocol).

ACP is a protocol that lets external coding agents plug into MateClaw via OAuth or API key. Once installed, Claude Code becomes a skill card in MateClaw. Your digital employees call it the same way they call any built-in tool.

Concretely:

  • ACP endpoints auto-bridge: configure an endpoint, it shows up on the Skills page with a wrapper toolset
  • Visual env editor: every endpoint tells you which keys it needs, right in the UI
  • Per-session cwd: every ACP session gets its own working directory
  • Errors translated: when upstream returns "Request not allowed," the UI translates it into something you can act on: "your OAuth got hijacked by another app on your keychain"
  • claude-code-helper, codex-helper templates: install and go

The judgment underneath this is simple:

MateClaw is the personal AI your IT department can sign off on. It doesn't need to write every line of code itself. It needs to be the place where, when your developer uses Claude Code in their IDE, the company can manage it, audit it, and switch it off.

ACP is what makes that possible.


4. You can finally see what every employee is doing

The old MateClaw was a factory with no windows.

You threw a task in. Some time later, a result came out. What happened in between: black box. Which agent was running, which step it was on, where it got stuck, why it was slow: all invisible.

That's the engineer's blind spot. Engineers don't need to see, because engineers have logs. People who actually use the product don't have logs. They only have trust.

This release we opened all the windows.

The biggest window is the Admin Runtime Console (Settings → System → Runtime).

Open it, and you see every digital employee currently working anywhere in MateClaw:

  • who's running, for whom
  • how long they've been running, what step they're on
  • what they're doing right this second: reasoning, calling a tool, waiting for approval
  • how many tokens used, how much memory
  • stuck? One-click force-recycle, no service restart needed

Before, you'd open the server logs, grep an agentId, line up timestamps. Now you open a browser tab.

The whole back office got remade too:

  • Avatar status ring: busy, idle, errored, all visible on the avatar
  • Hero focus panel: open an employee, the most important info is the largest text at the top
  • Runbook status line: what it plans to do next, written there
  • Brand-tone time dial: multiple employees' activity laid out by time, the rhythm visible at a glance
  • Ticket-style ID tag: every task gets a service-desk-style ID you and the team can reference
  • Tool chip: the card shows which tools this employee uses most
  • Bento metadata tiles: important info organized as tiles, not tables, not lists

Dangerous actions go through mcConfirm. Delete, force-recycle, change permission: these used to go through the browser's native confirm dialog. Ugly, easy to misclick. Now it's a unified confirmation, consistent with MateClaw's visual language. You don't lose things by hitting the wrong button.


Streaming changed: you don't stare at "..." anymore

The typing animation is a lie. It pretends the AI is thinking. Most of the time the AI hasn't started thinking yet: the network is queued, the model is loading, the API is rate-limited. You watch the dots and assume something is happening. It isn't.

This release we made streaming an honest signal.

  • Thinking phase, tool-call phase, answer phase: shown separately. You know which step it's actually on.
  • Each SSE event has its own ID. Network drops and reconnects don't replay the same chunk, and don't lose chunks either.
  • Repetition detector got smarter. Used to miss it when an agent was stuck looping on the same markdown list with a transition paragraph in between. Catches it now.
  • Multi-agent delegation no longer fights itself. Each child agent runs in its own isolated session, with progress batched back to the parent stream, so the main conversation doesn't stutter.
  • Zombie streams self-heal. Timed out, abandoned, child task crashed: cleaned up by the runtime.

Long-task "fake answers" got cut. The old behavior, on long tasks, was to guess an answer early so the user wouldn't wait. That's an engineer's misread of what UX means. Users would rather wait for the real one than get a fake one. Long tasks now require evidence-grounded answers.


A few more things

Wiki keeps moving forward:

  • Hot cache: frequently used knowledge gets injected into the agent's system prompt at startup, no per-call query. Each KB has a "view / regenerate / reset" panel
  • Vision providers expanded: Zhipu GLM-V, Volcano Doubao, on top of the previous ones
  • Chat-LLM one-hop fallback: when the primary provider is wedged, switch to a healthy one. Once, not infinite retry loops
  • Delete actually deletes: drop a raw material, and the linked pages, vector chunks, and disk file go with it. No more orphans

Multimodal:

  • Tencent Hunyuan 3D: generates 3D models, previewed inline with <model-viewer>, Pro / Rapid routed by use case
  • Generative tools on a unified async pipeline: music, video, image, all on one SSE delivery channel

Security and enterprise:

  • Personal Access Tokens: long-lived credentials for headless scripts, CI, automation. Issuable, revocable, rotatable
  • Outbound webhook signing: HMAC-SHA-256 body signing so the receiver can verify a message came from you
  • Cron distributed lock: multi-instance deployments don't double-fire the same scheduled job

Channels:

  • Feishu image download on by default: vision agents can see images now
  • **Feishu long messages chunked, not tr...

MateClaw 1.1.137

29 Apr 12:15
@matevip matevip


v1.1.137

Released: 2026-04-29

Four things

Let me cut to it.

Not 27 new features. Not an architecture overhaul. Four things.

One, it learns from yesterday now.
Two, when one piece breaks, the whole thing doesn't break.
Three, the parts that were "almost good": they're good now.
Four, the knowledge base became a library you can open.

That's it.


1. It starts remembering you

The MateClaw before this release was an amnesiac. Every morning, you had to tell it again what you told it yesterday.

That's not what a personal AI should be.

Here's what it does now:

At night, while you sleep, it goes through the day's conversations. It picks out what mattered. It throws out the noise.

We have a name for this. We call it Dreaming.

In the morning, you open it. It shows you a card:

Yesterday you brought up these things. I think these five are worth keeping.

This one contradicts what you told me last week β€” which version is right?

These three sound like one-offs. Forget them?

You tap a few times. It does what you said.

A friend remembers you don't drink coffee. A coworker remembers your standing 1:1 is Wednesday. Your spouse remembers your mom's birthday. Why should an AI forget?

Before, it forgot because we couldn't make it remember. Now we can.


2. When something breaks, nothing breaks

I've used a lot of "AI apps." They have one thing in common:

The API key expires. The whole product dies.
The model gets rate-limited. The whole product dies.
The cloud provider has a bad afternoon. The whole product dies.

That's engineering thinking. Not product thinking.

Your user does not care which one of your APIs is having a bad day. Your user cares about one thing: does it work right now.

So here's what we did.

You configured OpenAI. And Claude. And Qwen. They are not three separate things anymore. They're a Provider Pool, with a Health Tracker keeping score: which vendor failed how many times, when it last recovered, when to try it again.

OpenAI hiccups? Roll to Claude. Automatic.
Claude rate-limits you? Roll to Qwen. Automatic.
Qwen drops a model? Pick another one that works. Automatic.
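
A pool with a health tracker can be sketched roughly as follows. The class names echo the terms above, but the thresholds, the cooldown, and the method names are assumptions chosen for the example, not MateClaw's real implementation.

```python
import time


class HealthTracker:
    """Counts failures per provider and benches a provider after repeated errors."""

    def __init__(self, max_failures=3, cooldown_seconds=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures = {}
        self.benched_until = {}

    def is_healthy(self, name):
        return self.clock() >= self.benched_until.get(name, 0.0)

    def record_success(self, name):
        self.failures[name] = 0

    def record_failure(self, name):
        self.failures[name] = self.failures.get(name, 0) + 1
        if self.failures[name] >= self.max_failures:
            # Bench the provider; it becomes eligible again after the cooldown.
            self.benched_until[name] = self.clock() + self.cooldown_seconds
            self.failures[name] = 0


def call_with_failover(providers, tracker, prompt):
    """Try healthy providers in order; providers is a list of (name, callable)."""
    last_error = None
    for name, call in providers:
        if not tracker.is_healthy(name):
            continue
        try:
            result = call(prompt)
        except Exception as error:  # a real pool would classify errors more finely
            tracker.record_failure(name)
            last_error = error
            continue
        tracker.record_success(name)
        return name, result
    raise RuntimeError("no healthy provider available") from last_error


def flaky(prompt):
    raise TimeoutError("upstream timed out")


def steady(prompt):
    return f"echo: {prompt}"


tracker = HealthTracker(max_failures=1)
served_by, reply = call_with_failover(
    [("primary", flaky), ("backup", steady)], tracker, "hi"
)
```

The caller only ever sees `reply`. Which vendor answered, and which one is sitting out its cooldown, stays inside the tracker.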

You don't see any of this. You just see: it works.

Oh, and signing in with Claude doesn't require copying around an sk-ant-... token anymore. If you have a Claude subscription, you sign in. In the browser. Like you'd sign in to anything else.

The way it should have been.


3. The "almost good" parts: they're good now

I've said this before. Details are not the details. Details are the product.

This release was a lot of details.

Code blocks have line numbers now. Long ones collapse. Math equations are no longer a tangled mess; they're equations. A flowchart is a flowchart. Links are safe. Pasting a long log doesn't shove your conversation off the screen.

Voice actually transcribes. Before, you'd hold to talk and half the time nothing got captured. Now Chinese routes to a Chinese model, English to an English model. What you said is what shows up.

The knowledge base stopped being a punishment. Before, you dropped a PDF in, waited ten minutes, and got told "3 pages failed, the whole document is discarded."

Not anymore. Drop it in and search works immediately. Pages are produced when something asks for them, not before. Two at a time, so it doesn't melt your small model. Got interrupted? Hit Resume. It picks up where it stopped.

Docker installs once and you're done. The browser tool is bundled. The search engine is bundled. You don't install Playwright. You don't configure anything. You open it. It runs.

All of this adds up to one thing.

This is the first version of MateClaw that lets you forget it's software.


4. The knowledge base became a library

We used to call it a "knowledge base." But it didn't behave like a library; it behaved like a scanner.

You'd drop in 100 PDFs and get 100 piles of shredded vector fragments. Ask a question, it would rummage through the pieces and stitch something together. Want to read it yourself? You couldn't open it. Want to know what it actually understood? Nobody could tell you.

This release we changed the idea.

The material you drop in gets read once, digested once, and written into a book you can open. Every page has a summary, bidirectional links, citations that go all the way back to the source paragraph. You can read it. You can edit it. Agents read this book too, not vector fragments.

The biggest change is lazy mode.

Before: you uploaded a document and the system burned a stack of LLM calls to digest the whole thing into pages. Slow. Expensive. Now you can choose to just index it: searchable immediately, and pages get compiled only when an agent actually needs one.

90%+ fewer LLM calls. At scale, the difference is dramatic.
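
The saving comes from deferring the expensive step. A toy version, with a plain function standing in for the LLM call, might look like this; `LazyKnowledgeBase` and its methods are invented for the sketch and say nothing about how MateClaw stores or searches documents.

```python
class LazyKnowledgeBase:
    """Indexes documents immediately; compiles a page only on first read."""

    def __init__(self, compile_page):
        self._compile_page = compile_page  # stand-in for the expensive LLM call
        self._raw = {}
        self._pages = {}
        self.llm_calls = 0

    def ingest(self, doc_id, text):
        """Index only: the document is searchable right away, no LLM call yet."""
        self._raw[doc_id] = text

    def search(self, keyword):
        needle = keyword.lower()
        return [doc_id for doc_id, text in self._raw.items() if needle in text.lower()]

    def read_page(self, doc_id):
        """Compile on demand and cache, so repeat reads cost nothing."""
        if doc_id not in self._pages:
            self.llm_calls += 1
            self._pages[doc_id] = self._compile_page(self._raw[doc_id])
        return self._pages[doc_id]


kb = LazyKnowledgeBase(compile_page=lambda text: text.upper())
kb.ingest("setup", "Install on Linux")
kb.ingest("auth", "Token based auth")
kb.ingest("billing", "Monthly invoices")

hits = kb.search("auth")
first = kb.read_page("auth")
second = kb.read_page("auth")
```

Three documents went in, one page was compiled, and the second read hit the cache. The documents nobody asks about never cost a call.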

While we were at it:

  • Wikilink alias form [[concept|display text]] parses correctly now. It used to take the whole thing as a slug
  • Got interrupted? Hit Resume. Only the unfinished pages re-run; the ones already written are untouched.
  • Every KB now has two system pages: overview and log. The lobby and the audit trail. Can't be deleted, don't pollute search.
  • Pages you wrote by hand are protected: locked tells the AI to leave them alone; reingest won't overwrite them.
  • Pages you don't want anymore: soft archive, not delete. They disappear from the default list/search but the page, citations, and backlinks stay. Reversible.
  • The "which model for which step" config in the UI now actually wires through to the backend. It used to be cosmetic.
  • Chunk hits include pageNumber and section, so agents can cite "page 12, Setup / Linux" instead of pasting a context-free paragraph.

In April, Karpathy turned "LLM Wiki" into a meme with a single Gist. Within a month, nine llm-wiki single-file clones appeared on GitHub. They all solve the same problem: one person, one machine, one pile of files, organized into a readable book.

MateClaw made the same idea into the knowledge layer of a product: team-shared, on-demand compilation, agents using it continuously, threaded through memory and channel delivery.

They built a clone. We built a home.


What this means for you

If you're a regular user:

It remembers you. Tomorrow it'll know you better than today. The day after, better than tomorrow.

If you've connected MateClaw to WeChat, Slack, Discord, Telegram, or any other messenger:

One bad API doesn't take you offline. A customer asks a question. The question gets answered.

If you've been feeding it PDFs to build a knowledge base:

You don't have to babysit it anymore. Drop them in. They'll be ready when you need them.

If you tried MateClaw before and gave up because "one part wasn't quite there yet":

Come back. There's a real chance the part you gave up on is exactly what we fixed.


A few smaller things

  • Claude 4.7. DeepSeek V4 with thinking mode. Alibaba Bailian token plans. SiliconFlow. More.
  • MiniMax video: China endpoint
  • Word documents render faster (no more spawning a Node.js subprocess; pure Java now)
  • Desktop app finds your browser smarter on Windows
  • Tool calls in the background no longer crash the whole conversation when one orphaned response shows up
  • Skill marketplace: paginated, bilingual display names, security scan results visible. No more scrolling to find the one you want
  • File writes inside your own workspace stopped asking for approval. They didn't need to ask in the first place
  • A pile of small fixes you won't notice but will mean you swear less

Full list: git log v1.1.0..v1.1.137.


Upgrading

Most users: do nothing. Restart. Everything migrates itself.

Production deployers: you no longer need to set DASHSCOPE_API_KEY in .env. Configure it in the UI, like every other model.

The new memory system: off by default in the open-source build. Turn it on with:

mateclaw:
  memory:
    dream-v2:
      enabled: true

One more thing.

We call this version 1.1.137.

Because we changed 277 things.

But you shouldn't need to know that.

You should just open it, and notice it got better.

That's how products work.


MateClaw 1.1.0

17 Apr 16:45
@matevip matevip


Released: 2026-04-17 · 📖 Chinese release notes · 📖 English release notes · ⬆️ UPGRADING.md

Downloads

Platform File
macOS (Apple Silicon) MateClaw_1.1.0_arm64.dmg / .zip
macOS (Intel) MateClaw_1.1.0_x64.dmg / .zip
Windows x64 MateClaw_1.1.0_Setup.exe / MateClaw_1.1.0_x64_Setup.exe
Windows ARM64 MateClaw_1.1.0_arm64_Setup.exe
Server (Docker) docker compose up -d from the repo root

Auto-update metadata: latest-mac.yml / latest.yml + .blockmap files are attached for differential updates.


The release that makes MateClaw feel like a real personal OS

1.0 brought the new look. 1.1 makes it work the way you expect. Chat replies stream live across every channel, long tasks keep running when you switch conversations, the Wiki is semantic now, and, for the first time, agents can write their own skills.

Under the hood this is the biggest 1.x release yet: 98 commits, 25 features, 44 fixes, 20 docs updates. This page is the shortlist. See UPGRADING.md if you're coming from 1.0.x.


Headline features

Agents that write their own skills: Auto Skill Synthesis

When the agent discovers a useful workflow (a recurring way to query a database, a particular report layout, the exact commands to SSH into your box), it can now ask to turn it into a skill, get your approval, and save it for next time. No more re-typing "remember that I like tables sorted this way." The agent's memory grows with you.

  • skill_manage tool supports create / edit / patch / delete
  • Auto security scan before saving (blocks dangerous patterns)
  • Clean approval flow in ChatConsole
  • Migrate between agents, share as ZIP

Multi-agent delegation, in parallel

One agent can now delegate to another, or to three at once. Hand the coding agent a Jira ticket while the research agent pulls competitor data and the writing agent drafts the Slack reply. Each runs in its own isolated session; results stream back to the orchestrator.

  • delegateToAgent and delegateParallel tools
  • Per-child conversation tracking + event relay so you see progress in real time
  • Smart routing hints in system prompt

Wiki goes semantic: two-phase digest + deep research

The Wiki knowledge base you fed your PDFs into is no longer a search box. It's now a retrieval engine:

  • Semantic search on every page and every chunk: ask "what did we decide about auth?" and get the decision, not just pages with "auth" in them
  • pgvector-style chunk embeddings with mean-pool sub-segmentation for documents that exceed model context
  • Two-phase digest: phase A extracts route + metadata, phase B merges per-page. An order of magnitude faster on large imports (60+ pages processed in parallel)
  • Progress bar per raw material: stop staring at "processing..." and see pages done / total
  • Resumable: interrupt mid-import, hit "reprocess", only the unfinished pages re-run
  • New wiki_search_pages (hybrid), wiki_semantic_search (chunk-level), wiki_read_page, wiki_trace_source tools

Deep thinking

You can now turn on Anthropic extended thinking / DashScope qwq reasoning / OpenAI o1 reasoning_effort=high per agent and per turn. thinkingLevel: off / low / medium / high / max. The thinking block streams into the UI as a collapsible panel: you see the model reason, you don't see tokens wasted on tasks that don't need it.

Anthropic prompt caching

System prompts, agent personas, tool definitions: all automatically marked as cache_control: ephemeral on Anthropic-compatible endpoints. First request warm, every follow-up cached. Dashboard now tracks cache_read_tokens / cache_write_tokens daily.

Declarative hook system

A 5-milestone hook lifecycle: before_tool, after_tool, before_llm, after_llm, on_error. Hooks run in-process, can transform arguments / results / mask sensitive fields / add audit log entries. Tool guard rules are now one family of hooks.
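
As a rough picture of how such a lifecycle composes, the sketch below runs hooks in registration order and lets each one return a transformed payload. Only the five milestone names come from the text; `HookRegistry`, `mask_secrets`, and the set of sensitive keys are hypothetical.

```python
from collections import defaultdict

MILESTONES = ("before_tool", "after_tool", "before_llm", "after_llm", "on_error")


class HookRegistry:
    """Runs registered hooks in order; each hook may return a transformed payload."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, milestone, hook):
        if milestone not in MILESTONES:
            raise ValueError(f"unknown milestone: {milestone}")
        self._hooks[milestone].append(hook)

    def run(self, milestone, payload):
        for hook in self._hooks[milestone]:
            payload = hook(payload)
        return payload


def mask_secrets(arguments):
    """Example before_tool hook: redact values whose key looks sensitive."""
    sensitive = {"password", "token", "api_key"}
    return {k: ("***" if k in sensitive else v) for k, v in arguments.items()}


audit_log = []


def audit(arguments):
    """Example audit hook: record which argument names were passed, not values."""
    audit_log.append(sorted(arguments))
    return arguments


registry = HookRegistry()
registry.register("before_tool", mask_secrets)
registry.register("before_tool", audit)
safe_args = registry.run("before_tool", {"host": "db1", "password": "hunter2"})
```

Order matters: because masking is registered first, anything a later hook logs has already been redacted.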

Plugin SDK

Third-party plugins can now extend MateClaw without forking. SPI for ChannelAdapter, Tool, MemoryProvider, Hook. Discovery from JAR drop-in plugins/ dir.

Voice for every channel

IM channels (WeCom / WeChat / DingTalk) now support voice input. ASR via DashScope / OpenAI Whisper, multi-path fallback for WeChat's encrypted voice CDN. Voice replies synthesized via TTS and sent as audio messages.


The ChatConsole, reimagined for multi-channel

  • Realtime sync for external channels: a WeChat user talks to your agent, you see the reasoning, tool calls, and streaming reply happen in the ChatConsole sidebar. No F5.
  • Running indicator: amber pulse on every conversation with an active agent run
  • Switch doesn't kill: flip to a different conversation mid-stream, the previous one keeps running in the background; flip back, you reconnect to the live buffer
  • No duplicate bubbles: reconcile layer matches client-uuid placeholders with DB-persisted messages via ID promotion
  • Actionable error cards: "does not support tools" from Ollama now shows the specific guidance ("switch to qwen3 / qwen2.5:7b+ / llama3.1:8b+"), not a generic "unknown error"

WeChat / WeCom stability

Personal WeChat long-connection bot was the flakiest channel in 1.0. 1.1 rebuilds it:

  • pollLoop watchdog: no more silent pollers that stopped reconnecting
  • Jittered exponential backoff on token expiry / network blips
  • touchActivity per adapter: per-account staleness detection
  • Token persistence across restarts
  • Voice ASR with 3 fallback paths for WeChat's CDN schemes
  • WeCom markdown tables + refusal message detection

Under the hood

  • UI-configurable embedding models: embedding models are now regular rows in mate_model_config, no more env vars
  • LLM provider hardening: DashScope url-error classification, Ollama auto-discovery rewrites :latest tags, no-tools model blocklist
  • MySQL migrations fixed (Gitee #IIYHLJ): all ADD COLUMN IF NOT EXISTS rewritten as INFORMATION_SCHEMA guards; fresh MySQL 8.x deploys succeed; existing users self-heal via FlywayRepairConfig
  • Auto-create database on first connection (createDatabaseIfNotExist=true)
  • Defensive hardening: tool arg redaction, stream fallback timer, approval placeholder cleanup, agent cache invalidation, channel adapter stale-eviction
  • Docker compose refuses default passwords: missing DB_PASSWORD / DB_ROOT_PASSWORD / DASHSCOPE_API_KEY → fails fast with a clear error. The old hardcoded mateclaw123 default is gone. See .env.example.

Fixes worth calling out

  • Tool guard: now scans procedures defined in migrations too
  • Flyway version collisions: V8/V9 and V9/V10 had parallel PR merge collisions; renumbered clean
  • Skill reinstall: SKILL.md overwrites correctly, ClawHub fetchBundle retries on transient failure
  • MCP autodetect: Node.js PATH discovery for Desktop app users on macOS/Linux without node in GUI launch env
  • Filesystem MCP: disabled by default, ~ placeholder made explicit
  • Wiki slug canonicalization: same concept across multiple chunks no longer wastes LLM calls
  • Wiki failure routing: DNS/TLS/connection-refused now fail fast with root cause in UI
  • ChatConsole: auto-scroll doesn't hijack mouse; sidebar shows channel-specific icons
  • TalkMode WebSocket: no more stray error on chat page load
  • MessageBubble: backend error translation surfaces in the failed-message card

See git log v1.0.418..v1.1.0 for the complete list.


Upgrading from 1.0.x

See UPGRADING.md. Most users: no manual steps. FlywayRepairConfig heals the MySQL migration checksum drift, OllamaAutoDiscoveryRunner rewrites the broken :latest default, and stale mate_model_config rows converge on next restart.

Production deployers must set new required env vars in .env:

  • DB_PASSWORD (strong, not mateclaw123)
  • DB_ROOT_PASSWORD
  • JWT_SECRET (recommended)
  • MATECLAW_CORS_ALLOWED_ORIGINS (recommended)

Thanks

To everyone who reported bugs on Gitee (#IIYHLJ and earlier), to the community running MateClaw on flaky home networks and corner-case environments, to the folks patient enough to restart their Ollama four times: this release is shaped by you.

Next up: v1.2 roadmap focuses on channel tunnel / unified queue and deeper agent autonomy. Watch the RFC folder.


MateClaw 1.0.418

11 Apr 15:55
@matevip matevip


v1.0.418

Released: 2026-04-11

A Brand New Look

This is the most beautiful version of MateClaw we've ever made.

We rebuilt the entire interface from scratch: frosted glass effects, unified design tokens, responsive layout, collapsible panels. Every pixel has been rethought. This isn't a reskin. It's a reimagining of how you interact with your AI assistant.

The chat experience has been radically simplified: we removed the status noise you didn't need, made the approval flow feel natural, and introduced segmented message display with progressive loading and real-time persistence. You can watch the AI think step by step, like standing at a whiteboard with a brilliant colleague.

Three Big Things

1. True Workspace Isolation

Every workspace is now its own world. Its own agents, its own permissions, its own file boundaries.

  • Completely redesigned information architecture: navigation, settings, security all restructured
  • Workspace active directory (basePath) restricts AI to operate only within your designated folder, with symlink escape prevention
  • Permission hierarchy: owner → admin → member → viewer. Who sees what, who changes what, crystal clear

2. Full Internationalization

MateClaw now speaks two languages.

Every tool description, error message, security rule, and approval prompt on the backend: all bilingual Chinese/English. Not machine-translated. Structured i18n infrastructure. English-speaking users finally see the interface they deserve.

3. Chat-Based Scheduled Tasks

New CronJobTool: just say "check my server status every morning at 9am" in the chat, and the agent creates a cron job for you. No menus to find. No cron syntax to learn. This is what an AI operating system should feel like.

More Highlights

  • First-Run Experience: New onboarding flow, system health check (Doctor), one-click agent templates
  • Skill ZIP Import: Drag a ZIP file in, skill installed. That simple
  • Multi-Layer Memory: Pluggable provider architecture for short-term, long-term, and consolidation diary
  • ChatGPT Tool Calling: Native support for ChatGPT's tool calling protocol
  • Context Compression Upgrade: Smarter context pruning, 429 retry, iteration limits, repetition detection
  • Model Selector: Standalone component with grouped layout and instant search
  • Webchat Channel: Embeddable chat widget now fully configurable

Security Hardening

  • JWT secret validation enforced in production
  • CORS policy tightened
  • H2 Console auto-disabled in production
  • Login rate limiting
  • SQL injection fix
  • Internal error details no longer exposed to frontend

Under the Hood

  • Flyway Migration Framework: Replaces 5 manual migration classes, database versions finally trackable
  • Segmented Message Persistence: Thinking process saved in real-time, survives page refresh

Fixes

  • 16 modal components closing on backdrop click
  • Plan-Execute planning failures and persistent task list
  • Cross-conversation message leakage and tool call duplication
  • Stale approval UI after SSE disconnect
  • ThinkingSegment display and MessageBubble rendering
  • Wiki page Markdown rendering

v1.0.314

08 Apr 15:23
@matevip matevip


v1.0.314

Released: 2026-04-08

Added

  • LLM Wiki Knowledge Base: Three-layer knowledge management system (raw → wiki → agent). AI automatically digests raw documents into structured Wiki pages with [[bidirectional links]]. Supports text/PDF/DOCX import, local directory batch scanning, full-text search, source tracing, manual edit protection, and backlink browsing. Agents auto-inject Wiki summaries into prompts and read full pages on demand
  • Text-to-Speech (TTS): New speech synthesis with 3 providers: DashScope CosyVoice, OpenAI TTS, MiniMax T2A. Chat messages support one-click read-aloud
  • Speech-to-Text (STT): New speech recognition with 3 providers: DashScope Paraformer, OpenAI Whisper, MiniMax ASR
  • Music Generation: AI music creation with DashScope and MiniMax providers
  • Image Generation Upgrades: Added Google Imagen and MiniMax providers, sync/async dual mode. 6 providers total (DashScope, OpenAI DALL-E, fal.ai Flux, Zhipu CogView, Google Imagen, MiniMax)
  • Video Generation Upgrades: Added Runway and MiniMax (Hailuo) providers for text-to-video and image-to-video. 4 providers total
  • Search System Upgrades: SearchProvider chaining with keyless fallback (DuckDuckGo + SearXNG), advanced parameters (freshness/language/count), result caching, security wrapping, coexistence of provider-native search and tool search
  • ChatGPT OAuth Login: Log in to OpenAI ChatGPT Plus/Pro accounts via OAuth to use GPT-4o, o3, o4-mini models directly
  • Steve Jobs Persona Skill: New builtin persona skill for analyzing products and evaluating decisions through Jobs' perspective
  • Memory System Enhancements: Dreaming recall tracking with scored emergence, active retrieval tracking, multi-gate filtering, DREAMS.md consolidation diary, and Dreaming status API
  • Agent Runtime Enhancements: Context pruning, thinking recovery, channel health monitor, smart truncation, stale stream cleanup, configurable tool timeouts, and fine-grained phase status hints
  • Database Schema Unification: H2 and MySQL schema/seed data fully aligned (26 tables). SchemaMigration classes refactored to only perform incremental column migrations

Fixed

  • Wiki kbId Resolution: Agent wiki tools no longer require kbId parameter; auto-resolved from agentId, fixing LLM repeatedly guessing wrong IDs
  • Wiki JSON Parsing: Enhanced LLM response parsing: control character cleanup, JSON block extraction, per-chunk content_filter error skipping instead of full document failure
  • Wiki Data Safety: Fixed raw endpoint cross-KB mutation, tool access control null agentId bypass, and deletion operations not syncing counts
  • Async Media Attachments: Fixed async-generated images/videos/music attaching to current assistant message instead of creating new ones
  • Tool Discovery: Changed tool registration to blacklist mode to prevent new tools from being accidentally excluded
  • Steve Jobs Skill: Fixed incorrect beanName for tool dependencies
  • Memory Consolidation: Code review fixes for Dreaming and tool execution
  • Memory Performance: Fixed N+1 queries, removed redundant SHA-256 computation, capped DREAMS.md growth
  • Agent Stability: Prevented premature long-task exits, improved execution stability
  • MySQL Seed Data: Added 3 missing tools (Video/Image/Wiki) and 1 missing skill (sql_query) to MySQL data scripts

Changed

  • Wiki Processing Status: Added partial status for documents where some chunks succeeded but others failed
  • SchemaMigration Architecture: All 4 SchemaMigration classes simplified; table creation moved to schema.sql, migrations only handle incremental column additions
  • README: Updated snapshot highlights with Wiki knowledge base, multimodal creation, search upgrades
  • application.yml: Added mate.wiki configuration block with 8 configurable properties

MateClaw 1.0.108

06 Apr 12:29
@matevip matevip


MateClaw v1.0.108

What's New

Datasource SQL Query with ECharts Visualization

A new SQL query skill enables natural language database queries. Results are automatically rendered as interactive ECharts charts (bar, line, pie, etc.). Just ask a question about your data and get a visualization.

Multimodal Image & Video Support

  • Chat now supports image injection, drag-and-drop upload with improved UX
  • Video file upload, preview, and multimodal analysis support
  • Scanned PDFs are automatically processed with OCR fallback for text extraction
  • Authenticated attachment display for secure file access

Desktop Enhancements

  • Dynamic Port Allocation: Backend now dynamically allocates an available port, preventing startup failures when the default port is occupied
  • Native Menus: macOS/Windows native application menus with About dialog and Check for Updates
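
One common way to get the dynamic port behaviour described above is to try the preferred port first and, if it is taken, bind to port 0 so the operating system picks a free one. The sketch below illustrates that general technique with the standard java.net.ServerSocket API. PortAllocator and its methods are hypothetical names, and this is not necessarily how the MateClaw desktop app does it:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper; illustrates the technique only. */
final class PortAllocator {
    private PortAllocator() {}

    /**
     * Returns preferredPort (expected in 1..65535) if it can be bound,
     * otherwise a free port chosen by the operating system.
     */
    static int choosePort(int preferredPort) {
        if (isFree(preferredPort)) {
            return preferredPort;
        }
        // Port 0 asks the OS for any available port.
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Checks availability by briefly binding to the port and releasing it. */
    static boolean isFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

The probe socket is closed before the real server starts, so another process could take the port in between. Letting the server itself bind to port 0 and then reading back the assigned port avoids that race.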

WeChat & WeCom File Upload

File upload support added for WeChat personal and WeChat Work (WeCom) channels, enabling richer interactions on IM platforms.

Ollama Auto-Discovery & Activation

  • Auto-detect local Ollama instance on startup and set as default model
  • Auto-activate default model when provider API key is configured, with no manual selection needed

OpenRouter Free Vision Models

Three free, vision-capable models added out-of-the-box:

  • Qwen3.6 Plus (qwen/qwen3.6-plus:free)
  • Gemini 2.5 Flash (google/gemini-2.5-flash:free)
  • Llama 4 Maverick (meta-llama/llama-4-maverick:free)

Perfect for getting started without API costs.

Bug Fixes

  • OpenRouter Init Failure: Fixed duplicate INSERT statement in MySQL data scripts that caused OpenRouter provider and all its models to silently fail during initialization
  • ReAct Loop Termination: Prevented agents from entering degenerate repetitive cycles with hardened loop detection
  • PDF OCR Accuracy: Fixed OCR trigger misjudgment, false success reports, and duplicate image injection
  • SVG Handling: SVG files are now skipped in multimodal injection; 400 errors no longer trigger infinite retries
  • IM Channel Images: Fixed mediaId-to-image-path resolution for IM channel multimodal messages
  • Browser Tool Stability: Fixed idle watchdog resource leak on repeated start/stop cycles; added Windows and Docker Chromium launch arguments
  • Ollama Connectivity: Added placeholder API key for Ollama provider to prevent call failures
  • Model Validation: Removed legacy DashScope-only restriction that blocked other providers
  • H2 Schema Migration: Check column existence before ALTER TABLE to suppress duplicate column errors
  • Skill Sync: Workspace now syncs correctly when builtin skills are updated or version-bumped
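
The H2 Schema Migration fix above comes down to making column additions idempotent: look at the existing columns first, and only emit ALTER TABLE when the column is missing. A minimal sketch of that decision, assuming the set of existing column names has already been read (in real code it could come from java.sql.DatabaseMetaData.getColumns). AddColumnMigration and the table and column names in the example are hypothetical:

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper; shows the existence check, not MateClaw's migration classes. */
final class AddColumnMigration {
    private AddColumnMigration() {}

    /**
     * Returns the ALTER TABLE statement to run, or empty if the column already exists.
     * Names are compared case-insensitively because H2 reports unquoted
     * identifiers in upper case by default.
     */
    static Optional<String> addColumnIfMissing(
            Set<String> existingColumns, String table, String column, String definition) {
        boolean present = existingColumns.stream()
                .anyMatch(name -> name.equalsIgnoreCase(column));
        if (present) {
            return Optional.empty(); // nothing to do, so no duplicate column error
        }
        return Optional.of("ALTER TABLE " + table + " ADD COLUMN " + column + " " + definition);
    }
}
```

For example, `addColumnIfMissing(Set.of("ID", "NAME"), "demo_table", "name", "VARCHAR(64)")` returns empty, so rerunning the migration against an up-to-date schema is a no-op.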

Changes

  • Copyright updated to 2026; version number now read dynamically from package.json
  • User guide documentation fully rewritten to match current codebase
  • Mobile responsive layout with slide-in drawer navigation
  • Hardcoded i18n strings replaced with proper locale keys

Downloads

Platform | File | Notes
macOS (Apple Silicon) | MateClaw_1.0.108_arm64.dmg | Recommended for M1/M2/M3/M4/M5
macOS (Apple Silicon) | MateClaw_1.0.108_arm64.zip | zip format
macOS (Intel) | MateClaw_1.0.108_x64.dmg | Intel Mac
macOS (Intel) | MateClaw_1.0.108_x64.zip | zip format
Windows (x64) | MateClaw_1.0.108_Setup.exe | Most Windows PCs
Windows (x64) | MateClaw_1.0.108_x64_Setup.exe | Explicit x64
Windows (ARM64) | MateClaw_1.0.108_arm64_Setup.exe | ARM Windows

All installers bundle JRE 21, so no Java installation is needed. Existing users will receive this update automatically.

MateClaw v1.0.101

05 Apr 11:59
@matevip matevip

What's New

Mobile Responsive UI

  • Sidebar transforms to slide-in drawer with hamburger menu on mobile (<=768px)
  • Conversation panel becomes a toggleable overlay on mobile
  • Welcome screen centers properly with auto text wrapping, single-column suggestion cards
  • Chat header auto-simplifies: icon-only agent badge, adaptive model selector
  • Reduced padding/gaps across all chat components for mobile screens

Drag & Drop File Upload

  • Drag-and-drop files and folders directly into the chat area
  • Electron: directory references via local path; Web: recursive file collection and upload

Multi-Agent Collaboration

  • DelegateAgentTool for agent-to-agent task delegation

LLM Context Awareness

  • Current datetime automatically injected into LLM context for time-aware responses
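
Datetime injection of this kind is usually just a line prepended to the system prompt. The sketch below shows one way it could be done with java.time. The class name, the "Current datetime:" label, and the ISO-8601 format are assumptions made for illustration, not details confirmed by these notes:

```java
import java.time.Clock;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper; the prompt wording and format are assumptions. */
final class DateTimeContext {
    private DateTimeContext() {}

    /** Prepends the current date and time (ISO-8601 with offset) to a system prompt. */
    static String withCurrentDateTime(String systemPrompt, Clock clock) {
        String now = ZonedDateTime.now(clock).format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        return "Current datetime: " + now + "\n\n" + systemPrompt;
    }
}
```

In production the caller would pass `Clock.systemDefaultZone()`. Taking the Clock as a parameter keeps the function deterministic under test.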

MCP Server

  • Pre-configured GitHub MCP Server in seed data (ready to use out of the box)

Ollama Auto-Discovery

  • Auto-detect local Ollama instance on startup
  • Pre-configured 6 popular local models (Qwen3, Llama, DeepSeek, Gemma, Phi, Mistral)
  • Local providers sorted first in model management UI

Model Management Enhancements

  • Provider list grouped by Local / Cloud with section headers
  • Zhipu AI models updated to GLM-5 series (GLM-5-Turbo / GLM-5V-Turbo / GLM-5 / GLM-5.1)
  • 20+ model providers supported

API Docs

  • Replaced Knife4j with SpringDoc OpenAPI 2.8.16 (/swagger-ui.html)

Bug Fixes

  • Security: Fixed SPA frontend route refresh returning 401
  • i18n: Window title is now set dynamically from the language pack instead of being hardcoded
  • i18n: Fixed 5 hardcoded Chinese strings in approval bar
  • i18n: Fixed hardcoded time formatting (now locale-aware)
  • Guard: Aligned tool guard rule names with runtime @Tool method names
  • LLM: Fixed Zhipu connection test 404
  • UI: Fixed suggestion cards grid misalignment with longer text
  • Upload: File upload size limit raised to 100 MB

Download

Platform | File | Note
macOS Apple Silicon | MateClaw_1.0.101_arm64.dmg | M1 / M2 / M3 / M4 / M5
macOS Intel | MateClaw_1.0.101_x64.dmg | Intel Mac
Windows | MateClaw_1.0.101_Setup.exe | Windows 10/11 (x64+arm64)
Windows x64 | MateClaw_1.0.101_x64_Setup.exe | Windows 10/11 x64
Windows ARM64 | MateClaw_1.0.101_arm64_Setup.exe | Windows ARM64

zip / blockmap / yml files are for auto-update support.

Links

Full Changelog: v1.0.0...v1.0.101
