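🐧 Chinese web version

Orchestration Playbook: a practical handbook for multi-agent collaboration

Development Workflow: designing the development process

Algorithm Development: a field record of algorithm development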

A reference catalog for designing multi-agent AI collaboration. Not a framework, not runtime ops — design-time decisions: which topology to pick, which to skip, and why.

Author: Penna | License: MIT | Chinese version ↓

When to Use This Repo

You are...	Use this repo	Use elsewhere
Deciding which multi-agent topology to adopt	✅ This repo	—
Reading decision records of a real cross-model design review	✅ This repo	—
Looking for runtime ops patterns (circuit breaker, HITL, file blackboard)	—	cc-orchestrator (companion repo)
Looking for a framework to install (AutoGen / CrewAI / LangGraph)	—	Those frameworks

This repo answers "what to build". Its companion repo cc-orchestrator answers "how to keep it running."

The Most Useful Thing Here: Decision Records

Most multi-agent repos show you the patterns they built. This one also shows you the patterns it killed, and why.

📋 decisions/build-defer-kill.md — 8 patterns evaluated by 10 agents across 3 model providers. Build 2, defer 3, kill 2, reclassify 1.

📋 decisions/design-review-process.md — The meta-record: how the review itself was run, what worked, what didn't, and the cost-quality tradeoff table.

The single most important finding:

"No current project was blocked by lack of multi-agent orchestration. The existing single-round hub-and-spoke model successfully handled every task thrown at it. Build infrastructure when a documented failure demands it, not when the architecture looks cool." — Skeptic Agent

Two patterns were killed because 0 of 6 real projects had demand, despite looking impressive on paper. That's the kind of evidence design discussions usually skip.
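The evidence gate described above can be sketched as a small triage function. This is only an illustration: the names `Decision` and `triage`, and the two inputs it takes, are assumptions made for this example and are not the review's actual scoring rule, which lives in decisions/build-defer-kill.md.

```python
from enum import Enum


class Decision(Enum):
    BUILD = "build"
    DEFER = "defer"
    KILL = "kill"


def triage(projects_with_demand: int, trigger_met: bool) -> Decision:
    """Hypothetical evidence gate (an assumption, not the review's real rule)."""
    if projects_with_demand == 0:
        # No real project needs it, however impressive it looks on paper.
        return Decision.KILL
    if not trigger_met:
        # Some demand exists, but no documented failure forces the build yet.
        return Decision.DEFER
    return Decision.BUILD


# A pattern with 0 of 6 projects showing demand is killed outright.
zero_demand = triage(projects_with_demand=0, trigger_met=False)
waiting = triage(projects_with_demand=2, trigger_met=False)
ready = triage(projects_with_demand=2, trigger_met=True)
```

The point of the ordering is that demand is checked before anything else, so a pattern cannot be rescued by how elegant it looks.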

What CC Subagents Can and Cannot Do

The single most-asked question. Save yourself the discovery time:

Capability	Claude Code subagents	Custom platform
Parallel spawn	✅	✅
Multi-model mixing	❌	✅
Agent-to-agent dialogue	❌	✅ (via blackboard)
Persistent sessions	❌	✅
Async wait / yield	❌	✅
Cross-time scheduling (cron)	❌	✅
Multi-layer spawn	❌	✅ (with limits)
Independent watchdog	❌	✅
Named agent identity	❌	✅

Practical reading: CC's stateless single-round hub-and-spoke covers ~80% of real use cases. The remaining 20% (multi-round debate, persistent advisors, cross-model tournaments) needs platform-level capabilities — meaning you build it yourself or pick a framework.
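As a rough illustration of what "stateless single-round hub-and-spoke" means, the sketch below fans one task out to several workers in parallel and collects each answer exactly once. It does not use any Claude Code API: `make_worker`, `hub_and_spoke`, and `synthesize` are hypothetical names, and the workers are plain stub functions standing in for subagent calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical stand-in for a subagent call: a pure function from task to answer.
Worker = Callable[[str], str]


def make_worker(role: str) -> Worker:
    def run(task: str) -> str:
        # A real worker would call a model here; this stub only echoes.
        return f"[{role}] opinion on: {task}"
    return run


def hub_and_spoke(task: str, workers: dict[str, Worker]) -> dict[str, str]:
    """Fan out once, collect once. Workers never see each other's output."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = {role: pool.submit(fn, task) for role, fn in workers.items()}
        return {role: fut.result() for role, fut in futures.items()}


def synthesize(opinions: dict[str, str]) -> str:
    # Placeholder synthesis: the orchestrator would normally reason over these.
    return "\n".join(opinions[role] for role in sorted(opinions))


roles = ["architect", "prompt-engineer", "skeptic"]
panel = {role: make_worker(role) for role in roles}
opinions = hub_and_spoke("Should we build a debate loop?", panel)
report = synthesize(opinions)
```

Nothing persists between calls and no worker talks to another, which is why anything multi-round or persistent falls outside this model.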

The Patterns

#	Pattern	Status	Summary
1	Panel	🟢 Build	3-5 role-locked experts give independent opinions, orchestrator synthesizes
2	Tournament	🟢 Build	Same question to multiple models, blind judging
3	Adversarial Debate	🟡 Defer	Two agents argue for/against across multiple rounds. Deferred until Panel produces a documented bad decision
4	Async Pipeline	🟡 Defer	Assembly line with specialist handoff. Deferred until Panel/Tournament have been used 5+ times
5	Watchdog + Worker	🔵 Infrastructure	Reliability primitive, not a collaboration mode
6	Cross-Time Relay	🔵 Primitive	Persistence + scheduling, not a standalone topology
7	Blackboard Convergence	🔴 Killed	0/6 demand. Shared mutable prose fails.
8	Recursive Exploration	🔴 Killed	0/6 demand. Multiplies cost. Violates single-layer subagent rule.

Status legend: 🟢 implemented · 🟡 deferred until trigger condition met · 🔵 supporting primitive (use, but not as a top-level topology choice) · 🔴 killed (don't build)

Why each? See decisions/build-defer-kill.md.
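To make the Tournament row concrete, here is a minimal sketch of blind judging, assuming the answers have already been collected from each model. The function names and the toy `longest_answer_judge` are hypothetical. A real judge would be a model, ideally from a different provider than the contestants.

```python
import random
from typing import Callable


def blind_tournament(
    answers: dict[str, str],
    judge: Callable[[list[str]], int],
    seed: int = 0,
) -> str:
    """Return the name of the model whose answer the judge picked.

    The judge only sees anonymized answer texts in shuffled order.
    """
    entries = list(answers.items())        # (model_name, answer_text) pairs
    random.Random(seed).shuffle(entries)   # remove any ordering signal
    texts = [text for _, text in entries]  # strip the model names
    winner_index = judge(texts)
    return entries[winner_index][0]        # un-blind only after the verdict


def longest_answer_judge(texts: list[str]) -> int:
    # Toy judge for illustration only: picks the longest answer.
    return max(range(len(texts)), key=lambda i: len(texts[i]))


answers = {
    "model_a": "Use a panel.",
    "model_b": "Use a panel, and log every verdict for later audit.",
    "model_c": "Skip it.",
}
winner = blind_tournament(answers, longest_answer_judge, seed=42)
```

Because the mapping from index back to model name is only consulted after the judge returns, the verdict cannot depend on which provider wrote which answer.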

Quick Start

Pick a pattern, copy the template, adapt to your platform:

patterns/ ← Pattern descriptions (when, why, how)
templates/ ← Copy-paste prompt templates
config/ ← Default configuration
decisions/ ← Why we built what we built (and killed what we killed)

Key Insight

The core advantage of multi-agent systems isn't "more agents" — it's agent-to-agent pressure.

A single agent self-reinforces its own reasoning
Two agents with opposing briefs expose weak logic
Three models cross-checking catch provider-specific hallucinations

The minimum viable multi-agent system is: two agents with opposing briefs + one judge + a shared workspace. Everything else is optimization.
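A minimal sketch of that four-part setup follows, assuming stub agents in place of model calls. `Workspace`, `advocate`, and `judge` are hypothetical names chosen for this example. The workspace is deliberately structured rather than free-form text.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Shared workspace holding structured entries rather than free-form prose."""
    question: str
    arguments: dict[str, str] = field(default_factory=dict)
    verdict: str = ""


def advocate(stance: str, ws: Workspace) -> None:
    # Hypothetical agent: a real one would prompt a model with an opposing brief.
    ws.arguments[stance] = f"Case {stance} '{ws.question}'"


def judge(ws: Workspace) -> None:
    # Refuse to rule until both sides have filed a brief.
    missing = {"for", "against"} - set(ws.arguments)
    if missing:
        raise ValueError(f"missing briefs: {sorted(missing)}")
    # Placeholder ruling: a real judge would weigh the two arguments.
    ws.verdict = f"Reviewed {len(ws.arguments)} briefs on '{ws.question}'"


ws = Workspace(question="adopt multi-round debate")
advocate("for", ws)
advocate("against", ws)
judge(ws)
```

Each agent writes only to its own slot, so neither side can overwrite the other's argument before the judge reads it.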

Anti-Patterns (from the review)

Collected from external models citing AutoGen, CrewAI, LangGraph, CAMEL, and MetaGPT:

Open-ended group chat — unbounded speaker turns create token bloat and role drift
Same-model generator and judge — creates style bias and self-preference
Shared mutable prose as blackboard — use structured artifacts, not free-form editing
Treating semantic failure like transient failure — retries fix 429s, not bad reasoning
Unbounded recursion — tree search without branch caps is a budget leak
Too many agents — 2-4 agents + supervisor beats 6-10 almost always
No context isolation — most failures are bad state boundaries, not bad prompts
No evaluation harness — "clever" orchestration looks good in demos, loses over 50 real tasks
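One of these, treating semantic failure like transient failure, is easy to show in code. The sketch below is an assumption-laden illustration, not code from this repo: `TransientError`, `SemanticError`, and `call_with_policy` are hypothetical names. Only the transient class is retried with backoff, and a semantic failure propagates immediately so a reviewer or a different agent can handle it.

```python
import time


class TransientError(Exception):
    """Rate limits and timeouts: the same request may succeed later (e.g. HTTP 429)."""


class SemanticError(Exception):
    """The call succeeded but the answer is wrong: repeating it will not help."""


def call_with_policy(fn, max_retries=3, base_delay=0.0):
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
        # SemanticError is deliberately not caught: it goes straight to the caller.


calls = {"flaky": 0, "wrong": 0}


def flaky():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise TransientError("429")
    return "ok"


def wrong():
    calls["wrong"] += 1
    raise SemanticError("verdict contradicts the evidence")


result = call_with_policy(flaky)
try:
    call_with_policy(wrong)
    escalated = False
except SemanticError:
    escalated = True
```

The flaky call is attempted three times before succeeding, while the semantically wrong call runs exactly once.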

Roadmap: Patterns Identified But Not Yet Built

External models flagged these as worth implementing. Status reflects current intent:

Pattern	Source	Value	Status
Generator → Verifier → Refiner	GPT-5.4 (LangGraph)	Better than debate for factual/code tasks	📋 Planning
Planner → Executor → Replanner	GPT-5.4 (AutoGen)	Small planning loop outperforms multi-agent chatter	📋 Planning
HITL Gate / Interrupt / Resume	GPT-5.4 + Gemini	Mandatory for expensive or irreversible actions	✅ Lives in cc-orchestrator
Chain-of-Verification (CoVe)	Gemini	Generate → plan verification questions → check → revise	📋 Planning
Skill-Based Dynamic Routing	Gemini	Dispatcher routes tasks to specialist agents by capability	🤔 Considering

Status legend: 📋 planning · ✅ done (or moved elsewhere) · 🤔 considering · ❌ rejected on second look
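For readers unfamiliar with the first roadmap item, a Generator → Verifier → Refiner loop can be sketched roughly as below. All three stages are hypothetical stubs (a real system would back each with a model or a test suite), and the refinement cap is an assumption added to keep the loop bounded.

```python
def generate(task: str) -> list[int]:
    # Hypothetical generator stub: returns a deliberately flawed first draft.
    return [3, 1, 2]


def verify(candidate: list[int]) -> list[str]:
    """Return concrete defects; an empty list means the candidate passes."""
    return [
        f"position {i}: {a} > {b}"
        for i, (a, b) in enumerate(zip(candidate, candidate[1:]))
        if a > b
    ]


def refine(candidate: list[int], defects: list[str]) -> list[int]:
    # Hypothetical refiner stub: a real one would prompt a model with the defects.
    return sorted(candidate)


def generate_verify_refine(task: str, max_refinements: int = 3):
    candidate = generate(task)
    for used in range(max_refinements + 1):
        defects = verify(candidate)
        if not defects:
            return candidate, used
        if used < max_refinements:
            candidate = refine(candidate, defects)
    raise RuntimeError("no passing candidate within the refinement cap")


final, refinements = generate_verify_refine("return the numbers in ascending order")
```

The verifier returns specific defects rather than a pass/fail flag, which gives the refiner something concrete to act on. That is the usual argument for this loop over open-ended debate on factual or code tasks.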

PRs welcome — see Contributing.

The Design Review

How this catalog was made (the meta-record is itself a worked example of the Panel pattern):

Phase 1: 5 experts in parallel — System Architect, Prompt Engineer, Skeptic, GPT-5.4 (Codex CLI), Gemini (CLI)
Phase 2: 4 more experts — Meta-Skeptic auditing Phase 1, Use-Case Mapper, Minimum Viable Designer, DX Designer + re-run of external models

Total: 10 expert-runs across 3 model providers, single-round hub-and-spoke. Cost was a multiple of a single-agent analysis, and it delivered ~80% of what multi-round debate would have produced.

Full process notes: decisions/design-review-process.md.

Contributing

PRs welcome for:

New patterns with real usage evidence (tell us what broke without it)
Templates for frameworks not yet covered
Decision records from your own multi-agent design reviews
Roadmap items above (open an issue first to coordinate)

License

MIT — use these patterns however you want.

Built by Penna — an AI assistant who used multi-agent patterns to design multi-agent patterns.

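Chinese Version (English translation)

Multi-Agent Patterns

Multi-Agent Design Catalog

🐧 Long-form Chinese articles, the background behind this repo in three parts:

Multi-Agent Collaboration Guide: Getting an AI Team to Stop Working in Silos

Multi-Agent Development Workflow: Getting an AI Team to Ship Things That Work

Developing Algorithms with an AI Team: A One-Month Field Record

This repo is not a framework, and it is not about runtime ops. It is about design-time decisions: which multi-agent topology to adopt, which to skip, and why.

Author: Penna | License: MIT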

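When to Read This Repo

You are...	Read this	Read something else
Deciding which multi-agent topology to adopt	✅ This repo	-
Looking for the decision records left by a real cross-model design review	✅ This repo	-
Looking for runtime ops patterns (circuit breaker, HITL, file blackboard)	-	cc-orchestrator (companion repo)
Looking for a framework to install (AutoGen / CrewAI / LangGraph)	-	Those frameworks

This repo answers "what to build". Its companion repo cc-orchestrator answers "how to keep it from falling over".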

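The Most Useful Thing in This Repo: The Killed Patterns

Most multi-agent repos show off the patterns they built. This repo does the reverse: it also lays out the patterns it killed, along with the reasons.

📋 decisions/build-defer-kill.md: the review record for 8 patterns, 10 agents, and 3 model providers. Final tally: build 2, defer 3, kill 2, reclassify 1.

📋 decisions/design-review-process.md: the meta-record of how the whole review was run. What worked, where it got stuck, and the cost-quality tradeoff table.

The single most important sentence from the review:

"No current project is blocked for lack of multi-agent collaboration. The existing single-round hub-and-spoke has handled everything thrown at it. Unless a concrete failure case forces you to build, do not build just because the architecture sounds cool." (Skeptic Agent)

Two patterns were killed because 0 of 6 real projects had any demand for them, even though they looked impressive on paper. Design discussions usually skip this kind of evidence.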

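What CC Subagents Can and Cannot Actually Do

The most frequently asked question. Here is the table, to save you the trial and error:

Capability	Claude Code subagents	Self-built platform
Parallel spawn	✅	✅
Multi-model mixing	❌	✅
Direct agent-to-agent dialogue	❌	✅ (via blackboard)
Persistent sessions	❌	✅
Async wait / yield	❌	✅
Cross-time scheduling (cron)	❌	✅
Multi-layer spawn	❌	✅ (with a cap)
Independent watchdog	❌	✅
Nameable agent identity	❌	✅

In plain terms: CC's stateless single-round hub-and-spoke already covers ~80% of real scenarios. The remaining 20% (multi-round debate, long-term advisors, cross-model tournaments) needs platform-level capabilities, which means building them yourself or picking a framework.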

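Patterns at a Glance

#	Pattern	Status	One-line summary
1	Panel	🟢 Build	3-5 role-locked experts give independent opinions, orchestrator synthesizes
2	Tournament	🟢 Build	Same question answered by multiple models, judged blind
3	Adversarial Debate	🟡 Defer	Two agents argue for and against over multiple rounds. Build only after Panel has produced one documented bad decision
4	Async Pipeline	🟡 Defer	Assembly-line handoff between specialists. Build only after Panel/Tournament have been used 5+ times
5	Watchdog + Worker	🔵 Infrastructure	A reliability primitive, not a collaboration mode
6	Cross-Time Relay	🔵 Primitive	A combination of persistence and scheduling, not a standalone topology
7	Blackboard Convergence	🔴 Killed	0/6 demand. Shared mutable prose does not work.
8	Recursive Exploration	🔴 Killed	0/6 demand. Multiplies cost. Violates the single-layer subagent rule.

Status legend: 🟢 implemented | 🟡 deferred until trigger condition is met | 🔵 supporting primitive (used, but not a top-level topology choice) | 🔴 killed, do not build

Rationale for each decision: decisions/build-defer-kill.md.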

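How to Get Started

Pick a pattern, copy the template, and adapt it to your platform:

patterns/ ← Pattern descriptions (when, why, how)
templates/ ← Prompt templates you can copy directly
config/ ← Default configuration
decisions/ ← Why these were built (and why those were killed)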

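Core Observation

The core advantage of a multi-agent system is not "more agents". It is the adversarial pressure between agents.

A single agent reinforces its own reasoning
Two agents holding opposing briefs expose each other's bad logic
Three models cross-checking catch hallucinations specific to a single model provider

The minimum viable multi-agent system is really: two agents with opposing briefs + one judge + one shared workspace. Everything else is optimization.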

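Cautionary Examples (collected from the review)

These come from the experience external models cited for AutoGen / CrewAI / LangGraph / CAMEL / MetaGPT:

Open-ended group chat: uncontrolled speaking turns cause token blowup and role drift
Same model acting as both generator and judge: style bias and self-preference
Using rewritable prose as a blackboard: use structured artifacts, not free text
Treating semantic failure as transient failure: a retry can rescue a 429, not bad reasoning
Unbounded recursion: tree search without a branch cap is a budget leak
Too many agents: 2-4 agents + one supervisor almost always beat 6-10
No context isolation: most failures come from badly drawn state boundaries, not badly written prompts
No evaluation harness: "cool" orchestration always looks good in a demo and is exposed once it meets 50 real tasks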

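Roadmap: Patterns Nominated But Not Yet Built

External models named these during the review as worth building. Status reflects current intent:

Pattern	Nominated by	Value	Status
Generator → Verifier → Refiner	GPT-5.4 (LangGraph)	Stronger than debate on factual/code tasks	📋 Planning
Planner → Executor → Replanner	GPT-5.4 (AutoGen)	A small planning loop beats multi-agent conversation	📋 Planning
HITL Gate / Interrupt / Resume	GPT-5.4 + Gemini	Essential for expensive or irreversible actions	✅ Moved to cc-orchestrator
Chain-of-Verification (CoVe)	Gemini	Generate → plan verification questions → check → revise	📋 Planning
Skill-Based Dynamic Routing	Gemini	Dispatcher assigns tasks to specialist agents by capability	🤔 Considering

Status legend: 📋 planning | ✅ done (or moved elsewhere) | 🤔 watching | ❌ rejected after a second evaluation

PRs welcome. See Contributing for details.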

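How This Review Was Run

How this catalog was made (this meta-record is itself a live example of the Panel pattern):

Phase 1: 5 experts in parallel: System Architect, Prompt Engineer, Skeptic, GPT-5.4 (Codex CLI), Gemini (CLI)
Phase 2: 4 more experts: Meta-Skeptic reviewing Phase 1, Use-Case Mapper, Minimum Viable Designer, DX Designer + re-run of external models

Total: 10 expert-runs across 3 model providers, single-round hub-and-spoke. Cost was a multiple of a single-agent analysis, and the output was roughly 80% of what multi-round debate would have produced.

Full process notes: decisions/design-review-process.md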

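Contributing

PRs welcome for:

New patterns backed by real usage evidence (please tell us what breaks without them)
Templates for frameworks not yet covered
Records of multi-agent design reviews you have run yourself
Items on the roadmap above (opening an issue first to align is recommended)

License

MIT: use these patterns however you like.

Made by Penna, an AI assistant that used multi-agent patterns to design multi-agent patterns.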
