-
Notifications
You must be signed in to change notification settings - Fork 11.2k
plan-and-execute: the 'how' layer downstream of grill → prd → issues #300
-
I really like these skills, they're really good. grill-with-docs, to-prd, and to-issues are part of my daily workflow now. So first off, thanks.
One thing I kept wanting was something for the last step: actually executing a slice once it's been planned. There wasn't anything in the repo for that, so I made a skill to fill the gap. I've been running variations of it for the past couple weeks and it's been working great, so figured I'd share it in case it's useful to anyone. Suggestions very welcome too.
How vs what
I tried hard to keep this from turning into another "own the whole process" thing (GSD/BMAD/Spec-Kit, which your README pushes back on, rightly). It only owns the execution strategy, the how. The what/why still lives upstream in the grill → PRD → issues flow. So it's not competing with the alignment stuff, it just picks up where that leaves off.
Where it slots in
It reads whatever upstream artifact is already there, or plans fresh if there isn't one:
- a
/to-issuesissue → that's the work spec - a
/to-prdPRD-issue → reads it directly - a
/handoffdoc → picks up from a compacted context - or just the live session context right after a
/grill-with-docsconversation - or nothing, and it plans from scratch
In practice I'll kick it off with a PR/issue number, a handoff doc path, or just inline after a grilling session. It's pretty flexible about input. /grill-with-docs sits naturally upstream of it, that's where the what/why gets settled and this skill is careful not to re-open any of that.
What it actually does
Once there's a plan, it runs the work as a serial chain of fresh subagents passing a shared baton, with a verifier at the end:
- The plan breaks the work into sequential scopes, each sized to a token budget (target ~50K, hard ceiling 80K, counting everything a worker holds and not just lines changed). It leans toward more, smaller scopes.
- The orchestrator only ever holds the plan, the scope list, and one-line acks. It never holds the actual work product, so its context stays small even on a long run.
- Each scope gets a fresh worker that reads the baton plus the prior diffs, does just its scope, and reports back one line.
- The verifier checks each real diff against its acceptance criteria, runs build + tests, and can spawn one fix-worker if something fails.
The whole point is to stop long runs from either blowing out the orchestrator's context window or drifting away from the plan. Every worker starts clean and the baton carries just enough to keep going.
Rough edges (would love input here)
- The 50K / 80K budgets are just heuristics I tuned on my own runs. No idea how well they hold up elsewhere.
- The verifier caps at 2 fix-rounds before it surfaces to you. That number is basically a guess.
- The baton is a hand-rolled XML-ish format. Not sure it's the right shape.
Here's the full SKILL.md:
--- name: plan-and-execute description: Plan a task's execution strategy, then run it as a serial chain of fresh subagents passing a shared baton, ending with a verifier. Ingests an upstream artifact (issue/PRD/handoff doc) when present; plans from scratch when not. disable-model-invocation: true --- # Plan and execute Own execution strategy (the *how*), not requirements (the *what/why* — that's upstream in issues/PRDs). Preserve the plan's intent across the chain. ## Plan first Detect mode: - **Ingest** — upstream artifact specifies the work (issue, PRD-issue, or `/handoff` doc pointing at one). Read it; plan execution strategy only. Never re-litigate settled scope. - **Fresh** — no artifact. Plan from scratch. Then: - Plan mode on → the plan must name its execution strategy (below). - Plan mode off → ask if they want it before anything else; suggest nothing until they answer. If no: resolve open questions with AskUserQuestion, then draft a plan under the same rules. - No execution until the user approves the plan. ## Before executing, ask - How to split into **sequential scopes**. Default to *more, smaller* scopes — when a scope feels borderline, split it. Budget per scope: **target ~50K tokens, hard ceiling 80K**. The ceiling counts everything a worker must hold — files it reads, the diffs it produces, and its own reasoning/output — not just lines changed. Anything you project over 80K must be split *before* spawning, not discovered mid-run. Order scopes so each builds on the last. - Concrete splitting cues — break the scope when it spans more than one of: a new data model + its API surface; production code + its test suite; more than ~3–4 files of net-new logic; a migration + the code that depends on it. Each of these is its own scope. - Whether to commit once the verifier is clean. If yes: commit from baton summaries; commit only this session's files; if files are pre-staged, ask how to handle; once file count is known, ask single vs multiple commits. ## Pipeline Orchestrator holds only the plan, scope list, and one-line acks — never work product. Spawn each agent in turn: 1. Ingest artifact (or plan fresh) → decompose into sequential scopes (target ~50K, ceiling 80K each — bias toward smaller) → seed the baton: one `<agent-n-slug>` block per scope, each with `<acceptance>` filled in and the remaining tags as empty placeholders. 2. Per scope, in order, spawn one worker. Each: reads the baton + the diffs named in prior `<anchors>`, does exactly its scope, fills in the empty tags of its own block (never creates a new block), returns one line: `scope <n> done` | `scope <n> incomplete: <what's left>` | `scope <n> blocked: <why>`. 3. Spawn the verifier: for each scope, check the real diff (via `<anchors>`) against its `<acceptance>`; run build + tests; confirm the whole meets `<goal>`. Return pass, or fail with specifics. On finish: brief summary + baton path. ## Failure branches - **Incomplete** (worker finds the scope larger than planned and stops partway) → orchestrator splits the remainder into a new scope and continues from there. - **Blocked** → orchestrator halts the chain — does not spawn the next worker onto a broken diff. Spawn a fix worker or surface to the user. - **Verifier fails** → orchestrator spawns one fix worker (fresh; reads baton + the failures), then re-verifies. After 2 failed rounds, stop and surface to the user. ## Baton Shared file passed worker→worker. Write for agents, not humans. Create in TEMP (`mktemp`; PowerShell `New-TemporaryFile`). Lightweight — points to the plan doc for granular instructions, never restates them. Header `<context>`: `<plan-ref>`, `<goal>` (one line), `<approach>` (one line: the strategic/tonal steer, e.g. "surgical, precedent-driven, mirror #57"; no facts, file lists, or step detail — those live in the plan). The orchestrator seeds one `<agent-n-slug>` block per scope with `<acceptance>` filled and every other tag present but empty. The worker fills the empty tags in its own block — it never creates a new block: - `<acceptance>` — checkable done-criteria; orchestrator writes this when seeding; the worker never edits it - `<summary>` - `<anchors>` — handles to read real work: touched paths, new/changed signatures, migration IDs, test names - `<size-vs-plan>` — exactly one word: `smaller` | `as-expected` | `larger` (how the actual work compared to the planned scope; not a t-shirt size, no prose). On `larger`, the orchestrator must shrink every remaining scope before continuing — a `larger` reading means the plan is under-splitting - `<approx-context-used>` — absolute token estimate, e.g. `~30K`; never a percentage (ambiguous without window size — convert percen×ばつwindow if that's all you have). Over the ~50K target = re-examine later scopes for the same drift; over the 80K ceiling = flag the scope and split whatever remains downstream - `<concerns>` — problems hit or risks for later workers - `<instructions-for-next-agent>`
That's it. Take it, change it, break it, whatever. Let me know if it's useful or if I'm doing something dumb. Thanks again for the skills.
Beta Was this translation helpful? Give feedback.