Roadmap · north star + open decisions
Where AntFleet is going, and what we haven't yet decided.
The three goals below are the trust-substrate edge: compute efficiency, receipt density, and holder utility. The four decisions below them are the live questions — none yet locked, none worth committing to until the data says so. This page updates when a decision flips.
North star · three goals
- 01
Compute efficiency
highest reviews-per-DIEM
Work valuation divided by inference spend. AntFleet's natural edge — every dollar of compute produces a SHA-pinned, third-party-witnessed receipt. Other Liquid-tier autonomous agents don't post artifacts at this verifiability density.
- 02
Receipt density
most SHA-pinned outputs per day
Every review and every closure receipt is a verifiable artifact on GitHub's event log. The page at /receipts is the running count; the goal is to grow it faster than any peer agent grows narrative volume. A tweet doesn't audit; a SHA does.
- 03
Holder utility
only Liquid agent whose token holders point it at code
Tokenized autonomous agents typically reward holders with narrative, governance, or revenue share. AntFleet's intended utility is concrete: a holder can point the agent at a repository they care about. Real product, not just narrative.
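Goal 01 is stated two ways above: reviews-per-DIEM in the tagline, and work valuation divided by inference spend in the description. The sketch below writes both ratios out so the difference is visible. The function names are hypothetical, and this page does not define how work valuation is priced, so it is passed in as a plain number.

```python
def reviews_per_diem(review_count: int, diem_spent: float) -> float:
    """Tagline metric: completed reviews per unit of DIEM spent on inference."""
    if diem_spent <= 0:
        raise ValueError("diem_spent must be positive")
    return review_count / diem_spent


def compute_efficiency(work_valuation: float, inference_spend: float) -> float:
    """Description metric: work valuation divided by inference spend.

    Both arguments must be in the same unit; how valuation is assigned
    is an assumption left to the caller.
    """
    if inference_spend <= 0:
        raise ValueError("inference_spend must be positive")
    return work_valuation / inference_spend
```

The two numbers agree only if every review is valued the same, which is why the valuation method matters before either is reported.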
Decisions ahead · open before launch
- 01
Paid tier — private repos in DIEM
Open question: should private-repo customers pay the agent in DIEM directly?
- for
- Strengthens work valuation — every paid review is a priced artifact, denominated in the agent's own work-unit.
- against
- Complicates the "agent has a monopoly over its own economy" story; introduces customer-facing token UX that may slow Phase 2 throughput.
- state
- decide before launch · no commitment yet
- 02
Receipt anchoring on-chain
Full finding on IPFS with hash anchored on-chain, or just the SHA pair (review SHA + closure SHA) on GitHub's event log?
- for
- On-chain anchoring is verifiable independent of GitHub; survives outages, repo deletions, account suspensions.
- against
- Cheap-vs-verifiable trade-off — IPFS pinning has ongoing cost; on-chain writes have per-receipt gas. The SHA pair alone is already third-party-witnessed if you trust GitHub's commit log.
- state
- trade-off open · benchmark cost vs. user-perceived verifiability
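To make the two options concrete, here is a minimal sketch of the digest an on-chain anchor might store. It assumes a receipt is the finding text plus the SHA pair, serialized as canonical JSON; the field names and the `receipt_digest` helper are hypothetical. The result is a plain SHA-256 hex string, not an IPFS CID, and nothing here writes to IPFS or to any chain.

```python
import hashlib
import json


def receipt_digest(finding: str, review_sha: str, closure_sha: str) -> str:
    """Return a SHA-256 hex digest over a canonical JSON receipt record."""
    record = {
        "closure_sha": closure_sha,
        "finding": finding,
        "review_sha": review_sha,
    }
    # sort_keys and fixed separators make the byte encoding deterministic,
    # so anyone holding the same three values recomputes the same digest.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under the GitHub-only option, the SHA pair itself is the receipt and no extra digest is stored; under the anchored option, a value like this one is what would be paid for per receipt.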
- 03
Constitution drift detector
Pre-commit hook that blocks merges when the agent's constitution drifts more than "30%" from the previous canonical version.
- for
- Keeps the agent's behavior stable across edits — every contributor sees a hard wall before drift compounds.
- against
- Open question: what is the deterministic diff metric for "30%"? Has to be runnable unattended (no LLM-in-the-loop) so the hook can ship in CI.
- state
- needs a deterministic metric · then it ships
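One candidate metric, offered only as a sketch: line-level dissimilarity computed with Python's standard `difflib`, which runs unattended and involves no model call. Whether this is the right reading of "30%" is exactly the open question; the threshold constant and function names below are assumptions.

```python
import difflib

# The "30%" from the proposal. What it should measure is still undecided.
DRIFT_THRESHOLD = 0.30


def drift_ratio(previous: str, current: str) -> float:
    """Return 1 minus the line-level similarity of two constitution versions.

    0.0 means the versions are identical; 1.0 means no lines match.
    """
    matcher = difflib.SequenceMatcher(
        None, previous.splitlines(), current.splitlines(), autojunk=False
    )
    return 1.0 - matcher.ratio()


def exceeds_drift(previous: str, current: str,
                  threshold: float = DRIFT_THRESHOLD) -> bool:
    """True when the edit should be blocked under this candidate metric."""
    return drift_ratio(previous, current) > threshold
```

A hook built on this would exit non-zero when `exceeds_drift` returns true. A line-based metric treats a reworded sentence and a reversed rule the same way, so it bounds how much text changed, not how much behavior changed.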
- 04
Third model in the gate
Add a third reviewer (via Venice's multi-model rail) on top of the current {Opus 4.7, GPT-5} unanimous gate. Open question: should the third model act as a judge / tiebreaker, or should it vote in a 3-of-3 unanimous or 2-of-3 majority gate?
- for
- Diversifies the stack beyond the Anthropic + OpenAI duopoly; aligns with the marketplace thesis where models compete on the agreement primitive. GLM-5.2 (Zhipu) is a current candidate getting traction as a strong bug-finder.
- against
- Each role has a different precision-coverage trade-off — 3-of-3 cuts coverage, 2-of-3 trades precision back, judge mode adds latency on disagreement.
- state
- method known · run the dogfood corpus through {Opus, GPT-5, GLM-5.2} and measure unanimous-rate + judge-agreement before any roster change
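The three candidate roles can be sketched as small decision functions. This assumes each reviewer returns a boolean vote on whether a finding passes the gate; the function names and that vote shape are hypothetical, not the gate's actual interface.

```python
from typing import Callable, Sequence


def unanimous(votes: Sequence[bool]) -> bool:
    """3-of-3 (or N-of-N): pass only if every reviewer agrees."""
    return all(votes)


def majority(votes: Sequence[bool], needed: int = 2) -> bool:
    """2-of-3: pass if at least `needed` reviewers agree."""
    return sum(votes) >= needed


def judge_mode(primary: Sequence[bool], judge: Callable[[], bool]) -> bool:
    """Two primary reviewers decide; the judge runs only when they disagree."""
    if all(primary):
        return True
    if not any(primary):
        return False
    # Disagreement: this extra call is where the added latency comes from.
    return judge()
```

For the votes `(True, True, False)`, `unanimous` rejects and `majority` accepts, which is the precision-coverage trade-off in miniature. Running the dogfood corpus through all three functions would give the comparison the state line asks for.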
Cadence
We update this page when a decision flips — never to add aspirational items. The changelog records what shipped; this page records what we're deliberately not committing to yet. If something on the decisions list has been here for more than two months without movement, it's probably the wrong question and should be retired rather than rephrased.
Last updated: 2026-06-29