Public benchmarks · updated live
Every two-model review AntFleet ran on a benchmark-class repo.
75 benchmarks and counting
updated 1 month ago
Benchmark-class repos are public repos with a BENCHMARK.md file at the root. PRs there are not meant to merge; they exist to run a known diff past AntFleet's two-model unanimous consensus and publish the result. Click any row to read the bot review on GitHub.
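As a rough illustration of the two ideas in the paragraph above, here is a minimal Python sketch. It is not AntFleet's implementation: the function names `is_benchmark_class` and `unanimous_findings` are hypothetical, the check runs against a local checkout rather than GitHub, and "unanimous consensus" is assumed here to mean that a finding is kept only when both models report it.

```python
from __future__ import annotations

from pathlib import Path


def is_benchmark_class(repo_root: Path) -> bool:
    """Return True if a local checkout has BENCHMARK.md at its root.

    Hypothetical helper: the page only says the file must exist at the root.
    """
    return (repo_root / "BENCHMARK.md").is_file()


def unanimous_findings(model_a: set[str], model_b: set[str]) -> set[str]:
    """Keep only the findings that both models reported.

    Assumption: findings are comparable by a string identifier, and
    consensus means set intersection. The real matching rule is not
    described in the source.
    """
    return model_a & model_b
```

Under these assumptions, a review with no overlap between the two models' findings would be listed as clean, even if each model flagged something on its own.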
Looking for closed-finding receipts instead? /receipts.
Latest benchmarks
showing 50 of 75
- AntFleet/bench-hermes-desktop · PR #3 · gpt-5, claude-opus-4-7 · commit 1a9e716 · 1 month ago · 3 findings · 3 files
- AntFleet/bench-hermes-desktop · PR #2 · gpt-5, claude-opus-4-7 · commit 74b69cf · 1 month ago · 0 findings (clean) · 3 files
- AntFleet/bench-hermes-desktop · PR #1 · gpt-5, claude-opus-4-7 · commit 1324122 · 1 month ago · 2 findings · 3 files
- AntFleet/bench-agentfloat · PR #3 · gpt-5, claude-opus-4-7 · commit 1f7a758 · 1 month ago · 5 findings · 15 files
- AntFleet/bench-agentfloat · PR #2 · gpt-5, claude-opus-4-7 · commit 7adf24e · 1 month ago · 0 findings (clean) · 3 files
- AntFleet/bench-agentfloat · PR #1 · gpt-5, claude-opus-4-7 · commit b402b85 · 1 month ago · 0 findings (clean) · 15 files
- AntFleet/bankrskills-bench · PR #5 · gpt-5, claude-opus-4-7 · commit 9b50924 · 1 month ago · 1 finding · 11 files
- AntFleet/bankrskills-bench · PR #4 · gpt-5, claude-opus-4-7 · commit b057dca · 1 month ago · 2 findings · 5 files
- AntFleet/bankrskills-bench · PR #3 · gpt-5, claude-opus-4-7 · commit bf1943a · 1 month ago · 1 finding · 4 files
- AntFleet/bankrskills-bench · PR #2 · gpt-5, claude-opus-4-7 · commit bfd0649 · 1 month ago · 0 findings (clean) · 9 files
- AntFleet/bankrskills-bench · PR #1 · gpt-5, claude-opus-4-7 · commit d5eb1b5 · 1 month ago · 2 findings · 3 files
- AntFleet/aeon-bench · PR #33 · gpt-5, claude-opus-4-7 · commit 6d0181f · 1 month ago · 0 findings (clean) · 1 file
- AntFleet/aeon-bench · PR #32 · gpt-5, claude-opus-4-7 · commit 37715b1 · 1 month ago · 2 findings · 1 file
- AntFleet/aeon-bench · PR #31 · gpt-5, claude-opus-4-7 · commit 864a35c · 1 month ago · 4 findings · 2 files
- AntFleet/agent-autonomopoly-bench · PR #6 · gpt-5, claude-opus-4-7 · commit 4904a75 · 1 month ago · 2 findings · 2 files
- AntFleet/agent-autonomopoly-bench · PR #5 · gpt-5, claude-opus-4-7 · commit e7466d8 · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/agent-openhuman-bench · PR #2 · gpt-5, claude-opus-4-7 · commit 59e046b · 1 month ago · 2 findings · 15 files
- AntFleet/agent-openhuman-bench · PR #1 · gpt-5, claude-opus-4-7 · commit 9df8938 · 1 month ago · 2 findings · 1 file
- AntFleet/aeon-bench · PR #28 · gpt-5, claude-opus-4-7 · commit 0059094 · 1 month ago · 1 finding · 6 files
- AntFleet/aeon-bench · PR #27 · gpt-5, claude-opus-4-7 · commit 4cb226d · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #26 · gpt-5, claude-opus-4-7 · commit 4be9281 · 1 month ago · 0 findings (clean) · 3 files
- AntFleet/aeon-bench · PR #25 · gpt-5, claude-opus-4-7 · commit 37d0c07 · 1 month ago · 3 findings · 2 files
- AntFleet/aeon-bench · PR #23 · gpt-5, claude-opus-4-7 · commit 66bb888 · 1 month ago · 1 finding · 2 files
- AntFleet/aeon-bench · PR #22 · gpt-5, claude-opus-4-7 · commit 2625e16 · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #21 · gpt-5, claude-opus-4-7 · commit f8cace5 · 1 month ago · 1 finding · 2 files
- AntFleet/aeon-bench · PR #20 · commit c5dc24e · 1 month ago · 0 findings (clean)
- AntFleet/aeon-bench · PR #19 · commit 72f8589 · 1 month ago · 0 findings (clean)
- AntFleet/aeon-bench · PR #18 · gpt-5, claude-opus-4-7 · commit 1ec7a89 · 1 month ago · 0 findings (clean) · 1 file
- AntFleet/aeon-bench · PR #17 · gpt-5, claude-opus-4-7 · commit 9b290aa · 1 month ago · 0 findings (clean) · 1 file
- AntFleet/aeon-bench · PR #16 · commit c44ee52 · 1 month ago · 0 findings (clean)
- AntFleet/aeon-bench · PR #15 · gpt-5, claude-opus-4-7 · commit 334edff · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #14 · gpt-5, claude-opus-4-7 · commit 3a53384 · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #12 · gpt-5, claude-opus-4-7 · commit 76b3059 · 1 month ago · 2 findings · 2 files
- AntFleet/aeon-bench · PR #11 · commit ba745df · 1 month ago · 0 findings (clean)
- AntFleet/aeon-bench · PR #10 · commit f6726cf · 1 month ago · 0 findings (clean)
- AntFleet/aeon-bench · PR #8 · gpt-5, claude-opus-4-7 · commit a9a960b · 1 month ago · 0 findings (clean) · 5 files
- AntFleet/aeon-bench · PR #7 · gpt-5, claude-opus-4-7 · commit b190012 · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #5 · commit 14fb422 · 1 month ago · 0 findings (clean)
- AntFleet/aeon-bench · PR #30 · gpt-5, claude-opus-4-7 · commit fac2cd3 · 1 month ago · 2 findings · 1 file
- AntFleet/aeon-bench · PR #29 · gpt-5, claude-opus-4-7 · commit 79a346f · 1 month ago · 2 findings · 3 files
- AntFleet/aeon-bench · PR #28 · gpt-5, claude-opus-4-7 · commit 0b9d165 · 1 month ago · 0 findings (clean) · 6 files
- AntFleet/aeon-bench · PR #27 · gpt-5, claude-opus-4-7 · commit bfcf2bf · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #26 · gpt-5, claude-opus-4-7 · commit df29156 · 1 month ago · 0 findings (clean) · 3 files
- AntFleet/aeon-bench · PR #25 · gpt-5, claude-opus-4-7 · commit 65c227d · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #24 · gpt-5, claude-opus-4-7 · commit 1c841f9 · 1 month ago · 1 finding · 2 files
- AntFleet/aeon-bench · PR #23 · gpt-5, claude-opus-4-7 · commit e2819cd · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #22 · gpt-5, claude-opus-4-7 · commit 92e0cb5 · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #21 · gpt-5, claude-opus-4-7 · commit f629516 · 1 month ago · 0 findings (clean) · 2 files
- AntFleet/aeon-bench · PR #20 · commit 4fa6133 · 1 month ago · 0 findings (clean)
- AntFleet/aeon-bench · PR #19 · commit a2c6038 · 1 month ago · 0 findings (clean)