Aider LLM Leaderboards

Aider excels with LLMs skilled at writing and editing code, and uses benchmarks to evaluate an LLM’s ability to follow instructions and edit code successfully without human intervention. Aider’s polyglot benchmark tests LLMs on 225 challenging Exercism coding exercises across C++, Go, Java, JavaScript, Python, and Rust.

Aider polyglot coding leaderboard

	Model	Percent correct	Cost	Command	Correct edit format	Edit Format
	gpt-5 (high)	88.0%	$29.08	`aider --model openai/gpt-5`	91.6%	diff
Dirname : 2025-08-23-15-47-21--gpt-5-high Test cases : 225 Model : gpt-5 (high) Edit format : diff Commit hash : 32faf82 Reasoning effort : high Pass rate 1 : 52.0 Pass rate 2 : 88.0 Pass num 1 : 117 Pass num 2 : 198 Percent cases well formed : 91.6 Error outputs : 23 Num malformed responses : 22 Num with malformed responses : 19 User asks : 96 Lazy comments : 3 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2675561 Completion tokens : 2623429 Test timeouts : 3 Total tests : 225 Command : `aider --model openai/gpt-5` Date : 2025-08-23 Versions : 0.86.2.dev Seconds per case : 194.0 Total cost : 29.0829
	gpt-5 (medium)	86.7%	$17.69	`aider --model openai/gpt-5`	88.4%	diff
Dirname : 2025-08-25-13-23-27--gpt-5-medium Test cases : 225 Model : gpt-5 (medium) Edit format : diff Commit hash : 32faf82 Reasoning effort : medium Pass rate 1 : 49.8 Pass rate 2 : 86.7 Pass num 1 : 112 Pass num 2 : 195 Percent cases well formed : 88.4 Error outputs : 40 Num malformed responses : 40 Num with malformed responses : 26 User asks : 102 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2827261 Completion tokens : 1468799 Test timeouts : 0 Total tests : 225 Command : `aider --model openai/gpt-5` Date : 2025-08-25 Versions : 0.86.2.dev Seconds per case : 118.7 Total cost : 17.693
	o3-pro (high)	84.9%	$146.32	`aider --model o3-pro`	97.8%	diff
Dirname : 2025-06-28-00-38-18--o3-pro-high Test cases : 225 Model : o3-pro (high) Edit format : diff Commit hash : 5318380 Reasoning effort : high Pass rate 1 : 43.6 Pass rate 2 : 84.9 Pass num 1 : 98 Pass num 2 : 191 Percent cases well formed : 97.8 Error outputs : 20 Num malformed responses : 8 Num with malformed responses : 5 User asks : 100 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2372636 Completion tokens : 1235902 Test timeouts : 1 Total tests : 225 Command : `aider --model o3-pro` Date : 2025-06-28 Versions : 0.85.1.dev Seconds per case : 449.0 Total cost : 146.3249
	gemini-2.5-pro-preview-06-05 (32k think)	83.1%	$49.88	`aider --model gemini/gemini-2.5-pro-preview-06-05 --thinking-tokens 32k`	99.6%	diff-fenced
Dirname : 2025-06-06-16-36-21--gemini0605-32k-think-diff-fenced Test cases : 225 Model : gemini-2.5-pro-preview-06-05 (32k think) Edit format : diff-fenced Commit hash : f827f22 Thinking tokens : 32768 Pass rate 1 : 46.2 Pass rate 2 : 83.1 Pass num 1 : 104 Pass num 2 : 187 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 112 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2719961 Completion tokens : 4648227 Test timeouts : 0 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-06-05 --thinking-tokens 32k` Date : 2025-06-06 Versions : 0.84.1.dev Seconds per case : 200.3 Total cost : 49.8822
	gpt-5 (low)	81.3%	$10.37	`aider --model openai/gpt-5`	86.7%	diff
Dirname : 2025-08-25-14-16-37--gpt-5-low Test cases : 225 Model : gpt-5 (low) Edit format : diff Commit hash : 32faf82 Reasoning effort : low Pass rate 1 : 43.1 Pass rate 2 : 81.3 Pass num 1 : 97 Pass num 2 : 183 Percent cases well formed : 86.7 Error outputs : 46 Num malformed responses : 46 Num with malformed responses : 30 User asks : 113 Lazy comments : 1 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2534059 Completion tokens : 779568 Test timeouts : 1 Total tests : 225 Command : `aider --model openai/gpt-5` Date : 2025-08-25 Versions : 0.86.2.dev Seconds per case : 62.4 Total cost : 10.3713
	o3 (high)	81.3%	$21.23	`aider --model o3 --reasoning-effort high`	94.7%	diff
Dirname : 2025-06-25-21-04-24--o3-price-reduction-high Test cases : 225 Model : o3 (high) Edit format : diff Commit hash : c48fea6 Reasoning effort : high Pass rate 1 : 40.0 Pass rate 2 : 81.3 Pass num 1 : 90 Pass num 2 : 183 Percent cases well formed : 94.7 Error outputs : 25 Num malformed responses : 23 Num with malformed responses : 12 User asks : 116 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Prompt tokens : 3148932 Completion tokens : 2047615 Test timeouts : 2 Total tests : 225 Command : `aider --model o3 --reasoning-effort high` Date : 2025-06-25 Versions : 0.84.1.dev Seconds per case : 197.3 Total cost : 21.2259
	grok-4 (high)	79.6%	$59.62	`aider --model openrouter/x-ai/grok-4`	97.3%	diff
Dirname : 2025-07-11-19-37-40--xai-or-grok4-high Test cases : 225 Model : grok-4 (high) Edit format : diff Commit hash : f7870b6-dirty Reasoning effort : high Pass rate 1 : 40.9 Pass rate 2 : 79.6 Pass num 1 : 92 Pass num 2 : 179 Percent cases well formed : 97.3 Error outputs : 11 Num malformed responses : 8 Num with malformed responses : 6 User asks : 133 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2815347 Completion tokens : 3411480 Test timeouts : 0 Total tests : 225 Command : `aider --model openrouter/x-ai/grok-4` Date : 2025-07-11 Versions : 0.85.2.dev Seconds per case : 403.2 Total cost : 59.6182
	gemini-2.5-pro-preview-06-05 (default think)	79.1%	$45.60	`aider --model gemini/gemini-2.5-pro-preview-06-05`	100.0%	diff-fenced
Dirname : 2025-06-06-18-38-56--gemini0605-diff-fenced Test cases : 225 Model : gemini-2.5-pro-preview-06-05 (default think) Edit format : diff-fenced Commit hash : 4c161f9-dirty Pass rate 1 : 44.9 Pass rate 2 : 79.1 Pass num 1 : 101 Pass num 2 : 178 Percent cases well formed : 100.0 Error outputs : 4 Num malformed responses : 0 Num with malformed responses : 0 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 4 Prompt tokens : 2751296 Completion tokens : 4142197 Test timeouts : 1 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-06-05` Date : 2025-06-06 Versions : 0.84.1.dev Seconds per case : 175.2 Total cost : 45.5961
	o3 (high) + gpt-4.1	78.2%	$17.55	`aider --model o3`	100.0%	architect
Dirname : 2025-06-27-23-53-57--o3-mini-high-diff-arch Test cases : 224 Model : o3 (high) + gpt-4.1 Edit format : architect Commit hash : 4f4f00f-dirty Editor model : gpt-4.1 Editor edit format : editor-diff Reasoning effort : high Pass rate 1 : 34.8 Pass rate 2 : 78.2 Pass num 1 : 78 Pass num 2 : 176 Percent cases well formed : 100.0 Error outputs : 18 Num malformed responses : 0 Num with malformed responses : 0 User asks : 172 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Prompt tokens : 1306877 Completion tokens : 1327154 Test timeouts : 1 Total tests : 225 Command : `aider --model o3` Date : 2025-06-27 Versions : 0.85.1.dev Seconds per case : 121.8 Total cost : 17.5518
	o3	76.9%	$13.75	`aider --model o3`	93.8%	diff
Dirname : 2025-06-25-20-30-16--o3-price-reduction Test cases : 225 Model : o3 Edit format : diff Commit hash : c48fea6 Pass rate 1 : 40.9 Pass rate 2 : 76.9 Pass num 1 : 92 Pass num 2 : 173 Percent cases well formed : 93.8 Error outputs : 22 Num malformed responses : 22 Num with malformed responses : 14 User asks : 108 Lazy comments : 2 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2893189 Completion tokens : 1154767 Test timeouts : 1 Total tests : 225 Command : `aider --model o3` Date : 2025-06-25 Versions : 0.84.1.dev Seconds per case : 101.7 Total cost : 13.7517
	Gemini 2.5 Pro Preview 05-06	76.9%	$37.41	`aider --model gemini/gemini-2.5-pro-preview-05-06`	97.3%	diff-fenced
Dirname : 2025-05-07-19-32-40--gemini0506-diff-fenced-completion_cost Test cases : 225 Model : Gemini 2.5 Pro Preview 05-06 Edit format : diff-fenced Commit hash : 3b08327-dirty Pass rate 1 : 36.4 Pass rate 2 : 76.9 Pass num 1 : 82 Pass num 2 : 173 Percent cases well formed : 97.3 Error outputs : 15 Num malformed responses : 7 Num with malformed responses : 6 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-05-06` Date : 2025-05-07 Versions : 0.82.4.dev Seconds per case : 165.3 Total cost : 37.4104
	DeepSeek-V3.2-Exp (Reasoner)	74.2%	$1.30	`aider --model deepseek/deepseek-reasoner`	97.3%	diff
Dirname : 2025-10-03-09-45-34--deepseek-v3.2-reasoner Test cases : 225 Model : DeepSeek-V3.2-Exp (Reasoner) Edit format : diff Commit hash : cbb5376 Pass rate 1 : 39.6 Pass rate 2 : 74.2 Pass num 1 : 89 Pass num 2 : 167 Percent cases well formed : 97.3 Error outputs : 8 Num malformed responses : 6 Num with malformed responses : 6 User asks : 67 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Prompt tokens : 2191446 Completion tokens : 1645129 Test timeouts : 1 Total tests : 225 Command : `aider --model deepseek/deepseek-reasoner` Date : 2025-10-03 Versions : 0.86.2.dev Seconds per case : 291.2 Total cost : 1.3045
	Gemini 2.5 Pro Preview 03-25	72.9%		`aider --model gemini/gemini-2.5-pro-preview-03-25`	92.4%	diff-fenced
Dirname : 2025-04-12-04-55-50--gemini-25-pro-diff-fenced Test cases : 225 Model : Gemini 2.5 Pro Preview 03-25 Edit format : diff-fenced Commit hash : 0282574 Pass rate 1 : 40.9 Pass rate 2 : 72.9 Pass num 1 : 92 Pass num 2 : 164 Percent cases well formed : 92.4 Error outputs : 21 Num malformed responses : 21 Num with malformed responses : 17 User asks : 69 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-03-25` Date : 2025-04-12 Versions : 0.81.3.dev Seconds per case : 45.3 Total cost : 0
	claude-opus-4-20250514 (32k thinking)	72.0%	$65.75	`aider --model claude-opus-4-20250514`	97.3%	diff
Dirname : 2025-05-25-20-40-51--opus4-diff-exuser Test cases : 225 Model : claude-opus-4-20250514 (32k thinking) Edit format : diff Commit hash : 9ef3211 Thinking tokens : 32000 Pass rate 1 : 37.3 Pass rate 2 : 72.0 Pass num 1 : 84 Pass num 2 : 162 Percent cases well formed : 97.3 Error outputs : 10 Num malformed responses : 6 Num with malformed responses : 6 User asks : 97 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2567514 Completion tokens : 363142 Test timeouts : 4 Total tests : 225 Command : `aider --model claude-opus-4-20250514` Date : 2025-05-25 Versions : 0.83.3.dev Seconds per case : 44.1 Total cost : 65.7484
	o4-mini (high)	72.0%	$19.64	`aider --model o4-mini`	90.7%	diff
Dirname : 2025-04-16-22-01-58--o4-mini-high-diff-exsys Test cases : 225 Model : o4-mini (high) Edit format : diff Commit hash : b66901f-dirty Pass rate 1 : 19.6 Pass rate 2 : 72.0 Pass num 1 : 44 Pass num 2 : 162 Percent cases well formed : 90.7 Error outputs : 26 Num malformed responses : 24 Num with malformed responses : 21 User asks : 66 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model o4-mini` Date : 2025-04-16 Versions : 0.82.1.dev Seconds per case : 176.5 Total cost : 19.6399
	DeepSeek R1 (0528)	71.4%	$4.80	`aider --model deepseek/deepseek-reasoner`	94.6%	diff
Dirname : 2025-06-06-16-47-07--r1-diff Test cases : 224 Model : DeepSeek R1 (0528) Edit format : diff Commit hash : 4c161f9-dirty Pass rate 1 : 34.4 Pass rate 2 : 71.4 Pass num 1 : 77 Pass num 2 : 160 Percent cases well formed : 94.6 Error outputs : 28 Num malformed responses : 15 Num with malformed responses : 12 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2644169 Completion tokens : 1842168 Test timeouts : 2 Total tests : 225 Command : `aider --model deepseek/deepseek-reasoner` Date : 2025-06-06 Versions : 0.84.1.dev Seconds per case : 716.6 Total cost : 4.8016
	claude-opus-4-20250514 (no think)	70.7%	$68.63	`aider --model claude-opus-4-20250514`	98.7%	diff
Dirname : 2025-05-25-19-57-20--opus4-diff-exuser Test cases : 225 Model : claude-opus-4-20250514 (no think) Edit format : diff Commit hash : 9ef3211 Pass rate 1 : 32.9 Pass rate 2 : 70.7 Pass num 1 : 74 Pass num 2 : 159 Percent cases well formed : 98.7 Error outputs : 3 Num malformed responses : 3 Num with malformed responses : 3 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2671437 Completion tokens : 380717 Test timeouts : 3 Total tests : 225 Command : `aider --model claude-opus-4-20250514` Date : 2025-05-25 Versions : 0.83.3.dev Seconds per case : 42.5 Total cost : 68.6253
	DeepSeek-V3.2-Exp (Chat)	70.2%	$0.88	`aider --model deepseek/deepseek-chat`	98.2%	diff
Dirname : 2025-10-03-09-21-36--deepseek-v3.2-chat Test cases : 225 Model : DeepSeek-V3.2-Exp (Chat) Edit format : diff Commit hash : cbb5376 Pass rate 1 : 38.7 Pass rate 2 : 70.2 Pass num 1 : 87 Pass num 2 : 158 Percent cases well formed : 98.2 Error outputs : 6 Num malformed responses : 4 Num with malformed responses : 4 User asks : 60 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Prompt tokens : 2266868 Completion tokens : 573477 Test timeouts : 4 Total tests : 225 Command : `aider --model deepseek/deepseek-chat` Date : 2025-10-03 Versions : 0.86.2.dev Seconds per case : 104.0 Total cost : 0.8756
	claude-3-7-sonnet-20250219 (32k thinking tokens)	64.9%	$36.83	`aider --model anthropic/claude-3-7-sonnet-20250219 --thinking-tokens 32k`	97.8%	diff
Dirname : 2025-02-24-21-47-23--sonnet37-diff-think-32k-64k Test cases : 225 Model : claude-3-7-sonnet-20250219 (32k thinking tokens) Edit format : diff Commit hash : 60d11a6, 93edbda Pass rate 1 : 29.3 Pass rate 2 : 64.9 Pass num 1 : 66 Pass num 2 : 146 Percent cases well formed : 97.8 Error outputs : 66 Num malformed responses : 5 Num with malformed responses : 5 User asks : 5 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 1 Total tests : 225 Command : `aider --model anthropic/claude-3-7-sonnet-20250219 --thinking-tokens 32k` Date : 2025-02-24 Versions : 0.75.1.dev Seconds per case : 105.2 Total cost : 36.8343
	DeepSeek R1 + claude-3-5-sonnet-20241022	64.0%	$13.29	`aider --architect --model r1 --editor-model sonnet`	100.0%	architect
Dirname : 2025-01-23-19-14-48--r1-architect-sonnet Test cases : 225 Model : DeepSeek R1 + claude-3-5-sonnet-20241022 Edit format : architect Commit hash : 05a77c7 Editor model : claude-3-5-sonnet-20241022 Editor edit format : editor-diff Pass rate 1 : 27.1 Pass rate 2 : 64.0 Pass num 1 : 61 Pass num 2 : 144 Percent cases well formed : 100.0 Error outputs : 2 Num malformed responses : 0 Num with malformed responses : 0 User asks : 392 Lazy comments : 6 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 5 Total tests : 225 Command : `aider --architect --model r1 --editor-model sonnet` Date : 2025-01-23 Versions : 0.72.3.dev Seconds per case : 251.6 Total cost : 13.2933
	o1-2024-12-17 (high)	61.7%	$186.50	`aider --model openrouter/openai/o1`	91.5%	diff
Dirname : 2024-12-21-19-23-03--polyglot-o1-hard-diff Test cases : 224 Model : o1-2024-12-17 (high) Edit format : diff Commit hash : a755079-dirty Pass rate 1 : 23.7 Pass rate 2 : 61.7 Pass num 1 : 53 Pass num 2 : 139 Percent cases well formed : 91.5 Error outputs : 25 Num malformed responses : 24 Num with malformed responses : 19 User asks : 16 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model openrouter/openai/o1` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 133.2 Total cost : 186.4958
	claude-sonnet-4-20250514 (32k thinking)	61.3%	$26.58	`aider --model claude-sonnet-4-20250514`	97.3%	diff
Dirname : 2025-05-24-22-10-36--sonnet4-diff-exuser-think32k Test cases : 225 Model : claude-sonnet-4-20250514 (32k thinking) Edit format : diff Commit hash : e3cb907 Thinking tokens : 32000 Pass rate 1 : 25.8 Pass rate 2 : 61.3 Pass num 1 : 58 Pass num 2 : 138 Percent cases well formed : 97.3 Error outputs : 10 Num malformed responses : 10 Num with malformed responses : 6 User asks : 111 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2863068 Completion tokens : 1271074 Test timeouts : 6 Total tests : 225 Command : `aider --model claude-sonnet-4-20250514` Date : 2025-05-24 Versions : 0.83.3.dev Seconds per case : 79.9 Total cost : 26.5755
	claude-3-7-sonnet-20250219 (no thinking)	60.4%	$17.72	`aider --model sonnet`	93.3%	diff
Dirname : 2025-02-24-19-54-07--sonnet37-diff Test cases : 225 Model : claude-3-7-sonnet-20250219 (no thinking) Edit format : diff Commit hash : 75e9ee6 Pass rate 1 : 24.4 Pass rate 2 : 60.4 Pass num 1 : 55 Pass num 2 : 136 Percent cases well formed : 93.3 Error outputs : 16 Num malformed responses : 16 Num with malformed responses : 15 User asks : 12 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 0 Total tests : 225 Command : `aider --model sonnet` Date : 2025-02-24 Versions : 0.74.4.dev Seconds per case : 28.3 Total cost : 17.7191
	o3-mini (high)	60.4%	$18.16	`aider --model o3-mini --reasoning-effort high`	93.3%	diff
Dirname : 2025-01-31-20-42-47--o3-mini-diff-high Test cases : 224 Model : o3-mini (high) Edit format : diff Commit hash : b0d58d1-dirty Pass rate 1 : 21.0 Pass rate 2 : 60.4 Pass num 1 : 47 Pass num 2 : 136 Percent cases well formed : 93.3 Error outputs : 26 Num malformed responses : 24 Num with malformed responses : 15 User asks : 19 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 7 Total tests : 225 Command : `aider --model o3-mini --reasoning-effort high` Date : 2025-01-31 Versions : 0.72.4.dev Seconds per case : 124.6 Total cost : 18.1584
	Qwen3 235B A22B diff, no think, Alibaba API	59.6%		`aider --model openai/qwen3-235b-a22b`	92.9%	diff
Dirname : 2025-05-09-17-02-02--qwen3-235b-a22b.unthink_16k_diff Test cases : 225 Model : Qwen3 235B A22B diff, no think, Alibaba API Edit format : diff Commit hash : 91d7fbd-dirty Pass rate 1 : 28.9 Pass rate 2 : 59.6 Pass num 1 : 65 Pass num 2 : 134 Percent cases well formed : 92.9 Error outputs : 22 Num malformed responses : 22 Num with malformed responses : 16 User asks : 111 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2816192 Completion tokens : 342062 Test timeouts : 1 Total tests : 225 Command : `aider --model openai/qwen3-235b-a22b` Date : 2025-05-09 Versions : 0.82.4.dev Seconds per case : 45.4 Total cost : 0.0
	Kimi K2	59.1%	$1.24	`aider --model openrouter/moonshotai/kimi-k2`	92.9%	diff
Dirname : 2025-07-17-17-41-54--kimi-k2-diff-or-pricing Test cases : 225 Model : Kimi K2 Edit format : diff Commit hash : 915ebff-dirty Pass rate 1 : 20.4 Pass rate 2 : 59.1 Pass num 1 : 46 Pass num 2 : 133 Percent cases well formed : 92.9 Error outputs : 19 Num malformed responses : 19 Num with malformed responses : 16 User asks : 61 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2355141 Completion tokens : 363846 Test timeouts : 4 Total tests : 225 Command : `aider --model openrouter/moonshotai/kimi-k2` Date : 2025-07-17 Versions : 0.85.3.dev Seconds per case : 67.6 Total cost : 1.2357
	DeepSeek R1	56.9%	$5.42	`aider --model deepseek/deepseek-reasoner`	96.9%	diff
Dirname : 2025-01-20-19-11-38--ds-turns-upd-cur-msgs-fix-with-summarizer Test cases : 225 Model : DeepSeek R1 Edit format : diff Commit hash : 5650697-dirty Pass rate 1 : 26.7 Pass rate 2 : 56.9 Pass num 1 : 60 Pass num 2 : 128 Percent cases well formed : 96.9 Error outputs : 8 Num malformed responses : 7 Num with malformed responses : 7 User asks : 15 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 5 Total tests : 225 Command : `aider --model deepseek/deepseek-reasoner` Date : 2025-01-20 Versions : 0.71.2.dev Seconds per case : 113.7 Total cost : 5.4193
	claude-sonnet-4-20250514 (no thinking)	56.4%	$15.82	`aider --model claude-sonnet-4-20250514`	98.2%	diff
Dirname : 2025-05-24-21-17-54--sonnet4-diff-exuser Test cases : 225 Model : claude-sonnet-4-20250514 (no thinking) Edit format : diff Commit hash : ef3f8bb-dirty Pass rate 1 : 20.4 Pass rate 2 : 56.4 Pass num 1 : 46 Pass num 2 : 127 Percent cases well formed : 98.2 Error outputs : 6 Num malformed responses : 4 Num with malformed responses : 4 User asks : 129 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Prompt tokens : 3460663 Completion tokens : 433373 Test timeouts : 7 Total tests : 225 Command : `aider --model claude-sonnet-4-20250514` Date : 2025-05-24 Versions : 0.83.3.dev Seconds per case : 29.8 Total cost : 15.8155
	gemini-2.5-flash-preview-05-20 (24k think)	55.1%	$8.56	`aider --model gemini/gemini-2.5-flash-preview-05-20`	95.6%	diff
Dirname : 2025-05-25-22-58-44--flash25-05-20-24k-think Test cases : 225 Model : gemini-2.5-flash-preview-05-20 (24k think) Edit format : diff Commit hash : a8568c3-dirty Thinking tokens : 24576 Pass rate 1 : 26.2 Pass rate 2 : 55.1 Pass num 1 : 59 Pass num 2 : 124 Percent cases well formed : 95.6 Error outputs : 15 Num malformed responses : 15 Num with malformed responses : 10 User asks : 101 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 3666792 Completion tokens : 2703162 Test timeouts : 4 Total tests : 225 Command : `aider --model gemini/gemini-2.5-flash-preview-05-20` Date : 2025-05-25 Versions : 0.83.3.dev Seconds per case : 53.9 Total cost : 8.5625
	DeepSeek V3 (0324)	55.1%	$1.12	`aider --model deepseek/deepseek-chat`	99.6%	diff
Dirname : 2025-03-24-15-41-33--deepseek-v3-0324-polyglot-diff Test cases : 225 Model : DeepSeek V3 (0324) Edit format : diff Commit hash : 502b863 Pass rate 1 : 28.0 Pass rate 2 : 55.1 Pass num 1 : 63 Pass num 2 : 124 Percent cases well formed : 99.6 Error outputs : 32 Num malformed responses : 1 Num with malformed responses : 1 User asks : 96 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 2 Test timeouts : 4 Total tests : 225 Command : `aider --model deepseek/deepseek-chat` Date : 2025-03-24 Versions : 0.78.1.dev Seconds per case : 290.0 Total cost : 1.1164
	Quasar Alpha	54.7%		`aider --model openrouter/openrouter/quasar-alpha`	98.2%	diff
Dirname : 2025-04-04-02-57-25--qalpha-diff-exsys Test cases : 225 Model : Quasar Alpha Edit format : diff Commit hash : 8a34a6c-dirty Pass rate 1 : 21.8 Pass rate 2 : 54.7 Pass num 1 : 49 Pass num 2 : 123 Percent cases well formed : 98.2 Error outputs : 4 Num malformed responses : 4 Num with malformed responses : 4 User asks : 187 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model openrouter/openrouter/quasar-alpha` Date : 2025-04-04 Versions : 0.80.5.dev Seconds per case : 14.8 Total cost : 0.0
	o3-mini (medium)	53.8%	$8.86	`aider --model o3-mini`	95.1%	diff
Dirname : 2025-01-31-20-27-46--o3-mini-diff2 Test cases : 225 Model : o3-mini (medium) Edit format : diff Commit hash : 2fb517b-dirty Pass rate 1 : 19.1 Pass rate 2 : 53.8 Pass num 1 : 43 Pass num 2 : 121 Percent cases well formed : 95.1 Error outputs : 28 Num malformed responses : 28 Num with malformed responses : 11 User asks : 17 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model o3-mini` Date : 2025-01-31 Versions : 0.72.4.dev Seconds per case : 47.2 Total cost : 8.8599
	Grok 3 Beta	53.3%	$11.03	`aider --model openrouter/x-ai/grok-3-beta`	99.6%	diff
Dirname : 2025-04-10-04-21-31--grok3-diff-exuser Test cases : 225 Model : Grok 3 Beta Edit format : diff Commit hash : 2dd40fc-dirty Pass rate 1 : 22.2 Pass rate 2 : 53.3 Pass num 1 : 50 Pass num 2 : 120 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 68 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model openrouter/x-ai/grok-3-beta` Date : 2025-04-10 Versions : 0.81.2.dev Seconds per case : 15.3 Total cost : 11.0338
	Optimus Alpha	52.9%		`aider --model openrouter/openrouter/optimus-alpha`	97.3%	diff
Dirname : 2025-04-10-19-02-44--oalpha-diff-exsys Test cases : 225 Model : Optimus Alpha Edit format : diff Commit hash : 532bc45-dirty Pass rate 1 : 21.3 Pass rate 2 : 52.9 Pass num 1 : 48 Pass num 2 : 119 Percent cases well formed : 97.3 Error outputs : 7 Num malformed responses : 6 Num with malformed responses : 6 User asks : 182 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model openrouter/openrouter/optimus-alpha` Date : 2025-04-10 Versions : 0.81.2.dev Seconds per case : 18.4 Total cost : 0.0
	gpt-4.1	52.4%	$9.86	`aider --model gpt-4.1`	98.2%	diff
Dirname : 2025-04-14-21-05-54--gpt41-diff-exuser Test cases : 225 Model : gpt-4.1 Edit format : diff Commit hash : 7a87db5-dirty Pass rate 1 : 20.0 Pass rate 2 : 52.4 Pass num 1 : 45 Pass num 2 : 118 Percent cases well formed : 98.2 Error outputs : 6 Num malformed responses : 5 Num with malformed responses : 4 User asks : 171 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 5 Total tests : 225 Command : `aider --model gpt-4.1` Date : 2025-04-14 Versions : 0.81.4.dev Seconds per case : 20.5 Total cost : 9.8556
	claude-3-5-sonnet-20241022	51.6%	$14.41	`aider --model claude-3-5-sonnet-20241022`	99.6%	diff
Dirname : 2025-01-17-19-44-33--sonnet-baseline-jan-17 Test cases : 225 Model : claude-3-5-sonnet-20241022 Edit format : diff Commit hash : 6451d59 Pass rate 1 : 22.2 Pass rate 2 : 51.6 Pass num 1 : 50 Pass num 2 : 116 Percent cases well formed : 99.6 Error outputs : 2 Num malformed responses : 1 Num with malformed responses : 1 User asks : 11 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 8 Total tests : 225 Command : `aider --model claude-3-5-sonnet-20241022` Date : 2025-01-17 Versions : 0.71.2.dev Seconds per case : 21.4 Total cost : 14.4063
	Grok 3 Mini Beta (high)	49.3%	$0.73	`aider --model xai/grok-3-mini-beta --reasoning-effort high`	99.6%	whole
Dirname : 2025-04-10-23-59-02--xai-grok3-mini-whole-high Test cases : 225 Model : Grok 3 Mini Beta (high) Edit format : whole Commit hash : 8ee33da-dirty Pass rate 1 : 17.3 Pass rate 2 : 49.3 Pass num 1 : 39 Pass num 2 : 111 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 64 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 0 Total tests : 225 Command : `aider --model xai/grok-3-mini-beta --reasoning-effort high` Date : 2025-04-10 Versions : 0.81.3.dev Seconds per case : 79.1 Total cost : 0.7346
	DeepSeek Chat V3 (prev)	48.4%	$0.34	`aider --model deepseek/deepseek-chat`	98.7%	diff
Dirname : 2024-12-25-13-31-51--deepseekv3preview-diff2 Test cases : 225 Model : DeepSeek Chat V3 (prev) Edit format : diff Commit hash : 0a23c4a-dirty Pass rate 1 : 22.7 Pass rate 2 : 48.4 Pass num 1 : 51 Pass num 2 : 109 Percent cases well formed : 98.7 Error outputs : 7 Num malformed responses : 7 Num with malformed responses : 3 User asks : 19 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 8 Total tests : 225 Command : `aider --model deepseek/deepseek-chat` Date : 2024-12-25 Versions : 0.69.2.dev Seconds per case : 34.8 Total cost : 0.3369
	gemini-2.5-flash-preview-04-17 (default)	47.1%	$1.85	`aider --model gemini/gemini-2.5-flash-preview-04-17`	85.3%	diff
Dirname : 2025-04-20-19-54-31--flash25-diff-no-think Test cases : 225 Model : gemini-2.5-flash-preview-04-17 (default) Edit format : diff Commit hash : 7fcce5d-dirty Pass rate 1 : 21.8 Pass rate 2 : 47.1 Pass num 1 : 49 Pass num 2 : 106 Percent cases well formed : 85.3 Error outputs : 60 Num malformed responses : 55 Num with malformed responses : 33 User asks : 82 Lazy comments : 1 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 5 Test timeouts : 4 Total tests : 225 Command : `aider --model gemini/gemini-2.5-flash-preview-04-17` Date : 2025-04-20 Versions : 0.82.3.dev Seconds per case : 50.1 Total cost : 1.8451
	chatgpt-4o-latest (2025-03-29)	45.3%	$19.74	`aider --model chatgpt-4o-latest`	64.4%	diff
Dirname : 2025-03-29-05-24-55--chatgpt4o-mar28-diff Test cases : 225 Model : chatgpt-4o-latest (2025-03-29) Edit format : diff Commit hash : 0decbad Pass rate 1 : 16.4 Pass rate 2 : 45.3 Pass num 1 : 37 Pass num 2 : 102 Percent cases well formed : 64.4 Error outputs : 85 Num malformed responses : 85 Num with malformed responses : 80 User asks : 174 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model chatgpt-4o-latest` Date : 2025-03-29 Versions : 0.79.3.dev Seconds per case : 10.3 Total cost : 19.7416
	gpt-4.5-preview	44.9%	$183.18	`aider --model openai/gpt-4.5-preview`	97.3%	diff
Dirname : 2025-02-27-20-26-15--gpt45-diff3 Test cases : 224 Model : gpt-4.5-preview Edit format : diff Commit hash : b462e55-dirty Pass rate 1 : 22.3 Pass rate 2 : 44.9 Pass num 1 : 50 Pass num 2 : 101 Percent cases well formed : 97.3 Error outputs : 10 Num malformed responses : 8 Num with malformed responses : 6 User asks : 15 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model openai/gpt-4.5-preview` Date : 2025-02-27 Versions : 0.75.2.dev Seconds per case : 113.5 Total cost : 183.1802
	gemini-2.5-flash-preview-05-20 (no think)	44.0%	$1.14	`aider --model gemini/gemini-2.5-flash-preview-05-20`	93.8%	diff
Dirname : 2025-05-26-15-56-31--flash25-05-20-24k-think Test cases : 225 Model : gemini-2.5-flash-preview-05-20 (no think) Edit format : diff Commit hash : 214b811-dirty Thinking tokens : 0 Pass rate 1 : 20.9 Pass rate 2 : 44.0 Pass num 1 : 47 Pass num 2 : 99 Percent cases well formed : 93.8 Error outputs : 16 Num malformed responses : 16 Num with malformed responses : 14 User asks : 79 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 5512458 Completion tokens : 514145 Test timeouts : 4 Total tests : 225 Command : `aider --model gemini/gemini-2.5-flash-preview-05-20` Date : 2025-05-26 Versions : 0.83.3.dev Seconds per case : 12.2 Total cost : 1.1354
	gpt-oss-120b (high)	41.8%	$0.74	`aider --model openrouter/openai/gpt-oss-120b --reasoning-effort high`	79.1%	diff
Dirname : 2025-08-06-04-54-48--gpt-oss-120b-high-polyglot Test cases : 225 Model : gpt-oss-120b (high) Edit format : diff Commit hash : 1af0e59 Pass rate 1 : 13.8 Pass rate 2 : 41.8 Pass num 1 : 31 Pass num 2 : 94 Percent cases well formed : 79.1 Error outputs : 95 Num malformed responses : 77 Num with malformed responses : 47 User asks : 142 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 3123768 Completion tokens : 856495 Test timeouts : 4 Total tests : 225 Command : `aider --model openrouter/openai/gpt-oss-120b --reasoning-effort high` Date : 2025-08-06 Versions : 0.85.3.dev Seconds per case : 35.5 Total cost : 0.7406
	Qwen3 32B	40.0%	$0.76	`aider --model openrouter/qwen/qwen3-32b`	83.6%	diff
Dirname : 2025-05-08-03-20-24--qwen3-32b-default Test cases : 225 Model : Qwen3 32B Edit format : diff Commit hash : aaacee5-dirty, aeaf259 Pass rate 1 : 14.2 Pass rate 2 : 40.0 Pass num 1 : 32 Pass num 2 : 90 Percent cases well formed : 83.6 Error outputs : 119 Num malformed responses : 50 Num with malformed responses : 37 User asks : 97 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 12 Prompt tokens : 317591 Completion tokens : 120418 Test timeouts : 5 Total tests : 225 Command : `aider --model openrouter/qwen/qwen3-32b` Date : 2025-05-08 Versions : 0.82.4.dev Seconds per case : 372.2 Total cost : 0.7603
	gemini-exp-1206	38.2%		`aider --model gemini/gemini-exp-1206`	98.2%	whole
Dirname : 2024-12-22-18-43-25--gemini-exp-1206-polyglot-whole-2 Test cases : 225 Model : gemini-exp-1206 Edit format : whole Commit hash : b1bc2f8 Pass rate 1 : 19.6 Pass rate 2 : 38.2 Pass num 1 : 44 Pass num 2 : 86 Percent cases well formed : 98.2 Error outputs : 8 Num malformed responses : 8 Num with malformed responses : 4 User asks : 32 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 9 Total tests : 225 Command : `aider --model gemini/gemini-exp-1206` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 45.5 Total cost : 0.0
	Gemini 2.0 Pro exp-02-05	35.6%		`aider --model gemini/gemini-2.0-pro-exp-02-05`	100.0%	whole
Dirname : 2025-02-25-20-23-07--gemini-pro Test cases : 225 Model : Gemini 2.0 Pro exp-02-05 Edit format : whole Commit hash : 2fccd47 Pass rate 1 : 20.4 Pass rate 2 : 35.6 Pass num 1 : 46 Pass num 2 : 80 Percent cases well formed : 100.0 Error outputs : 430 Num malformed responses : 0 Num with malformed responses : 0 User asks : 13 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 5 Total tests : 225 Command : `aider --model gemini/gemini-2.0-pro-exp-02-05` Date : 2025-02-25 Versions : 0.75.2.dev Seconds per case : 34.8 Total cost : 0.0
	Grok 3 Mini Beta (low)	34.7%	$0.79	`aider --model openrouter/x-ai/grok-3-mini-beta`	100.0%	whole
Dirname : 2025-04-10-18-47-24--grok3-mini-whole-exuser Test cases : 225 Model : Grok 3 Mini Beta (low) Edit format : whole Commit hash : 14ffe77-dirty Pass rate 1 : 11.1 Pass rate 2 : 34.7 Pass num 1 : 25 Pass num 2 : 78 Percent cases well formed : 100.0 Error outputs : 3 Num malformed responses : 0 Num with malformed responses : 0 User asks : 73 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 5 Total tests : 225 Command : `aider --model openrouter/x-ai/grok-3-mini-beta` Date : 2025-04-10 Versions : 0.81.2.dev Seconds per case : 35.1 Total cost : 0.7856
	o1-mini-2024-09-12	32.9%	$18.58	`aider --model o1-mini`	96.9%	whole
Dirname : 2024-12-22-21-26-35--polyglot-o1mini-whole Test cases : 225 Model : o1-mini-2024-09-12 Edit format : whole Commit hash : 37df899 Pass rate 1 : 5.8 Pass rate 2 : 32.9 Pass num 1 : 13 Pass num 2 : 74 Percent cases well formed : 96.9 Error outputs : 8 Num malformed responses : 8 Num with malformed responses : 7 User asks : 27 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model o1-mini` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 34.7 Total cost : 18.577
	gpt-4.1-mini	32.4%	$1.99	`aider --model gpt-4.1-mini`	92.4%	diff
Dirname : 2025-04-14-21-27-53--gpt41mini-diff Test cases : 225 Model : gpt-4.1-mini Edit format : diff Commit hash : ffb743e-dirty Pass rate 1 : 11.1 Pass rate 2 : 32.4 Pass num 1 : 25 Pass num 2 : 73 Percent cases well formed : 92.4 Error outputs : 64 Num malformed responses : 62 Num with malformed responses : 17 User asks : 159 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 2 Test timeouts : 2 Total tests : 225 Command : `aider --model gpt-4.1-mini` Date : 2025-04-14 Versions : 0.81.4.dev Seconds per case : 19.5 Total cost : 1.9918
	claude-3-5-haiku-20241022	28.0%	$6.06	`aider --model claude-3-5-haiku-20241022`	91.1%	diff
Dirname : 2024-12-21-21-46-27--polyglot-haiku-diff Test cases : 225 Model : claude-3-5-haiku-20241022 Edit format : diff Commit hash : a755079-dirty Pass rate 1 : 7.1 Pass rate 2 : 28.0 Pass num 1 : 16 Pass num 2 : 63 Percent cases well formed : 91.1 Error outputs : 31 Num malformed responses : 30 Num with malformed responses : 20 User asks : 13 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 9 Total tests : 225 Command : `aider --model claude-3-5-haiku-20241022` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 31.8 Total cost : 6.0583
	chatgpt-4o-latest (2025-02-15)	27.1%	$14.37	`aider --model chatgpt-4o-latest`	93.3%	diff
Dirname : 2025-02-15-19-51-22--chatgpt4o-feb15-diff Test cases : 223 Model : chatgpt-4o-latest (2025-02-15) Edit format : diff Commit hash : 108ce18-dirty Pass rate 1 : 9.0 Pass rate 2 : 27.1 Pass num 1 : 20 Pass num 2 : 61 Percent cases well formed : 93.3 Error outputs : 66 Num malformed responses : 21 Num with malformed responses : 15 User asks : 57 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model chatgpt-4o-latest` Date : 2025-02-15 Versions : 0.74.3.dev Seconds per case : 12.4 Total cost : 14.3703
	QwQ-32B + Qwen 2.5 Coder Instruct	26.2%		`aider --model fireworks_ai/accounts/fireworks/models/qwq-32b --architect`	100.0%	architect
Dirname : 2025-03-07-15-11-27--qwq32b-arch-temp-topp-again Test cases : 225 Model : QwQ-32B + Qwen 2.5 Coder Instruct Edit format : architect Commit hash : 52162a5 Editor model : fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct Editor edit format : editor-diff Pass rate 1 : 9.8 Pass rate 2 : 26.2 Pass num 1 : 22 Pass num 2 : 59 Percent cases well formed : 100.0 Error outputs : 122 Num malformed responses : 0 Num with malformed responses : 0 User asks : 489 Lazy comments : 8 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model fireworks_ai/accounts/fireworks/models/qwq-32b --architect` Date : 2025-03-07 Versions : 0.75.3.dev Seconds per case : 137.4 Total cost : 0
	gpt-4o-2024-08-06	23.1%	$7.03	`aider --model gpt-4o-2024-08-06`	94.2%	diff
Dirname : 2024-12-30-20-44-54--gpt4o-ex-as-sys-clean-prompt Test cases : 225 Model : gpt-4o-2024-08-06 Edit format : diff Commit hash : 09ee197-dirty Pass rate 1 : 4.9 Pass rate 2 : 23.1 Pass num 1 : 11 Pass num 2 : 52 Percent cases well formed : 94.2 Error outputs : 21 Num malformed responses : 21 Num with malformed responses : 13 User asks : 65 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model gpt-4o-2024-08-06` Date : 2024-12-30 Versions : 0.70.1.dev Seconds per case : 16.0 Total cost : 7.0286
	gemini-2.0-flash-exp	22.2%		`aider --model gemini/gemini-2.0-flash-exp`	100.0%	whole
Dirname : 2024-12-22-20-08-13--gemini-2.0-flash-exp-polyglot-whole Test cases : 225 Model : gemini-2.0-flash-exp Edit format : whole Commit hash : b1bc2f8 Pass rate 1 : 11.6 Pass rate 2 : 22.2 Pass num 1 : 26 Pass num 2 : 50 Percent cases well formed : 100.0 Error outputs : 1 Num malformed responses : 0 Num with malformed responses : 0 User asks : 9 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 8 Total tests : 225 Command : `aider --model gemini/gemini-2.0-flash-exp` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 12.2 Total cost : 0.0
	qwen-max-2025-01-25	21.8%		`OPENAI_API_BASE=https://dashscope-intl.aliyuncs.com/compatible-mode/v1 aider --model openai/qwen-max-2025-01-25`	90.2%	diff
Dirname : 2025-01-28-16-00-03--qwen-max-2025-01-25-polyglot-diff Test cases : 225 Model : qwen-max-2025-01-25 Edit format : diff Commit hash : ae7d459 Pass rate 1 : 9.3 Pass rate 2 : 21.8 Pass num 1 : 21 Pass num 2 : 49 Percent cases well formed : 90.2 Error outputs : 46 Num malformed responses : 44 Num with malformed responses : 22 User asks : 23 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 9 Total tests : 225 Command : `OPENAI_API_BASE=https://dashscope-intl.aliyuncs.com/compatible-mode/v1 aider --model openai/qwen-max-2025-01-25` Date : 2025-01-28 Versions : 0.72.4.dev Seconds per case : 39.5
	QwQ-32B	20.9%		`aider --model fireworks_ai/accounts/fireworks/models/qwq-32b`	67.6%	diff
Dirname : 2025-03-06-17-40-24--qwq32b-diff-temp-topp-ex-sys-remind-user-for-real Test cases : 225 Model : QwQ-32B Edit format : diff Commit hash : 51d118f-dirty Pass rate 1 : 8.0 Pass rate 2 : 20.9 Pass num 1 : 18 Pass num 2 : 47 Percent cases well formed : 67.6 Error outputs : 145 Num malformed responses : 143 Num with malformed responses : 73 User asks : 17 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 4 Total tests : 225 Command : `aider --model fireworks_ai/accounts/fireworks/models/qwq-32b` Date : 2025-03-06 Versions : 0.75.3.dev Seconds per case : 228.6 Total cost : 0.0
	gemini-2.0-flash-thinking-exp-01-21	18.2%		`aider --model gemini/gemini-2.0-flash-thinking-exp-01-21`	77.8%	diff
Dirname : 2025-01-21-22-51-49--gemini-2.0-flash-thinking-exp-01-21-polyglot-diff Test cases : 225 Model : gemini-2.0-flash-thinking-exp-01-21 Edit format : diff Commit hash : 843720a Pass rate 1 : 5.8 Pass rate 2 : 18.2 Pass num 1 : 13 Pass num 2 : 41 Percent cases well formed : 77.8 Error outputs : 182 Num malformed responses : 180 Num with malformed responses : 50 User asks : 26 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 2 Test timeouts : 7 Total tests : 225 Command : `aider --model gemini/gemini-2.0-flash-thinking-exp-01-21` Date : 2025-01-21 Versions : 0.72.2.dev Seconds per case : 24.2 Total cost : 0.0
	gpt-4o-2024-11-20	18.2%	$6.74	`aider --model gpt-4o-2024-11-20`	95.1%	diff
Dirname : 2024-12-30-20-57-12--gpt-4o-2024-11-20-ex-as-sys Test cases : 225 Model : gpt-4o-2024-11-20 Edit format : diff Commit hash : 09ee197-dirty Pass rate 1 : 4.9 Pass rate 2 : 18.2 Pass num 1 : 11 Pass num 2 : 41 Percent cases well formed : 95.1 Error outputs : 12 Num malformed responses : 12 Num with malformed responses : 11 User asks : 53 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 12 Total tests : 225 Command : `aider --model gpt-4o-2024-11-20` Date : 2024-12-30 Versions : 0.70.1.dev Seconds per case : 12.1 Total cost : 6.7351
	DeepSeek Chat V2.5	17.8%	$0.51	`aider --model deepseek/deepseek-chat`	92.9%	diff
Dirname : 2024-12-21-20-56-21--polyglot-deepseek-diff Test cases : 225 Model : DeepSeek Chat V2.5 Edit format : diff Commit hash : a755079-dirty Pass rate 1 : 5.3 Pass rate 2 : 17.8 Pass num 1 : 12 Pass num 2 : 40 Percent cases well formed : 92.9 Error outputs : 42 Num malformed responses : 37 Num with malformed responses : 16 User asks : 23 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 5 Test timeouts : 5 Total tests : 225 Command : `aider --model deepseek/deepseek-chat` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 184.0 Total cost : 0.5101
	Qwen2.5-Coder-32B-Instruct	16.4%		`aider --model openai/Qwen2.5-Coder-32B-Instruct`	99.6%	whole
Dirname : 2024-12-26-00-55-20--Qwen2.5-Coder-32B-Instruct Test cases : 225 Model : Qwen2.5-Coder-32B-Instruct Edit format : whole Commit hash : b51768b0 Pass rate 1 : 4.9 Pass rate 2 : 16.4 Pass num 1 : 11 Pass num 2 : 37 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 33 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 6 Total tests : 225 Command : `aider --model openai/Qwen2.5-Coder-32B-Instruct` Date : 2024-12-26 Versions : 0.69.2.dev Seconds per case : 42.0 Total cost : 0.0
	Llama 4 Maverick	15.6%		`aider --model nvidia_nim/meta/llama-4-maverick-17b-128e-instruct`	99.1%	whole
Dirname : 2025-04-06-08-39-52--llama-4-maverick-17b-128e-instruct-polyglot-whole Test cases : 225 Model : Llama 4 Maverick Edit format : whole Commit hash : 9445a31 Pass rate 1 : 4.4 Pass rate 2 : 15.6 Pass num 1 : 10 Pass num 2 : 35 Percent cases well formed : 99.1 Error outputs : 12 Num malformed responses : 2 Num with malformed responses : 2 User asks : 248 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model nvidia_nim/meta/llama-4-maverick-17b-128e-instruct` Date : 2025-04-06 Versions : 0.81.2.dev Seconds per case : 20.5 Total cost : 0.0
	yi-lightning	12.9%		`aider --model openai/yi-lightning`	92.9%	whole
Dirname : 2024-12-23-01-11-56--yi-test Test cases : 225 Model : yi-lightning Edit format : whole Commit hash : 2b1625e Pass rate 1 : 5.8 Pass rate 2 : 12.9 Pass num 1 : 13 Pass num 2 : 29 Percent cases well formed : 92.9 Error outputs : 87 Num malformed responses : 72 Num with malformed responses : 16 User asks : 107 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 6 Total tests : 225 Command : `aider --model openai/yi-lightning` Date : 2024-12-23 Versions : 0.69.2.dev Seconds per case : 146.7 Total cost : 0.0
	command-a-03-2025-quality	12.0%		`OPENAI_API_BASE=https://api.cohere.ai/compatibility/v1 aider --model openai/command-a-03-2025-quality`	99.6%	whole
Dirname : 2025-03-14-23-40-00--cmda-quality-whole2 Test cases : 225 Model : command-a-03-2025-quality Edit format : whole Commit hash : a1aa63f Pass rate 1 : 2.2 Pass rate 2 : 12.0 Pass num 1 : 5 Pass num 2 : 27 Percent cases well formed : 99.6 Error outputs : 2 Num malformed responses : 1 Num with malformed responses : 1 User asks : 215 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 4 Total tests : 225 Command : `OPENAI_API_BASE=https://api.cohere.ai/compatibility/v1 aider --model openai/command-a-03-2025-quality` Date : 2025-03-14 Versions : 0.77.1.dev Seconds per case : 85.1 Total cost : 0.0
	Codestral 25.01	11.1%	$1.98	`aider --model mistral/codestral-latest`	100.0%	whole
Dirname : 2025-01-13-18-17-25--codestral-whole2 Test cases : 225 Model : Codestral 25.01 Edit format : whole Commit hash : 0cba898-dirty Pass rate 1 : 4.0 Pass rate 2 : 11.1 Pass num 1 : 9 Pass num 2 : 25 Percent cases well formed : 100.0 Error outputs : 0 Num malformed responses : 0 Num with malformed responses : 0 User asks : 47 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model mistral/codestral-latest` Date : 2025-01-13 Versions : 0.71.2.dev Seconds per case : 9.3 Total cost : 1.9834
	openhands-lm-32b-v0.1	10.2%		`aider --model openrouter/all-hands/openhands-lm-32b-v0.1`	95.1%	whole
Dirname : 2025-04-19-14-43-04--o4-mini-patch Test cases : 225 Model : openhands-lm-32b-v0.1 Edit format : whole Commit hash : c08336f Pass rate 1 : 4.0 Pass rate 2 : 10.2 Pass num 1 : 9 Pass num 2 : 23 Percent cases well formed : 95.1 Error outputs : 55 Num malformed responses : 41 Num with malformed responses : 11 User asks : 166 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 11 Total tests : 225 Command : `aider --model openrouter/all-hands/openhands-lm-32b-v0.1` Date : 2025-04-19 Versions : 0.82.2.dev Seconds per case : 195.6 Total cost : 0.0
	gpt-4.1-nano	8.9%	$0.43	`aider --model gpt-4.1-nano`	94.2%	whole
Dirname : 2025-04-14-22-46-01--gpt41nano-diff Test cases : 225 Model : gpt-4.1-nano Edit format : whole Commit hash : 71d1591-dirty Pass rate 1 : 3.1 Pass rate 2 : 8.9 Pass num 1 : 7 Pass num 2 : 20 Percent cases well formed : 94.2 Error outputs : 20 Num malformed responses : 20 Num with malformed responses : 13 User asks : 316 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 8 Total tests : 225 Command : `aider --model gpt-4.1-nano` Date : 2025-04-14 Versions : 0.81.4.dev Seconds per case : 12.0 Total cost : 0.4281
	Qwen2.5-Coder-32B-Instruct	8.0%		`aider --model openai/Qwen/Qwen2.5-Coder-32B-Instruct # via hyperbolic`	71.6%	diff
Dirname : 2024-12-22-13-22-32--polyglot-qwen-diff Test cases : 225 Model : Qwen2.5-Coder-32B-Instruct Edit format : diff Commit hash : 6d7e8be-dirty Pass rate 1 : 4.4 Pass rate 2 : 8.0 Pass num 1 : 10 Pass num 2 : 18 Percent cases well formed : 71.6 Error outputs : 158 Num malformed responses : 148 Num with malformed responses : 64 User asks : 132 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model openai/Qwen/Qwen2.5-Coder-32B-Instruct # via hyperbolic` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 84.4 Total cost : 0.0
	gemma-3-27b-it	4.9%		`aider --model openrouter/google/gemma-3-27b-it`	100.0%	whole
Dirname : 2025-03-15-01-21-24--gemma3-27b-or Test cases : 225 Model : gemma-3-27b-it Edit format : whole Commit hash : fd21f51-dirty Pass rate 1 : 1.8 Pass rate 2 : 4.9 Pass num 1 : 4 Pass num 2 : 11 Percent cases well formed : 100.0 Error outputs : 3 Num malformed responses : 0 Num with malformed responses : 0 User asks : 181 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 3 Total tests : 225 Command : `aider --model openrouter/google/gemma-3-27b-it` Date : 2025-03-15 Versions : 0.77.1.dev Seconds per case : 79.7 Total cost : 0.0
	gpt-4o-mini-2024-07-18	3.6%	$0.32	`aider --model gpt-4o-mini-2024-07-18`	100.0%	whole
Dirname : 2024-12-21-18-41-18--polyglot-gpt-4o-mini Test cases : 225 Model : gpt-4o-mini-2024-07-18 Edit format : whole Commit hash : a755079-dirty Pass rate 1 : 0.9 Pass rate 2 : 3.6 Pass num 1 : 2 Pass num 2 : 8 Percent cases well formed : 100.0 Error outputs : 0 Num malformed responses : 0 Num with malformed responses : 0 User asks : 36 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model gpt-4o-mini-2024-07-18` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 17.3 Total cost : 0.3236

By Paul Gauthier, last updated November 20, 2025.
