Pull requests: EleutherAI/lm-evaluation-harness
Add support for configurable chrF metric parameters in task YAML, fix...
#3363
opened Oct 23, 2025 by
augustlakia
[AIME24 | AIME25] Enable Multiple Generation Repeats with Pass@k and Majority@k Metrics
#3351
opened Oct 17, 2025 by
ihebchaa
Delegate BOS to the tokenizer; add_bos_token defaults to None
#3347
opened Oct 15, 2025 by
baberabb
Fix PIL image hashing to use actual bytes instead of object repr
#3331
opened Oct 7, 2025 by
tboerstad
feat: Add support for accelerate-wrapped models in simple_evaluate()
#3313
opened Sep 26, 2025 by
DhruvaKashyap
Support empty response for Completions and ChatCompletions API
#3309
opened Sep 22, 2025 by
tboerstad
Adding New Task SLR-Bench : Scalable Logical Reasoning Benchmark
#3305
opened Sep 20, 2025 by
Ahmad21Omar
Add long-context evaluation benchmarks (LongBench v2, Babilong, InfiniteBench, Phonebook)
#3256
opened Aug 21, 2025 by
Mariani-code
Trim thinking content from model output in IFEval
#3240
opened Aug 14, 2025 by
davideguidobene