Pull requests: EleutherAI/lm-evaluation-harness

Pull requests list

Add BHS benchmark
#3265 opened Aug 25, 2025 by jmichaelov
Fix Afrobench MasakhanePOS
#3263 opened Aug 25, 2025 by Anri-Lombard
Adding SPaRC to lm eval harness
#3262 opened Aug 25, 2025 by lkaesberg
fix gsm8k normalization
#3254 opened Aug 20, 2025 by huaanrui
Main
#3250 opened Aug 20, 2025 by seongtaehong
Adding 3LM to lm eval harness
#3241 opened Aug 14, 2025 by GeorgeSherif
Trim thinking content from model output in IFEval
#3240 opened Aug 14, 2025 by davideguidobene
Remove gen_prefix space and add warning
#3239 opened Aug 14, 2025 by bendboaz
Adding support for Structured Generation with XGrammar
#3232 opened Aug 12, 2025 by ceferisbarov
5 tasks
Pass dataset_kwargs for Unitxt tasks
#3230 opened Aug 11, 2025 by mprahl
Fewshot refactor
#3227 opened Aug 8, 2025 by baberabb
1 task
Fix the Unitxt init method to set the task name
#3225 opened Aug 8, 2025 by mprahl
Update openai_completions.py
#3215 opened Aug 7, 2025 by phseidl
add intel xpu support for HFLM
#3211 opened Aug 5, 2025 by kaixuanliu
Support for DDP+MP with native torch and no accelerate
#3205 opened Aug 3, 2025 by xgal
Add new task: kmmlu_pro, kmmlu_redux
#3198 opened Aug 1, 2025 by jeonghodot
refactor registry
#3189 opened Jul 28, 2025 by baberabb
Add eqbench tasks in Spanish and Catalan
#3168 opened Jul 21, 2025 by priverabsc