Pull requests: RLHFlow/RLHF-Reward-Modeling

Pull requests list

Update eval_reward_bench_pm.py
#56 by HarmanDotpy was merged Apr 24, 2025
Update gemma_two_head.py
#47 by Lichang-Chen was merged Dec 10, 2024
Update README.md
#45 by Chenluye99 was merged Nov 19, 2024
Update deepseek Top-1 acc on MATH
#44 by hanningzhang was merged Nov 18, 2024
Update README.md of Deepseek Pass 1 acc
#43 by hanningzhang was closed Nov 17, 2024
Rlhflow math
#42 by WeiXiongUST was merged Nov 9, 2024
add experiment setup and results for the math prm
#41 by hanningzhang was merged Nov 9, 2024
ODIN
#38 by Lichang-Chen was merged Nov 4, 2024
Add RRM augmentation
#34 by TerenceLiu4444 was merged Sep 18, 2024
Semi-Supervised Reward Modeling (SSRM)
#31 by yifei-he was merged Sep 12, 2024
Pairwise preference model dev
#7 by WeiXiongUST was merged May 11, 2024
re-organize code
#5 by WeiXiongUST was merged Apr 29, 2024
Update eval_bench_mark.py
#4 by ZizhengYang was closed Nov 9, 2024
Update eval_bench_mark.py allow use bf16 or f32
#3 by ZizhengYang was closed Nov 9, 2024