GRM - a Ray2333 Collection

GRM

updated 6 days ago

Generalizable Reward Models

Ray2333/GRM-llama3-8B-sftreg

Text Classification • Updated Oct 25 • 76 • 5

Ray2333/GRM-llama3-8B-distill

Text Classification • Updated Jul 6 • 258 • 6

Ray2333/GRM-Gemma-2B-sftreg

Text Classification • Updated Oct 23 • 44 • 3

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs

Paper • 2406.10216 • Published Jun 14 • 2

Ray2333/GRM-Gemma-2B-rewardmodel-ft

Updated Sep 13 • 1.21k • 1

Ray2333/GRM-Llama3-8B-rewardmodel-ft

Updated about 16 hours ago • 567 • 1

Ray2333/GRM-llama3.2-3B-sftreg

Text Classification • Updated Oct 23 • 79 • 1

Ray2333/GRM-Gemma2-2B-sftreg

Text Classification • Updated Oct 23 • 68 • 1

Ray2333/GRM-Llama3.2-3B-rewardmodel-ft

Text Classification • Updated about 16 hours ago • 4.01k • 2

Ray2333/GRM-gemma2-2B-rewardmodel-ft

Text Classification • Updated about 16 hours ago • 1.81k • 3

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

Text Classification • Updated about 16 hours ago • 43