Rui Yang

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Recent Activity

updated a model about 4 hours ago

Ray2333/GRM-Llama3-8B-rewardmodel-ft

updated a model about 4 hours ago

Ray2333/GRM-gemma2-2B-rewardmodel-ft

updated a model about 4 hours ago

Ray2333/GRM-Llama3.2-3B-rewardmodel-ft

Organizations

Collections 1

Papers 4

arxiv:2411.00836

arxiv:2406.10216

arxiv:2402.10207

arxiv:2310.12955

Models 15

Ray2333/GRM-Llama3-8B-rewardmodel-ft

Updated about 4 hours ago • 566 • 1

Ray2333/GRM-gemma2-2B-rewardmodel-ft

Text Classification • Updated about 4 hours ago • 1.74k • 3

Ray2333/GRM-Llama3.2-3B-rewardmodel-ft

Text Classification • Updated about 4 hours ago • 3.93k • 2

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

Text Classification • Updated about 4 hours ago • 43

Ray2333/GRM-llama3-8B-sftreg

Text Classification • Updated Oct 25 • 77 • 5

Ray2333/GRM-Gemma2-2B-sftreg

Text Classification • Updated Oct 23 • 65 • 1

Ray2333/GRM-Gemma-2B-sftreg

Text Classification • Updated Oct 23 • 45 • 3

Ray2333/GRM-llama3.2-3B-sftreg

Text Classification • Updated Oct 23 • 76 • 1

Ray2333/Gemma-2B-rewardmodel-ft

Updated Sep 13 • 398 • 1

Ray2333/GRM-Gemma-2B-rewardmodel-ft

Updated Sep 13 • 1.22k • 1

Datasets 1

Ray2333/RiC_harmless_helpful

Viewer • Updated Jul 12 • 291k • 85