
Shengyi Costa Huang

vwxyzjn

http://costa.sh

AI & ML interests

None yet

Recent Activity

updated a model 1 day ago

allenai/OLMo-2-1124-7B-RM

updated a model 1 day ago

allenai/OLMo-2-1124-13B-Instruct

updated a model 1 day ago

allenai/OLMo-2-1124-7B-Instruct



Organizations

Collections 3

Papers 5

spaces 3

🔥 Aim • Runtime error

😻 Vwxyzjn Testyes4 • Sleeping

📊 Pyserini Wikipedia Kilt Doc • Runtime error

models 389

vwxyzjn/rm_zephyr_new

Text Classification • Updated Sep 26 • 10

vwxyzjn/online_dpo_vllm_thread_beta_0.03__allenai_open_instruct_dev

Updated Sep 11

vwxyzjn/reward_modeling__EleutherAI_pythia-14m

Updated Aug 24 • 5

vwxyzjn/online_dpo_vllm__vwxyzjn_btulu

Updated Aug 23 • 4

vwxyzjn/online_dpo_vllm__allenai_llama-3-tulu-2-8b

Updated Aug 19 • 5

vwxyzjn/btulu

Text Generation • Updated Aug 19 • 393

vwxyzjn/online_dpo_tulu_2

Text Generation • Updated Aug 19 • 9

vwxyzjn/gkd-model

Updated Aug 15

vwxyzjn/reward_modeling__allenai_llama-3-tulu-2-8b

Updated Aug 11 • 5

vwxyzjn/online_dpo__cleanrl_EleutherAI_pythia-1b-dedupedsfttldr

Updated Aug 9

datasets 282

vwxyzjn/norobot_pref_4860

Viewer • Updated Oct 2 • 59.9k • 69

vwxyzjn/norobot_generation_4860

Viewer • Updated Oct 2 • 29.9k • 44

vwxyzjn/norobot_pref_465

Viewer • Updated Oct 2 • 59.4k • 24

vwxyzjn/norobot_generation_465

Viewer • Updated Oct 2 • 29.7k • 33

vwxyzjn/norobot_generation_16325

Viewer • Updated Oct 2 • 29.7k • 51

vwxyzjn/norobot_pref_11421

Viewer • Updated Oct 2 • 56.1k • 25

vwxyzjn/norobot_generation_11421

Viewer • Updated Oct 2 • 28k • 51

vwxyzjn/rejection_sampling_scores_1727889563

Viewer • Updated Oct 2 • 240 • 16

vwxyzjn/rejection_sampling_1727889563

Viewer • Updated Oct 2 • 60 • 27

vwxyzjn/rejection_sampling_scores_1727889130

Viewer • Updated Oct 2 • 180 • 4

Articles

How NuminaMath Won the 1st AIMO Progress Prize

Preference Optimization for Vision Language Models

Putting RL back in RLHF

Constitutional AI with Open LLMs

The N Implementation Details of RLHF with PPO
