Haoxiang Wang

Haoxiang-Wang

https://haoxiang-wang.github.io/

AI & ML interests

Machine Learning (Transfer Learning, OOD Generalization, Domain Adaptation, Meta-Learning)

Recent Activity

liked a dataset 1 day ago

Xkev/LLaVA-CoT-100k

updated a model 20 days ago

nvidia/Cosmos-Tokenizer-DV8x16x16

updated a model 20 days ago

nvidia/Cosmos-Tokenizer-DV8x8x8

Haoxiang-Wang's activity

New activity in nvidia/Cosmos-Tokenizer-DV8x16x16 about 1 month ago

Update README.md

#1 opened about 1 month ago by

Haoxiang-Wang

New activity in sfairXC/FsfairX-LLaMA3-RM-v0.1 about 2 months ago

Update README.md

#6 opened about 2 months ago by

Haoxiang-Wang

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 3 months ago

Why is the code-complexity coefficient so high in the demo example?

#16 opened 3 months ago by

icdt

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 4 months ago

Special tokens in the vocabulary?

#13 opened 4 months ago by

nshen7

Original reward space

#15 opened 4 months ago by

anjaa

[AUTOMATED] Model Memory Requirements

#5 opened 6 months ago by

model-sizer-bot

What is the range of the output score from the model?

#12 opened 4 months ago by

nshen7

Why is `multi_obj_rewards` multiplied by 5, but then 0.5 is subtracted from it?

#11 opened 5 months ago by

xzuyn

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 5 months ago

Update README.md

#3 opened 6 months ago by

philschmid

Issue when finetuning the reward model on custom dataset

#2 opened 6 months ago by

yguooo

Longer context

#10 opened 5 months ago by

salazaaar

batched predictions with padding through the model don't seem to work correctly

#7 opened 5 months ago by

karthikramen

ModuleNotFoundError: No module named 'transformers_modules.RLHFlow.ArmoRM-Llama3-8B-v0'

#6 opened 5 months ago by

fchaubard

Why Not Utilize a Sigmoid Function in the Regression Layer?

#8 opened 5 months ago by

xwz-xmu

New activity in allenai/reward-bench 6 months ago

Separate Scores: With & Without Prior Sets

#6 opened 6 months ago by

Haoxiang-Wang

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 6 months ago

Problem running the model

#1 opened 6 months ago by

Asaf-Yehudai

New activity in RLHFlow/LLaMA3-iterative-DPO-final 6 months ago

exl2 quants

#2 opened 6 months ago by

Apel-sin

New activity in RLHFlow/pair-preference-model-LLaMA3-8B 6 months ago

Can you specify the license for this model, please?

#1 opened 6 months ago by

sparsh35

commented a paper 7 months ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 67

New activity in prometheus-eval/Feedback-Bench 8 months ago

Data Description

#2 opened 8 months ago by

Haoxiang-Wang