Models and datasets from the paper "WPO: Enhancing RLHF with Weighted Preference Optimization".
Wenxuan Zhou
wzhouad
Recent Activity
liked a dataset about 1 month ago: MUIRBENCH/MUIRBENCH
authored a paper about 2 months ago: SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe
upvoted a paper about 2 months ago: SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe
Collections: 1
Models: 8
wzhouad/Llama3-Instruct-8B-WPO-HB-v2 • Text Generation • 29 • 4
wzhouad/Llama3-Instruct-8B-WPO-HB • Text Generation • 7
wzhouad/zephyr-7B-WPO-HB • Text Generation • 6
wzhouad/gemma-2-9b-it-WPO-HB • Text Generation • 472 • 30
wzhouad/gemma-2-9b-it-WPO-FP • Text Generation • 13
wzhouad/zephyr-7B-WPO-FP • Text Generation • 8
wzhouad/Llama3-Instruct-8B-WPO-FP • Text Generation • 7
wzhouad/prix-lm • Text Generation • 19