Zephyr ORPO - a HuggingFaceH4 Collection

Zephyr ORPO

Updated Apr 12

Models and datasets for aligning LLMs with Odds Ratio Preference Optimization (ORPO). Recipes are available in the alignment handbook: https://github.com/huggingface/alignment-handbook

ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12 • 62

HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1

Text Generation • Updated Apr 18 • 482 • 261

argilla/distilabel-capybara-dpo-7k-binarized

Viewer • Updated Jul 16 • 7.56k • 992 • 176