🦫 We have just released argilla/Capybara-Preferences in collaboration with KAIST AI (@JW17, @nlee-208) and Hugging Face (@lewtun)
A new synthetic preference dataset built using distilabel on top of the awesome LDJnr/Capybara from @LDJnr
The current dataset combines the alternative completions already generated for argilla/distilabel-capybara-dpo-7k-binarized with completions for the remaining conversations, generated using the same approach!
Here are the key steps in how we built it:
- 🧹 Duplicate removal, keeping each conversation except its last assistant response, plus some light pre-processing
- 🤖 Generation of alternative completions for the existing conversations (last turn only) with mlabonne/NeuralBeagle14-7B, argilla/notus-7b-v1, and teknium/OpenHermes-2.5-Mistral-7B
- 👨🏻‍🏫 Running UltraFeedback via GPT-4 to generate the critique, i.e. ratings and rationales, for the last assistant responses
- Finally, we selected the chosen and rejected responses based on their UltraFeedback score, and applied some light post-processing!
Sounds simple, right? Start building your own synthetic datasets with https://github.com/argilla-io/distilabel today!
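To make the deduplication and selection steps more concrete, here is a minimal plain-Python sketch. It is not the distilabel pipeline we actually ran. The field names (`messages`, `role`, `content`), the helper names, and the rule "highest rating is chosen, lowest is rejected, ties are skipped" are all assumptions made for illustration.

```python
from typing import Optional


def conversation_key(messages: list[dict]) -> tuple:
    """Build a hashable key from every turn except the last assistant response.

    Assumption: two rows are duplicates when they share the same conversation
    up to (but excluding) the final assistant turn.
    """
    if messages and messages[-1]["role"] == "assistant":
        messages = messages[:-1]
    return tuple((m["role"], m["content"].strip()) for m in messages)


def deduplicate(rows: list[dict]) -> list[dict]:
    """Keep the first row seen for each conversation key."""
    seen = set()
    unique = []
    for row in rows:
        key = conversation_key(row["messages"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def binarize(completions: list[str], ratings: list[float]) -> Optional[dict]:
    """Pick a chosen/rejected pair from rated completions.

    Assumption: the top-rated completion becomes `chosen` and the
    lowest-rated one becomes `rejected`; rows without a usable
    preference (fewer than two completions, or a tie) return None.
    """
    if len(completions) != len(ratings) or len(completions) < 2:
        return None
    ranked = sorted(zip(ratings, completions), key=lambda pair: pair[0])
    (low, rejected), (high, chosen) = ranked[0], ranked[-1]
    if high == low:
        return None
    return {
        "chosen": chosen,
        "rejected": rejected,
        "rating_chosen": high,
        "rating_rejected": low,
    }
```

In this sketch, `deduplicate` would run before generating the alternative completions, and `binarize` would run once per conversation after the UltraFeedback ratings are available.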