asedmammad/Contextual_KTO_Mistral_PairRM-GGUF
GGUF
snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
English
kto
dpo
human feedback
rlhf
preferences
alignment
HALO
halos
rl
rlaif
conversational
arxiv: 2402.01306
License: apache-2.0
Create README.md #1 by asedmammad - opened Mar 11
base: refs/heads/main ← from: refs/pr/1
Files changed: +321 -0
asedmammad (Owner), Mar 11: No description provided.
Create README.md 29766fc3
asedmammad changed pull request status to merged (Mar 11)