vain05/stablelm-2-1_6b-orpo-full-v3
Text Generation
Transformers
Safetensors
argilla/ultrafeedback-binarized-preferences-cleaned
argilla/distilabel-capybara-dpo-7k-binarized
stablelm
alignment-handbook
trl
orpo
Generated from Trainer
conversational
Inference Endpoints
License: other
stablelm-2-1_6b-orpo-full-v3/model.safetensors
Commit History
cdd729d  Training in progress, step 1200 (verified; vain05 committed on Apr 7)
925296f  Training in progress, step 1100 (verified; vain05 committed on Apr 7)
fb236e5  Training in progress, step 1000 (verified; vain05 committed on Apr 7)
7f38bae  Training in progress, step 900 (verified; vain05 committed on Apr 7)
b00f5d2  Training in progress, step 700 (verified; vain05 committed on Apr 7)
c7d824d  Training in progress, step 600 (verified; vain05 committed on Apr 7)
ec6ac1b  Training in progress, step 500 (verified; vain05 committed on Apr 7)
d5cfd9a  Training in progress, step 400 (verified; vain05 committed on Apr 7)
0d6dc61  Training in progress, step 300 (verified; vain05 committed on Apr 7)
30bf572  Training in progress, step 200 (verified; vain05 committed on Apr 7)
77e9f61  Training in progress, step 100 (verified; vain05 committed on Apr 7)
703dc7c  Training in progress, step 100 (verified; vain05 committed on Apr 7)