RyanYr/reward-judge_SFT-genRM_pilot-exp
Text Generation
Transformers
Safetensors
llama
trl
sft
Generated from Trainer
conversational
text-generation-inference
Inference Endpoints
License: llama3.1
main
Commit History
Model save (f703c71, verified)
RyanYr committed on Sep 12
Training in progress, step 100 (fcf0820, verified)
RyanYr committed on Sep 12
initial commit (ba4fae3, verified)
RyanYr committed on Sep 12