StableLM-WI-DPO / checkpoint-45 / trainer_state.json

Commit History

DPO on llm-feedback v1 dataset
3d648bb

JayanthB committed on