StableLM-WI-DPO / checkpoint-45
JayanthB
DPO on llm-feedback v1 dataset
3d648bb