Fine-tuning and DPO
#2
by
agershun
Could you share your thoughts on these questions:
- Is it possible to do minor fine-tuning and DPO training on this model?
- Which packages are best to use?
- How much GPU memory do I need for LoRA with this model? Is an A100 40 GB enough?
Thank you!
Yes, it will work great to fine-tune. I recommend using Hugging Face TRL.
It trains fine, thank you!
I adapted the code from this article and then modified the prompts to the Starling-LM format.
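The prompt adaptation can be sketched like this. The template below is my reading of the Starling-LM (OpenChat-style) chat format; verify it against the official model card before training:

```python
# Sketch of wrapping a user message in the Starling-LM chat template.
# The template string is an assumption based on the model card.

def to_starling_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the Starling-LM chat template."""
    return (
        f"GPT4 Correct User: {user_message}<|end_of_turn|>"
        "GPT4 Correct Assistant:"
    )

print(to_starling_prompt("What is DPO?"))
# GPT4 Correct User: What is DPO?<|end_of_turn|>GPT4 Correct Assistant:
```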
Nice work, I am glad you figured it out, let me know if you have any questions. Thanks for your support!
I finished with DPO. It also works fine with this model.