Finetuning Scripts

#5
by abrakaa - opened

Does anyone happen to have the fine-tuning scripts for this model? What is the minimum GPU requirement for fine-tuning?

Microsoft org
edited May 22

@abrakaa Thanks for your interest in our model. We may open-source the finetuning script in the near future.

@donniems We need it soon :)

@donniems looking forward to it

@donniems Need it now! Really looking forward to it!

@donniems Can't wait to finetune on my own data!

I wrote a finetuning script for Phi3-V: https://github.com/GaiZhenbiao/Phi3V-Finetuning/tree/main , enjoy 😉

https://github.com/2U1/Phi3-Vision-ft

I've written code with an option to fine-tune the full model (including the vision model), like LLaVA-1.6!

Microsoft org

Thank you all for your interest in the Phi-3 Vision model.
Here is the finetuning recipe: https://github.com/microsoft/Phi-3CookBook/blob/main/md/04.Fine-tuning/FineTuning_Vision.md

@nguyenbh Hi, thank you so much for the finetuning example! The example states that the average token count for the DocVQA dataset is about 2XXX. If an input is much longer than that, would it cause an OOM error? Does the script have any protection against that?

Microsoft org

@eddtsoi There's no explicit code for handling OOM. Note that the 2k average token count includes the image tokens. In cases that don't require high resolution, you can finetune with a lower --num_crops to reduce the sequence length. For full finetuning, we have tested on 4x 48GB and 8x 32GB GPUs. With (Q)LoRA, I believe a single consumer-level 24GB GPU works.
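
Since the recipe has no explicit OOM handling, one simple guard is to drop over-length samples before training. Below is a minimal sketch, not part of the official script. It assumes each sample is a dict with an `input_ids` list that already includes the image tokens; `filter_by_length` and `max_tokens` are hypothetical names chosen for illustration, and the right cutoff depends on your GPU memory and batch size.

```python
from typing import Dict, List, Sequence


def filter_by_length(
    samples: Sequence[Dict[str, List[int]]],
    max_tokens: int,
) -> List[Dict[str, List[int]]]:
    """Keep only samples whose tokenized length is at most max_tokens.

    Assumption: sample["input_ids"] holds the full sequence, image tokens included.
    """
    kept = []
    for sample in samples:
        if len(sample["input_ids"]) <= max_tokens:
            kept.append(sample)
    return kept


if __name__ == "__main__":
    # Toy data: one sample near the quoted 2k average, one unusually long sample.
    toy = [{"input_ids": [0] * 2000}, {"input_ids": [0] * 9000}]
    short_enough = filter_by_length(toy, max_tokens=4096)
    print(f"kept {len(short_enough)} of {len(toy)} samples")
```

Filtering loses the long samples entirely, so if they matter for your task, lowering --num_crops as suggested above may be the better first step.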
