Fine-tuning with QLoRA: overfitting
Hello,
I am fine-tuning a model using QLoRA, but I'm encountering overfitting issues. Increasing the dropout hasn't significantly improved the performance on the test set. I would appreciate any advice on how to mitigate overfitting. The project involves an OCR task aimed at extracting specific fields, and the model is particularly struggling with extracting addresses. Thank you for your help.
Hi @RicoRausch
That sounds like a general question (i.e. not specific to idefics2 itself) that would be better suited to the discussion forum (https://discuss.huggingface.co/). I could find a few discussions on overfitting there.
Generally speaking, and without knowing too much about your problem, here are a few things you can explore: a larger weight decay, fine-tuning fewer parameters (for example a lower LoRA rank or fewer adapted modules), and early stopping.
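To make the early stopping suggestion concrete, here is a minimal, framework-agnostic sketch in plain Python. The names `EarlyStopper` and `run_with_early_stopping` and the loss values are made up for illustration; they are not taken from your setup or from any library. The idea is to evaluate on a held-out validation set at regular intervals and stop once the validation loss has not improved for `patience` evaluations in a row, keeping the checkpoint from the best step.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class EarlyStopper:
    """Hypothetical helper: tracks validation loss and signals when to stop."""

    patience: int = 3            # evaluations to wait without improvement
    min_delta: float = 0.0       # smallest decrease that counts as improvement
    best_loss: float = float("inf")
    best_step: Optional[int] = None
    bad_evals: int = 0

    def update(self, step: int, val_loss: float) -> bool:
        """Record one evaluation; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember this step and reset the counter.
            self.best_loss = val_loss
            self.best_step = step
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


def run_with_early_stopping(val_losses: Sequence[float], patience: int = 3):
    """Replay a list of validation losses; return (stopper, step where we stopped)."""
    stopper = EarlyStopper(patience=patience)
    stopped_at = None
    for step, loss in enumerate(val_losses):
        if stopper.update(step, loss):
            stopped_at = step
            break
    return stopper, stopped_at


# Illustrative (made-up) validation losses: they improve, then get worse
# while the training loss would typically keep falling (overfitting).
losses = [1.20, 0.90, 0.75, 0.78, 0.81, 0.85, 0.95]
stopper, stopped_at = run_with_early_stopping(losses, patience=3)
print(stopper.best_step, stopper.best_loss, stopped_at)  # 2 0.75 5
```

If you train with the transformers `Trainer`, the built-in `EarlyStoppingCallback` (with its `early_stopping_patience` argument) plays the same role, and weight decay is set through the `weight_decay` field of `TrainingArguments`. Check the documentation for your installed version for the exact evaluation and checkpointing settings it requires.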