Fine Tuning with QLora Overfitting

#26

by RicoRausch - opened Apr 21

Discussion

RicoRausch

Apr 21

edited Apr 22

Hello,
I am fine-tuning a model with QLoRA, but I'm running into overfitting. Increasing the dropout hasn't significantly improved performance on the test set. I would appreciate any advice on how to mitigate overfitting. The project is an OCR task aimed at extracting specific fields, and the model struggles in particular with extracting addresses. Thank you for your help.

VictorSanh

Apr 22

Hi @RicoRausch

That sounds like a general question (i.e. not specific to idefics2 itself) that would be more suitable for the discussion forum (https://discuss.huggingface.co/). I could find a few discussions on overfitting there.

Generally speaking, and without knowing too much about your problem, a few things you can explore: a larger weight decay, fine-tuning fewer parameters, and early stopping.
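To make the last suggestion concrete, here is a minimal sketch of stopping training once the validation loss stops improving. It is plain Python and not tied to idefics2 or to any particular trainer; the names `EarlyStopper`, `run_with_early_stopping`, `patience` and `min_delta` are hypothetical, and the loss values are made up for illustration only.

```python
class EarlyStopper:
    """Track validation loss and signal when it has stopped improving."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience    # evaluations to wait without improvement
        self.min_delta = min_delta  # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.best_step = None
        self.bad_evals = 0

    def update(self, step, val_loss):
        """Record one evaluation; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # New best: remember it and reset the counter.
            self.best_loss = val_loss
            self.best_step = step
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


def run_with_early_stopping(val_losses, patience=3):
    """Replay a sequence of validation losses.

    Returns (best_step, stopped_at); stopped_at is None if the
    sequence ended before the patience budget was used up.
    """
    stopper = EarlyStopper(patience=patience)
    for step, loss in enumerate(val_losses):
        if stopper.update(step, loss):
            return stopper.best_step, step
    return stopper.best_step, None


# Illustrative (made-up) validation losses: improvement stalls after step 2.
losses = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 0.6]
best_step, stopped_at = run_with_early_stopping(losses, patience=3)
print(best_step, stopped_at)
```

In a real run you would call `update` after each evaluation on a held-out set and restore the checkpoint saved at `best_step`. If you train with the Hugging Face `Trainer`, the `weight_decay` training argument and the `EarlyStoppingCallback` cover the first and last suggestions, and a smaller LoRA rank or fewer target modules reduces the number of trained parameters; the right values depend on your data.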

VictorSanh changed discussion status to closed Apr 22
