What is a good initial lr?

by butyuhao - opened Jan 17, 2023

Jan 17, 2023

What learning rate should I start with when fine-tuning bart-large?

Fudan NLP org Jan 17, 2023

A good starting point is 1e-5 or 2e-5, with lr warm-up and decay. In the paper, we grid-searched the lr over [5e-6, 1e-5, 2e-5, 5e-5].
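
For anyone who wants to see what "warm-up and decay" could look like in practice, here is a minimal sketch in plain Python. The reply does not say which schedule shape or warm-up length was used, so the linear warm-up, the linear decay to zero, and the step counts below are all assumptions for illustration; `lr_at_step` is a hypothetical helper, not something from the paper or the model repo.

```python
# Candidate peak learning rates quoted in the reply above.
LR_GRID = [5e-6, 1e-5, 2e-5, 5e-5]


def lr_at_step(step, peak_lr, total_steps, warmup_steps):
    """Hypothetical helper: linear warm-up from 0 to peak_lr,
    then linear decay from peak_lr back to 0 at total_steps.
    The schedule shape is an assumption, not taken from the paper."""
    if step < warmup_steps:
        # Warm-up phase: scale the peak lr by the fraction of warm-up done.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale the peak lr by the fraction of steps remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = 1000   # assumed training length, for illustration only
    warmup_steps = 100   # assumed 10% warm-up, for illustration only
    for peak_lr in LR_GRID:
        samples = [
            lr_at_step(s, peak_lr, total_steps, warmup_steps)
            for s in (0, 50, 100, 550, 1000)
        ]
        print(peak_lr, samples)
```

If you train with PyTorch and the `transformers` library, `get_linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps)` gives a schedule of the same shape, so you would only need to loop over the peak values in the grid and compare validation scores.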
