metadata

license: apache-2.0
library_name: peft
tags:
  - generated_from_trainer
base_model: google/flan-t5-base
model-index:
  - name: flan-t5-base-AR-LORA-V1
    results: []

flan-t5-base-AR-LORA-V1

This model is a PEFT adapter fine-tuned from google/flan-t5-base; the training dataset was not recorded. It achieves the following results on the evaluation set (these match the epoch 24 row, step 15000, of the training results):

Loss: 0.7887
Exact Match: 28.3
Gen Len: 3.592
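
The card does not say how Exact Match is computed. As a minimal sketch, assuming it is the percentage of generated strings that equal their reference after whitespace is stripped, the metric could look like the following. The function name `exact_match_percent` is hypothetical and not taken from the training script.

```python
def exact_match_percent(predictions, references):
    """Percentage of predictions equal to their reference after stripping whitespace.

    Assumption: this mirrors the metric reported in the card; the actual
    training script may normalize text differently.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Two of the four toy predictions match their references.
    preds = ["yes", "no ", "maybe", "42"]
    refs = ["yes", "no", "never", "41"]
    print(exact_match_percent(preds, refs))
```

Under this definition a score of 28.3 would mean that about 28 in every 100 evaluation outputs matched their reference exactly.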

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
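
To make the schedule concrete, here is a small sketch of how a linear scheduler decays the learning rate over this run. It assumes zero warmup steps (the card lists none) and takes the total of 18750 steps from the final row of the training results. The implied training-set size further assumes a single device and no gradient accumulation, neither of which the card confirms.

```python
BASE_LR = 5e-05          # learning_rate from the card
TRAIN_BATCH_SIZE = 8     # train_batch_size from the card
NUM_EPOCHS = 30          # num_epochs from the card
STEPS_PER_EPOCH = 625    # step count at epoch 1.0 in the training results
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18750, the final step in the table
WARMUP_STEPS = 0         # assumption: no warmup is listed in the card

# Assumption: single device, no gradient accumulation.
IMPLIED_TRAIN_EXAMPLES = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("implied examples per epoch:", IMPLIED_TRAIN_EXAMPLES)
    for epoch in (0, 10, 20, 30):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate is halved by epoch 15 and reaches zero at the last step.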

Training results

Training Loss	Epoch	Step	Validation Loss	Exact Match	Gen Len
1.1717	1.0	625	0.9465	18.9	3.82
0.8167	2.0	1250	0.8975	17.9	3.923
0.9046	3.0	1875	0.8691	25.4	3.338
0.9501	4.0	2500	0.8624	17.8	3.978
0.884	5.0	3125	0.8469	19.9	3.917
0.8418	6.0	3750	0.8356	24.8	3.596
0.877	7.0	4375	0.8261	19.0	3.926
0.804	8.0	5000	0.8147	23.0	3.732
0.8267	9.0	5625	0.8123	26.0	3.629
0.8979	10.0	6250	0.8132	24.5	3.685
0.8165	11.0	6875	0.8084	28.4	3.517
0.891	12.0	7500	0.8034	28.1	3.548
0.768	13.0	8125	0.8095	29.1	3.45
0.6895	14.0	8750	0.8018	27.7	3.553
0.7796	15.0	9375	0.7996	30.1	3.49
0.787	16.0	10000	0.8013	26.0	3.665
0.811	17.0	10625	0.7979	28.5	3.563
0.7858	18.0	11250	0.7991	26.4	3.64
0.8608	19.0	11875	0.7955	24.8	3.733
0.9044	20.0	12500	0.7913	25.9	3.662
0.9171	21.0	13125	0.7905	25.9	3.708
0.8093	22.0	13750	0.7918	28.1	3.596
0.7653	23.0	14375	0.7940	28.3	3.586
0.9361	24.0	15000	0.7887	28.3	3.592
0.6999	25.0	15625	0.7921	29.6	3.552
0.728	26.0	16250	0.7918	27.8	3.621
0.7169	27.0	16875	0.7908	27.2	3.628
0.6388	28.0	17500	0.7920	28.9	3.572
0.7302	29.0	18125	0.7920	28.8	3.573
0.7651	30.0	18750	0.7917	28.0	3.599
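
The headline results correspond to the epoch with the lowest validation loss, which is not the epoch with the highest Exact Match. The sketch below copies the epoch, validation loss, and Exact Match columns from the table above and picks the best row under each criterion. Whether checkpoint selection was actually based on validation loss is an assumption.

```python
# (epoch, validation_loss, exact_match) copied from the training results table
ROWS = [
    (1, 0.9465, 18.9), (2, 0.8975, 17.9), (3, 0.8691, 25.4),
    (4, 0.8624, 17.8), (5, 0.8469, 19.9), (6, 0.8356, 24.8),
    (7, 0.8261, 19.0), (8, 0.8147, 23.0), (9, 0.8123, 26.0),
    (10, 0.8132, 24.5), (11, 0.8084, 28.4), (12, 0.8034, 28.1),
    (13, 0.8095, 29.1), (14, 0.8018, 27.7), (15, 0.7996, 30.1),
    (16, 0.8013, 26.0), (17, 0.7979, 28.5), (18, 0.7991, 26.4),
    (19, 0.7955, 24.8), (20, 0.7913, 25.9), (21, 0.7905, 25.9),
    (22, 0.7918, 28.1), (23, 0.7940, 28.3), (24, 0.7887, 28.3),
    (25, 0.7921, 29.6), (26, 0.7918, 27.8), (27, 0.7908, 27.2),
    (28, 0.7920, 28.9), (29, 0.7920, 28.8), (30, 0.7917, 28.0),
]

# Lowest validation loss and highest Exact Match, each as a full row.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_em = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)  # (24, 0.7887, 28.3)
    print("highest exact match:", best_by_em)       # (15, 0.7996, 30.1)
```

Validation loss changes little after epoch 20, while Exact Match swings by several points between neighboring epochs, so selecting on Exact Match would favor epoch 15 instead.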

Framework versions

PEFT 0.11.1
Transformers 4.41.2
PyTorch 2.2.1
Datasets 2.19.1
Tokenizers 0.19.1