metadata

license: apache-2.0
base_model: google-t5/t5-small
tags:
  - generated_from_trainer
model-index:
  - name: t5-small-samsum
    results: []
datasets:
  - samsum
pipeline_tag: summarization

t5-small-samsum

This model is a fine-tuned version of google-t5/t5-small on the samsum dataset. It achieves the following results on the evaluation set:

Loss: 1.6507
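
A minimal inference sketch, since the card sets pipeline_tag to summarization. The repository id below is a hypothetical placeholder, and the "summarize: " prefix is the usual T5 convention; whether this fine-tune was trained with that prefix is an assumption, as are the helper names build_input and summarize.

```python
# Hypothetical repository id: replace with the namespace that hosts this model.
MODEL_ID = "your-namespace/t5-small-samsum"

# Conventional T5 task prefix. Assumption: the fine-tuning data used it too.
TASK_PREFIX = "summarize: "


def build_input(dialogue: str) -> str:
    """Drop blank lines, trim each turn, and prepend the task prefix."""
    turns = [line.strip() for line in dialogue.strip().splitlines() if line.strip()]
    return TASK_PREFIX + "\n".join(turns)


def summarize(dialogue: str, max_new_tokens: int = 60) -> str:
    """Generate a summary for one dialogue with the fine-tuned checkpoint."""
    # Imported lazily so build_input stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(
        build_input(dialogue),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling summarize("Amanda: I baked cookies.\nJerry: Save me one!") downloads the checkpoint on first use. The model and tokenizer are loaded directly rather than through the pipeline helper so that the prefix is applied exactly once.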

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 64
mixed_precision_training: Native AMP
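
The relationship between these values can be sketched in a few lines. The snippet assumes a single device (consistent with 8 x 4 = 32) and the common linear-warmup, linear-decay form of a linear schedule; the total step count of 29440 is taken from the last row of the training log. The function names are illustrative, not part of any library.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500
TOTAL_STEPS = 29440  # final optimizer step reported in the training log


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(
    step: int,
    base_lr: float = LEARNING_RATE,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = TOTAL_STEPS,
) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 250, 500, 14970, 29440):
        print(step, linear_lr(step))
```

Under these assumptions the rate climbs from zero to 2e-05 over the first 500 updates and then falls linearly, reaching zero at the final step.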

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	460	1.9598
2.4944	2.0	921	1.8661
2.0902	3.0	1381	1.8210
2.0173	4.0	1842	1.8009
1.9623	5.0	2302	1.7787
1.9331	6.0	2763	1.7637
1.903	7.0	3223	1.7514
1.881	8.0	3684	1.7390
1.8648	9.0	4144	1.7350
1.8463	10.0	4605	1.7242
1.8302	11.0	5065	1.7189
1.8119	12.0	5526	1.7098
1.8119	13.0	5986	1.7076
1.8007	14.0	6447	1.7057
1.7903	15.0	6907	1.6984
1.778	16.0	7368	1.6944
1.7639	17.0	7828	1.6907
1.7596	18.0	8289	1.6896
1.746	19.0	8749	1.6861
1.7342	20.0	9210	1.6860
1.732	21.0	9670	1.6808
1.719	22.0	10131	1.6760
1.7152	23.0	10591	1.6778
1.7082	24.0	11052	1.6762
1.7003	25.0	11512	1.6707
1.7003	26.0	11973	1.6722
1.6952	27.0	12433	1.6701
1.6848	28.0	12894	1.6671
1.6814	29.0	13354	1.6668
1.6743	30.0	13815	1.6637
1.6742	31.0	14275	1.6640
1.6652	32.0	14736	1.6624
1.6582	33.0	15196	1.6606
1.6575	34.0	15657	1.6605
1.6499	35.0	16117	1.6617
1.6455	36.0	16578	1.6601
1.6506	37.0	17038	1.6594
1.6506	38.0	17499	1.6556
1.637	39.0	17959	1.6570
1.6374	40.0	18420	1.6558
1.6303	41.0	18880	1.6557
1.6311	42.0	19341	1.6553
1.6234	43.0	19801	1.6570
1.619	44.0	20262	1.6537
1.6214	45.0	20722	1.6529
1.6183	46.0	21183	1.6542
1.609	47.0	21643	1.6543
1.6159	48.0	22104	1.6530
1.6101	49.0	22564	1.6524
1.6083	50.0	23025	1.6515
1.6083	51.0	23485	1.6528
1.605	52.0	23946	1.6526
1.6011	53.0	24406	1.6515
1.6028	54.0	24867	1.6517
1.6015	55.0	25327	1.6512
1.601	56.0	25788	1.6504
1.6007	57.0	26248	1.6513
1.5948	58.0	26709	1.6511
1.5973	59.0	27169	1.6515
1.5929	60.0	27630	1.6514
1.5955	61.0	28090	1.6507
1.5931	62.0	28551	1.6507
1.5939	63.0	29011	1.6507
1.5939	63.93	29440	1.6507
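
For reference, a small sketch of how the log above can be parsed to locate the lowest validation loss. It embeds only the first row and the last ten rows of the table as tab-separated text; parse_log and best_row are hypothetical helper names.

```python
import csv
import io

# Excerpt of the training log above (tab-separated): first row plus the tail.
LOG_EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\n"
    "No log\t1.0\t460\t1.9598\n"
    "1.6015\t55.0\t25327\t1.6512\n"
    "1.601\t56.0\t25788\t1.6504\n"
    "1.6007\t57.0\t26248\t1.6513\n"
    "1.5948\t58.0\t26709\t1.6511\n"
    "1.5973\t59.0\t27169\t1.6515\n"
    "1.5929\t60.0\t27630\t1.6514\n"
    "1.5955\t61.0\t28090\t1.6507\n"
    "1.5931\t62.0\t28551\t1.6507\n"
    "1.5939\t63.0\t29011\t1.6507\n"
    "1.5939\t63.93\t29440\t1.6507\n"
)


def parse_log(text: str) -> list[dict]:
    """Turn the tab-separated log into typed records."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        train = rec["Training Loss"]
        rows.append(
            {
                # "No log" appears before the first logging step is reached.
                "train_loss": None if train == "No log" else float(train),
                "epoch": float(rec["Epoch"]),
                "step": int(rec["Step"]),
                "val_loss": float(rec["Validation Loss"]),
            }
        )
    return rows


def best_row(rows: list[dict]) -> dict:
    """Return the record with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


if __name__ == "__main__":
    best = best_row(parse_log(LOG_EXCERPT))
    print(best["epoch"], best["step"], best["val_loss"])
```

In the table, the lowest validation loss (1.6504) occurs at epoch 56, while the reported evaluation loss of 1.6507 matches the final row. The card does not say whether the best checkpoint was reloaded at the end of training.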

Framework versions

Transformers 4.39.1
PyTorch 2.2.1
Datasets 2.18.0
Tokenizers 0.15.2