
t5-large-finetuned

This model is a fine-tuned version of google-t5/t5-large. The training dataset is not recorded in this card. After the final (eighth) epoch it achieves the following results on the evaluation set:

  • Loss: 1.6085
  • Rouge1: 25.8315
  • Rouge2: 11.4547
  • Rougel: 22.5227
  • Rougelsum: 22.7341

Model description

More information needed

Intended uses & limitations

More information needed
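
Since the card does not document usage, the following is a minimal sketch of loading the checkpoint with the Transformers library. It assumes the model is published as czartur/t5-large-dc (the repository named in the model tree section) and that it was trained for a summarization-style task, which the ROUGE metrics suggest but the card does not confirm. The `summarize: ` prefix, the helper names `build_input` and `generate`, and the generation settings are illustrative assumptions, not documented behavior.

```python
MODEL_ID = "czartur/t5-large-dc"  # repository id as listed in the model tree section
TASK_PREFIX = "summarize: "       # assumption: the standard T5 summarization prefix


def build_input(text: str, prefix: str = TASK_PREFIX) -> str:
    """Prepend the (assumed) task prefix and collapse runs of whitespace."""
    return prefix + " ".join(text.split())


def generate(text: str, max_new_tokens: int = 64) -> str:
    """Download the checkpoint and generate an output for one input text."""
    # Imported lazily so build_input() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs to 512 tokens (an assumed limit, not from the card).
    inputs = tokenizer(
        build_input(text), return_tensors="pt", truncation=True, max_length=512
    )
    # Beam search with 4 beams is an illustrative choice.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("some long input text")` downloads the F32 weights on first use. If the model was trained without a task prefix, pass the raw text to the tokenizer instead of `build_input(text)`.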

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
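
To make the `linear` scheduler concrete, here is a rough sketch of the learning rate it would produce over the run. It assumes no warmup steps (the card lists none) and uses the 5351 optimizer steps per epoch reported in the training results. The function name `linear_lr` is hypothetical.

```python
BASE_LR = 5.6e-05        # learning_rate from the card
NUM_EPOCHS = 8           # num_epochs from the card
STEPS_PER_EPOCH = 5351   # step count after epoch 1 in the training results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 42808, matching the final step


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch, assuming warmup_steps=0.
for epoch in range(NUM_EPOCHS + 1):
    print(epoch, f"{linear_lr(epoch * STEPS_PER_EPOCH):.2e}")
```

Under these assumptions the rate falls by one eighth of its starting value per epoch and reaches zero at step 42808.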

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|
| 1.7803        | 1.0   | 5351  | 1.6070          | 25.1375 | 10.9135 | 21.8817 | 22.0576   |
| 1.4798        | 2.0   | 10702 | 1.4737          | 25.4328 | 11.0728 | 21.8859 | 22.0964   |
| 1.2923        | 3.0   | 16053 | 1.4838          | 25.6553 | 11.3169 | 22.1861 | 22.3694   |
| 1.1509        | 4.0   | 21404 | 1.4842          | 25.7181 | 11.4215 | 22.271  | 22.4394   |
| 1.0404        | 5.0   | 26755 | 1.5121          | 26.0812 | 11.8877 | 22.7516 | 22.941    |
| 0.9533        | 6.0   | 32106 | 1.5602          | 25.5218 | 11.486  | 22.2236 | 22.4401   |
| 0.888         | 7.0   | 37457 | 1.5832          | 25.8289 | 11.5647 | 22.5507 | 22.7091   |
| 0.8424        | 8.0   | 42808 | 1.6085          | 25.8315 | 11.4547 | 22.5227 | 22.7341   |
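
The headline metrics come from the last epoch, which is not the strongest row in the results. As an illustration, the sketch below copies the reported values and picks out the epochs with the lowest validation loss and the highest Rouge1. The `EpochResult` type and variable names are hypothetical.

```python
from collections import namedtuple

EpochResult = namedtuple(
    "EpochResult",
    "epoch train_loss val_loss rouge1 rouge2 rougel rougelsum",
)

# Values copied from the training results reported in this card.
RESULTS = [
    EpochResult(1, 1.7803, 1.6070, 25.1375, 10.9135, 21.8817, 22.0576),
    EpochResult(2, 1.4798, 1.4737, 25.4328, 11.0728, 21.8859, 22.0964),
    EpochResult(3, 1.2923, 1.4838, 25.6553, 11.3169, 22.1861, 22.3694),
    EpochResult(4, 1.1509, 1.4842, 25.7181, 11.4215, 22.2710, 22.4394),
    EpochResult(5, 1.0404, 1.5121, 26.0812, 11.8877, 22.7516, 22.9410),
    EpochResult(6, 0.9533, 1.5602, 25.5218, 11.4860, 22.2236, 22.4401),
    EpochResult(7, 0.8880, 1.5832, 25.8289, 11.5647, 22.5507, 22.7091),
    EpochResult(8, 0.8424, 1.6085, 25.8315, 11.4547, 22.5227, 22.7341),
]

best_by_loss = min(RESULTS, key=lambda r: r.val_loss)
best_by_rouge1 = max(RESULTS, key=lambda r: r.rouge1)
final = RESULTS[-1]

print("lowest validation loss: epoch", best_by_loss.epoch, best_by_loss.val_loss)
print("highest Rouge1:         epoch", best_by_rouge1.epoch, best_by_rouge1.rouge1)
print("final-epoch gap (val - train):", round(final.val_loss - final.train_loss, 4))
```

Validation loss bottoms out at epoch 2 and all four ROUGE scores peak at epoch 5, while training loss keeps falling through epoch 8. That pattern is consistent with overfitting in the later epochs, so an earlier checkpoint may serve better if one was saved. The card does not say which checkpoints were kept.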

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • Parameters: 738M
  • Tensor type: F32
  • Format: Safetensors

Model tree for czartur/t5-large-dc

  • Base model: google-t5/t5-large
  • This model: fine-tuned from the base model