metadata
license: mit
base_model: facebook/mbart-large-50
tags:
  - summarization
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mbart-large-50-finetuned-model-hu_1121
    results: []

mbart-large-50-finetuned-model-hu_1121

This model is a fine-tuned version of facebook/mbart-large-50 for summarization; the training dataset is not specified. At the end of training (epoch 8, step 170824) it achieves the following results on the evaluation set:

  • Loss: 2.6534
  • Rouge1: 35.6227
  • Rouge2: 13.0189
  • Rougel: 22.0402
  • Rougelsum: 26.9175

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
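As a rough sketch, the hyperparameters listed above could be passed to the Transformers `Seq2SeqTrainingArguments` class as shown below. This is not the original training script. It assumes a single device (so that `train_batch_size` maps to `per_device_train_batch_size`), the `output_dir` value is a hypothetical choice, and `predict_with_generate=True` is an assumption made because ROUGE is computed on generated summaries.

```python
# Hyperparameters copied from the list above, keyed by the corresponding
# Seq2SeqTrainingArguments field names.
HPARAMS = {
    "learning_rate": 5.6e-05,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}


def build_training_args(output_dir="mbart-large-50-finetuned-model-hu_1121"):
    """Build training arguments from HPARAMS.

    output_dir is a hypothetical default, and predict_with_generate is an
    assumption; neither is stated in the card.
    """
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,
        **HPARAMS,
    )
```

The resulting object would then be handed to a `Seq2SeqTrainer` together with the model, tokenizer, and datasets, none of which are documented in this card.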

Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|--------|-----------------|---------|---------|---------|-----------|
| 2.9553        | 1.0   | 21353  | 2.5450          | 33.3195 | 12.2415 | 21.2029 | 25.3382   |
| 2.2811        | 2.0   | 42706  | 2.3570          | 33.6149 | 11.975  | 20.9943 | 25.726    |
| 1.9886        | 3.0   | 64059  | 2.3144          | 34.6221 | 12.2867 | 21.7798 | 26.1901   |
| 1.7463        | 4.0   | 85412  | 2.3198          | 35.2114 | 12.9183 | 22.215  | 27.1176   |
| 1.5245        | 5.0   | 106765 | 2.3774          | 35.1147 | 13.1621 | 22.3167 | 26.9264   |
| 1.3222        | 6.0   | 128118 | 2.4642          | 35.5719 | 13.1532 | 22.0023 | 26.8084   |
| 1.1456        | 7.0   | 149471 | 2.5673          | 35.9156 | 13.2115 | 22.2552 | 27.2581   |
| 1.0087        | 8.0   | 170824 | 2.6534          | 35.6227 | 13.0189 | 22.0402 | 26.9175   |
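The headline metrics come from the last epoch, which is not the best epoch on every measure. The short sketch below uses only the numbers in the training results table to find the epoch with the lowest validation loss and the epoch with the highest Rouge1. The tuple layout and variable names are illustrative choices, not part of the original training code.

```python
# Rows copied from the training results table:
# (epoch, step, train_loss, val_loss, rouge1, rouge2, rougel, rougelsum)
ROWS = [
    (1, 21353, 2.9553, 2.5450, 33.3195, 12.2415, 21.2029, 25.3382),
    (2, 42706, 2.2811, 2.3570, 33.6149, 11.975, 20.9943, 25.726),
    (3, 64059, 1.9886, 2.3144, 34.6221, 12.2867, 21.7798, 26.1901),
    (4, 85412, 1.7463, 2.3198, 35.2114, 12.9183, 22.215, 27.1176),
    (5, 106765, 1.5245, 2.3774, 35.1147, 13.1621, 22.3167, 26.9264),
    (6, 128118, 1.3222, 2.4642, 35.5719, 13.1532, 22.0023, 26.8084),
    (7, 149471, 1.1456, 2.5673, 35.9156, 13.2115, 22.2552, 27.2581),
    (8, 170824, 1.0087, 2.6534, 35.6227, 13.0189, 22.0402, 26.9175),
]

# Epoch with the lowest validation loss (column index 3).
best_by_loss = min(ROWS, key=lambda row: row[3])

# Epoch with the highest Rouge1 (column index 4).
best_by_rouge1 = max(ROWS, key=lambda row: row[4])

print("lowest validation loss:", best_by_loss[3], "at epoch", best_by_loss[0])
print("highest Rouge1:", best_by_rouge1[4], "at epoch", best_by_rouge1[0])
# lowest validation loss: 2.3144 at epoch 3
# highest Rouge1: 35.9156 at epoch 7
```

Validation loss bottoms out at epoch 3 and rises afterwards while training loss keeps falling, which suggests overfitting with respect to the loss, yet Rouge1 continues to improve until epoch 7. Which checkpoint is preferable depends on whether loss or ROUGE is the selection criterion; the card does not say which was used.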

Framework versions

  • Transformers 4.35.1
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1