license: mit
base_model: unicamp-dl/ptt5-base-portuguese-vocab
tags:
  - generated_from_trainer
datasets:
  - xlsum
metrics:
  - rouge
model-index:
  - name: ptt5-xlsumm-30epochs
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: xlsum
          type: xlsum
          config: portuguese
          split: test
          args: portuguese
        metrics:
          - name: Rouge1
            type: rouge
            value: 0.3246

ptt5-xlsumm-30epochs

This model is a fine-tuned version of unicamp-dl/ptt5-base-portuguese-vocab on the Portuguese configuration of the xlsum dataset. After 30 epochs of training, it achieves the following results on the evaluation set:

  • Loss: 2.1787
  • Rouge1: 0.3246
  • Rouge2: 0.1471
  • Rougel: 0.2617
  • Rougelsum: 0.2641
  • Gen Len: 18.7065
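
As a rough illustration of how a checkpoint like this one is typically loaded for summarization, here is a minimal sketch using the Transformers seq2seq classes. Several details are assumptions and not stated in this card: the repository id `arthurmluz/ptt5-xlsumm-30epochs` (uploader name plus the model name above), the 512-token input limit, the absence of a task prefix, and the generation settings (`num_beams`, `max_new_tokens`).

```python
# Assumed Hub repository id (uploader + model name); verify before use.
MODEL_ID = "arthurmluz/ptt5-xlsumm-30epochs"


def summarize(text: str, max_new_tokens: int = 32) -> str:
    """Generate a short Portuguese summary of `text` (hypothetical helper)."""
    # Imported lazily so this module can be read and imported
    # even where transformers/torch are not installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: inputs are truncated to 512 tokens and no task prefix is added.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

    # Assumption: beam search with 4 beams; the card does not list generation settings.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the checkpoint on first use. The reported Gen Len of about 18.7 tokens suggests summaries were short during evaluation, so a small `max_new_tokens` is a reasonable starting point.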

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
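
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under a linear decay from 2e-05 to zero. The total of 861,030 steps (30 epochs of 28,701 steps) is taken from the training results table; the zero warmup is an assumption, since the card lists no warmup setting. The function `linear_lr` is a hypothetical helper written for illustration, not code from the training run.

```python
BASE_LR = 2e-05          # learning_rate from the card
STEPS_PER_EPOCH = 28_701 # from the training results table
TOTAL_STEPS = 861_030    # 30 epochs x 28,701 steps
WARMUP_STEPS = 0         # assumption: no warmup is listed in the card


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates with linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the learning rate at the start of training and after 10, 20, 30 epochs.
for epoch in (0, 10, 20, 30):
    print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.3e}")
```

With train_batch_size 2, and assuming a single device without gradient accumulation, 28,701 steps per epoch correspond to 57,402 training examples per epoch.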

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.5549 | 1.0 | 28701 | 2.2631 | 0.3049 | 0.1232 | 0.2416 | 0.2436 | 18.8318 |
| 2.4878 | 2.0 | 57402 | 2.2000 | 0.3134 | 0.1322 | 0.2497 | 0.2519 | 18.7965 |
| 2.3021 | 3.0 | 86103 | 2.1654 | 0.3161 | 0.1354 | 0.2528 | 0.255 | 18.732 |
| 2.2808 | 4.0 | 114804 | 2.1423 | 0.3179 | 0.1376 | 0.2544 | 0.2565 | 18.7217 |
| 2.2731 | 5.0 | 143505 | 2.1291 | 0.3202 | 0.1405 | 0.2567 | 0.259 | 18.7037 |
| 2.1654 | 6.0 | 172206 | 2.1209 | 0.3209 | 0.1417 | 0.2577 | 0.2598 | 18.6956 |
| 2.1716 | 7.0 | 200907 | 2.1173 | 0.3213 | 0.1423 | 0.2584 | 0.2606 | 18.7256 |
| 2.0696 | 8.0 | 229608 | 2.1136 | 0.3234 | 0.1441 | 0.2603 | 0.2627 | 18.7352 |
| 2.0492 | 9.0 | 258309 | 2.1123 | 0.3214 | 0.1425 | 0.2589 | 0.261 | 18.6357 |
| 2.0953 | 10.0 | 287010 | 2.1136 | 0.3244 | 0.146 | 0.2611 | 0.2634 | 18.7001 |
| 2.0358 | 11.0 | 315711 | 2.1180 | 0.3248 | 0.1466 | 0.2617 | 0.2639 | 18.6868 |
| 1.9475 | 12.0 | 344412 | 2.1191 | 0.3243 | 0.1463 | 0.2614 | 0.2637 | 18.6707 |
| 2.0194 | 13.0 | 373113 | 2.1181 | 0.3253 | 0.1466 | 0.2616 | 0.264 | 18.6939 |
| 1.925 | 14.0 | 401814 | 2.1236 | 0.3232 | 0.1454 | 0.2604 | 0.2629 | 18.6843 |
| 1.9194 | 15.0 | 430515 | 2.1294 | 0.3239 | 0.1464 | 0.2612 | 0.2636 | 18.6792 |
| 1.9163 | 16.0 | 459216 | 2.1301 | 0.3248 | 0.1464 | 0.261 | 0.2635 | 18.701 |
| 1.8482 | 17.0 | 487917 | 2.1366 | 0.325 | 0.1473 | 0.2619 | 0.2644 | 18.6786 |
| 1.8637 | 18.0 | 516618 | 2.1387 | 0.3263 | 0.1483 | 0.2624 | 0.2648 | 18.6811 |
| 1.8496 | 19.0 | 545319 | 2.1425 | 0.3244 | 0.1461 | 0.2613 | 0.2637 | 18.6934 |
| 1.8565 | 20.0 | 574020 | 2.1513 | 0.3257 | 0.1479 | 0.2626 | 0.2649 | 18.702 |
| 1.7683 | 21.0 | 602721 | 2.1559 | 0.3261 | 0.1482 | 0.2622 | 0.2646 | 18.718 |
| 1.7483 | 22.0 | 631422 | 2.1577 | 0.3254 | 0.1482 | 0.2625 | 0.2649 | 18.6939 |
| 1.7832 | 23.0 | 660123 | 2.1614 | 0.3234 | 0.147 | 0.2616 | 0.264 | 18.7033 |
| 1.8002 | 24.0 | 688824 | 2.1625 | 0.3246 | 0.1477 | 0.2626 | 0.2649 | 18.682 |
| 1.7381 | 25.0 | 717525 | 2.1689 | 0.3253 | 0.1473 | 0.2617 | 0.2641 | 18.7289 |
| 1.7367 | 26.0 | 746226 | 2.1677 | 0.3255 | 0.1475 | 0.2626 | 0.2649 | 18.7015 |
| 1.752 | 27.0 | 774927 | 2.1760 | 0.3255 | 0.1482 | 0.2631 | 0.2654 | 18.7146 |
| 1.7595 | 28.0 | 803628 | 2.1753 | 0.3241 | 0.1468 | 0.2616 | 0.264 | 18.7036 |
| 1.777 | 29.0 | 832329 | 2.1785 | 0.3246 | 0.1474 | 0.2618 | 0.2643 | 18.7089 |
| 1.7142 | 30.0 | 861030 | 2.1787 | 0.3246 | 0.1471 | 0.2617 | 0.2641 | 18.7065 |
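
The table above can be scanned programmatically to see where each metric peaks. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and picks the epoch with the lowest validation loss and the one with the highest Rouge1. It is an illustrative reading of the reported numbers, not part of the original training script; the variable names are hypothetical.

```python
# (epoch, validation_loss, rouge1) copied from the training results table
ROWS = [
    (1, 2.2631, 0.3049), (2, 2.2000, 0.3134), (3, 2.1654, 0.3161),
    (4, 2.1423, 0.3179), (5, 2.1291, 0.3202), (6, 2.1209, 0.3209),
    (7, 2.1173, 0.3213), (8, 2.1136, 0.3234), (9, 2.1123, 0.3214),
    (10, 2.1136, 0.3244), (11, 2.1180, 0.3248), (12, 2.1191, 0.3243),
    (13, 2.1181, 0.3253), (14, 2.1236, 0.3232), (15, 2.1294, 0.3239),
    (16, 2.1301, 0.3248), (17, 2.1366, 0.3250), (18, 2.1387, 0.3263),
    (19, 2.1425, 0.3244), (20, 2.1513, 0.3257), (21, 2.1559, 0.3261),
    (22, 2.1577, 0.3254), (23, 2.1614, 0.3234), (24, 2.1625, 0.3246),
    (25, 2.1689, 0.3253), (26, 2.1677, 0.3255), (27, 2.1760, 0.3255),
    (28, 2.1753, 0.3241), (29, 2.1785, 0.3246), (30, 2.1787, 0.3246),
]

# Epoch with the lowest validation loss and epoch with the highest Rouge1.
best_loss_epoch, best_loss, _ = min(ROWS, key=lambda row: row[1])
best_rouge_epoch, _, best_rouge = max(ROWS, key=lambda row: row[2])

print(f"lowest validation loss: {best_loss} at epoch {best_loss_epoch}")
print(f"highest Rouge1: {best_rouge} at epoch {best_rouge_epoch}")
```

On these numbers, validation loss is lowest at epoch 9 (2.1123) and Rouge1 peaks at epoch 18 (0.3263), while the results reported at the top of the card correspond to the final epoch 30 checkpoint.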

Framework versions

  • Transformers 4.33.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.13.3