---
license: mit
base_model: unicamp-dl/ptt5-base-portuguese-vocab
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: ptt5-cstnews
    results: []
---

# ptt5-cstnews

This model is a fine-tuned version of [unicamp-dl/ptt5-base-portuguese-vocab](https://huggingface.co/unicamp-dl/ptt5-base-portuguese-vocab) on an unspecified dataset (the model name suggests the CSTNews Portuguese summarization corpus).
It achieves the following results on the evaluation set:

- Loss: 2.2824
- Rouge1: 0.1798
- Rouge2: 0.1214
- Rougel: 0.1629
- Rougelsum: 0.1734
- Gen Len: 19.0
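The card does not yet document usage, so here is a minimal inference sketch. The repo id `arthurmluz/ptt5-cstnews` and all generation settings are assumptions, not values recorded in this card:

```python
# Minimal inference sketch; the repo id below is an assumption
# inferred from the card title, not a documented identifier.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "arthurmluz/ptt5-cstnews"  # assumption: adjust to the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

text = "Texto em português a ser resumido..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# The constant Gen Len of 19.0 in the results below likely reflects the
# default generation max_length of 20; raise it for longer summaries.
summary_ids = model.generate(**inputs, max_length=20, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```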

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
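For reference, these settings map onto `Seq2SeqTrainingArguments` roughly as sketched below; the output path and the evaluation/generation flags are assumptions rather than recorded values:

```python
# Hedged reconstruction of the training setup; hyperparameter values are
# taken from the list above, everything else is assumed.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ptt5-cstnews",        # assumption: actual path not documented
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",      # assumption: matches the per-epoch table below
    predict_with_generate=True,       # assumption: needed to compute ROUGE during eval
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
)
```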

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log        | 1.0   | 88   | 3.1376          | 0.1326 | 0.089  | 0.1198 | 0.1274    | 19.0    |
| No log        | 2.0   | 176  | 2.5907          | 0.1782 | 0.1106 | 0.1507 | 0.169     | 19.0    |
| 3.838         | 3.0   | 264  | 2.4840          | 0.1816 | 0.1192 | 0.1607 | 0.1747    | 19.0    |
| 3.838         | 4.0   | 352  | 2.4281          | 0.1784 | 0.117  | 0.1605 | 0.1725    | 19.0    |
| 2.4977        | 5.0   | 440  | 2.3860          | 0.1757 | 0.1138 | 0.158  | 0.1697    | 19.0    |
| 2.4977        | 6.0   | 528  | 2.3611          | 0.18   | 0.121  | 0.1628 | 0.1744    | 19.0    |
| 2.3101        | 7.0   | 616  | 2.3437          | 0.1792 | 0.1203 | 0.1625 | 0.1733    | 19.0    |
| 2.3101        | 8.0   | 704  | 2.3327          | 0.1791 | 0.121  | 0.1625 | 0.1734    | 19.0    |
| 2.3101        | 9.0   | 792  | 2.3165          | 0.1795 | 0.1217 | 0.1625 | 0.1741    | 19.0    |
| 2.1814        | 10.0  | 880  | 2.3109          | 0.1772 | 0.1186 | 0.1598 | 0.1711    | 19.0    |
| 2.1814        | 11.0  | 968  | 2.2978          | 0.1785 | 0.1201 | 0.1611 | 0.1726    | 19.0    |
| 2.1193        | 12.0  | 1056 | 2.2923          | 0.1792 | 0.1204 | 0.1618 | 0.1733    | 19.0    |
| 2.1193        | 13.0  | 1144 | 2.2958          | 0.1789 | 0.1204 | 0.1613 | 0.1729    | 19.0    |
| 2.0126        | 14.0  | 1232 | 2.2870          | 0.1785 | 0.1204 | 0.161  | 0.1725    | 19.0    |
| 2.0126        | 15.0  | 1320 | 2.2872          | 0.1789 | 0.1204 | 0.1613 | 0.1728    | 19.0    |
| 1.9237        | 16.0  | 1408 | 2.2799          | 0.1792 | 0.1204 | 0.1618 | 0.1733    | 19.0    |
| 1.9237        | 17.0  | 1496 | 2.2825          | 0.1787 | 0.1219 | 0.1626 | 0.1732    | 19.0    |
| 1.9237        | 18.0  | 1584 | 2.2788          | 0.1787 | 0.1219 | 0.1626 | 0.1729    | 19.0    |
| 1.9157        | 19.0  | 1672 | 2.2787          | 0.1784 | 0.1215 | 0.1622 | 0.1727    | 19.0    |
| 1.9157        | 20.0  | 1760 | 2.2835          | 0.1776 | 0.1211 | 0.1614 | 0.1721    | 19.0    |
| 1.8614        | 21.0  | 1848 | 2.2785          | 0.1808 | 0.1218 | 0.1636 | 0.1752    | 19.0    |
| 1.8614        | 22.0  | 1936 | 2.2823          | 0.1795 | 0.1214 | 0.1626 | 0.1732    | 19.0    |
| 1.8565        | 23.0  | 2024 | 2.2774          | 0.1798 | 0.1214 | 0.1629 | 0.1734    | 19.0    |
| 1.8565        | 24.0  | 2112 | 2.2797          | 0.1798 | 0.1214 | 0.1629 | 0.1734    | 19.0    |
| 1.8076        | 25.0  | 2200 | 2.2818          | 0.1798 | 0.1214 | 0.1629 | 0.1734    | 19.0    |
| 1.8076        | 26.0  | 2288 | 2.2825          | 0.1795 | 0.1214 | 0.1626 | 0.1732    | 19.0    |
| 1.8076        | 27.0  | 2376 | 2.2825          | 0.1795 | 0.1214 | 0.1626 | 0.1732    | 19.0    |
| 1.7745        | 28.0  | 2464 | 2.2823          | 0.1798 | 0.1214 | 0.1629 | 0.1734    | 19.0    |
| 1.7745        | 29.0  | 2552 | 2.2829          | 0.1795 | 0.1214 | 0.1626 | 0.1732    | 19.0    |
| 1.8083        | 30.0  | 2640 | 2.2824          | 0.1798 | 0.1214 | 0.1629 | 0.1734    | 19.0    |
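The ROUGE columns above are reported as fractions in [0, 1]. For comparison with other checkpoints, a sketch of how such scores are typically computed with the `evaluate` library; the prediction and reference strings here are placeholders, not the actual evaluation data:

```python
# ROUGE scoring as typically done in generated_from_trainer cards;
# the inputs below are placeholders, not the card's evaluation set.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["resumo gerado pelo modelo"]  # placeholder
references = ["resumo de referência"]        # placeholder

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```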

### Framework versions

- Transformers 4.34.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.14.1