Swin-Bert_Mimic

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (the values at the end of training, epoch 20):

  • Loss: 0.1025
  • Rouge1: 35.8104
  • Rouge2: 22.5915
  • Rougel: 34.3056
  • Rougelsum: 35.1416
  • Gen Len: 21.289
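
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-1 F-measure can be computed from unigram overlap, scaled to 0-100 like the scores above. It assumes lowercase whitespace tokenization, which is simpler than what standard ROUGE implementations do, so it will not reproduce the reported numbers exactly. The function name `rouge1_f` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Unigram-overlap F-measure on a 0-100 scale.

    Assumption: tokens are lowercase, whitespace-separated words. Standard
    ROUGE packages apply their own tokenization (and optionally stemming).
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it appears
    # in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical example: 3 of 4 reference words recovered, no extra words.
score = rouge1_f("no acute cardiopulmonary process", "no acute process")
print(round(score, 2))
```

Rouge2 applies the same idea to bigrams, while RougeL and RougeLsum are based on the longest common subsequence rather than fixed n-grams.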

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
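
As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear decay schedule and the approximate number of training examples seen per epoch. It assumes no warmup steps, a single device, and no gradient accumulation, none of which is stated in this card. The value of 7500 steps per epoch is taken from the training results table (150000 steps over 20 epochs). The helper name `linear_lr` is hypothetical.

```python
BASE_LR = 5e-05
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 7500      # from the training results table
TRAIN_BATCH_SIZE = 8
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists none.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumes one device and no gradient accumulation.
examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

print(f"total optimizer steps: {TOTAL_STEPS}")
print(f"approx. examples per epoch: {examples_per_epoch}")
for epoch in (0, 5, 10, 20):
    print(f"lr after epoch {epoch}: {linear_lr(epoch * STEPS_PER_EPOCH):.2e}")
```

Under these assumptions the learning rate falls to half its initial value by the end of epoch 10 and reaches zero at step 150000.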

Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|--------|-----------------|---------|---------|---------|-----------|---------|
| 0.0677        | 1.0   | 7500   | 0.0742          | 34.0952 | 25.4639 | 34.0546 | 34.0407   | 14.412  |
| 0.0621        | 2.0   | 15000  | 0.0686          | 37.767  | 26.9356 | 37.0596 | 37.4647   | 18.921  |
| 0.0595        | 3.0   | 22500  | 0.0670          | 38.07   | 26.9203 | 37.1384 | 37.7633   | 22.422  |
| 0.0536        | 4.0   | 30000  | 0.0655          | 38.064  | 27.0799 | 37.3483 | 37.7981   | 18.476  |
| 0.0484        | 5.0   | 37500  | 0.0655          | 38.8419 | 27.551  | 37.992  | 38.573    | 19.552  |
| 0.0436        | 6.0   | 45000  | 0.0672          | 39.2556 | 27.3445 | 38.1583 | 38.9199   | 19.699  |
| 0.0394        | 7.0   | 52500  | 0.0680          | 38.6881 | 27.1077 | 37.6518 | 38.3678   | 19.322  |
| 0.0355        | 8.0   | 60000  | 0.0697          | 39.2775 | 27.1638 | 38.1169 | 38.786    | 20.125  |
| 0.0318        | 9.0   | 67500  | 0.0719          | 38.8973 | 27.0819 | 37.8138 | 38.4725   | 20.237  |
| 0.0265        | 10.0  | 75000  | 0.0746          | 38.2854 | 26.3015 | 37.0627 | 37.8955   | 20.799  |
| 0.0241        | 11.0  | 82500  | 0.0769          | 37.7814 | 25.9821 | 36.6626 | 37.3682   | 20.437  |
| 0.0204        | 12.0  | 90000  | 0.0810          | 37.7945 | 26.012  | 36.5089 | 37.3188   | 20.945  |
| 0.0172        | 13.0  | 97500  | 0.0846          | 37.5296 | 25.3082 | 36.2752 | 36.9433   | 20.397  |
| 0.0147        | 14.0  | 105000 | 0.0876          | 36.6675 | 24.5001 | 35.264  | 36.034    | 22.044  |
| 0.012         | 15.0  | 112500 | 0.0907          | 35.8928 | 23.4706 | 34.3812 | 35.2234   | 21.344  |
| 0.0103        | 16.0  | 120000 | 0.0947          | 35.6648 | 22.8131 | 34.1013 | 35.0637   | 22.095  |
| 0.0084        | 17.0  | 127500 | 0.0971          | 35.7702 | 22.9984 | 34.2882 | 35.1362   | 21.501  |
| 0.0068        | 18.0  | 135000 | 0.0996          | 35.4212 | 22.3513 | 33.9646 | 34.8255   | 22.152  |
| 0.0058        | 19.0  | 142500 | 0.1019          | 35.9704 | 23.1195 | 34.4672 | 35.3553   | 21.404  |
| 0.0048        | 20.0  | 150000 | 0.1025          | 35.8104 | 22.5915 | 34.3056 | 35.1416   | 21.289  |
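
The per-epoch numbers suggest that validation loss bottoms out around epochs 4 to 5 and rises afterwards while training loss keeps falling, which is consistent with overfitting in the later epochs. The sketch below, using values copied from the results above, finds the epochs with the lowest validation loss and the highest Rouge1 and Rouge2. The variable and function names are hypothetical, and the card does not state whether checkpoints from earlier epochs were kept.

```python
# (epoch, validation_loss, rouge1, rouge2) copied from the training results.
RESULTS = [
    (1, 0.0742, 34.0952, 25.4639), (2, 0.0686, 37.767, 26.9356),
    (3, 0.0670, 38.07, 26.9203), (4, 0.0655, 38.064, 27.0799),
    (5, 0.0655, 38.8419, 27.551), (6, 0.0672, 39.2556, 27.3445),
    (7, 0.0680, 38.6881, 27.1077), (8, 0.0697, 39.2775, 27.1638),
    (9, 0.0719, 38.8973, 27.0819), (10, 0.0746, 38.2854, 26.3015),
    (11, 0.0769, 37.7814, 25.9821), (12, 0.0810, 37.7945, 26.012),
    (13, 0.0846, 37.5296, 25.3082), (14, 0.0876, 36.6675, 24.5001),
    (15, 0.0907, 35.8928, 23.4706), (16, 0.0947, 35.6648, 22.8131),
    (17, 0.0971, 35.7702, 22.9984), (18, 0.0996, 35.4212, 22.3513),
    (19, 0.1019, 35.9704, 23.1195), (20, 0.1025, 35.8104, 22.5915),
]


def best_epoch(results, column, higher_is_better):
    """Return the row that is best on the given column index.

    Ties go to the earliest epoch, because min/max return the first match.
    """
    pick = max if higher_is_better else min
    return pick(results, key=lambda row: row[column])


best_loss = best_epoch(RESULTS, column=1, higher_is_better=False)
best_rouge1 = best_epoch(RESULTS, column=2, higher_is_better=True)
best_rouge2 = best_epoch(RESULTS, column=3, higher_is_better=True)
final = RESULTS[-1]

print(f"lowest validation loss: epoch {best_loss[0]} ({best_loss[1]})")
print(f"highest Rouge1: epoch {best_rouge1[0]} ({best_rouge1[2]})")
print(f"highest Rouge2: epoch {best_rouge2[0]} ({best_rouge2[3]})")
print(f"Rouge1 gap, best vs. final epoch: {best_rouge1[2] - final[2]:.4f}")
```

If intermediate checkpoints are available, selecting one by validation metric rather than using the last epoch may give better summaries than the final reported figures.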

Framework versions

  • Transformers 4.37.1
  • Pytorch 1.13.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.1

Model size

  • 226M params (Safetensors)
  • Tensor type: I64, F32