VaniLara committed on
Commit 8c6c368
1 parent: 5b9f0ce

Model save

README.md CHANGED
@@ -18,10 +18,10 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [vgaraujov/bart-base-spanish](https://huggingface.co/vgaraujov/bart-base-spanish) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.0123
- - Bleu: 81.8551
- - Rouge: {'rouge1': 0.9422515866486456, 'rouge2': 0.8737405002405003, 'rougeL': 0.9363672046907345, 'rougeLsum': 0.9364728490463785}
- - Ter: 8.7118
 
  ## Model description
 
@@ -41,8 +41,8 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 1.5e-05
- - train_batch_size: 12
- - eval_batch_size: 12
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
@@ -53,31 +53,31 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge | Ter |
  |:-------------:|:-----:|:----:|:---------------:|:-------:|:---------------------------------------------------------------------------------------------------------------------------:|:-------:|
- | 0.0324 | 1.0 | 225 | 0.0225 | 72.7749 | {'rouge1': 0.8830902523402526, 'rouge2': 0.7709874297815476, 'rougeL': 0.8694410309905668, 'rougeLsum': 0.869743878469885} | 16.4968 |
- | 0.0187 | 2.0 | 450 | 0.0155 | 81.9344 | {'rouge1': 0.9237037532184591, 'rouge2': 0.8457443112443113, 'rougeL': 0.9173110054433588, 'rougeLsum': 0.9176172563819627} | 10.2873 |
- | 0.0109 | 3.0 | 675 | 0.0142 | 79.3413 | {'rouge1': 0.9307991409462, 'rouge2': 0.8545628630628632, 'rougeL': 0.9241576974224035, 'rougeLsum': 0.9239856347356347} | 10.8434 |
- | 0.0099 | 4.0 | 900 | 0.0123 | 82.4245 | {'rouge1': 0.9380482866806397, 'rouge2': 0.8707410459910461, 'rougeL': 0.9337147808618397, 'rougeLsum': 0.9338772787669846} | 8.9898 |
- | 0.0115 | 5.0 | 1125 | 0.0117 | 80.3604 | {'rouge1': 0.9482994702665755, 'rouge2': 0.8712202111613877, 'rougeL': 0.942820963393332, 'rougeLsum': 0.942393967315794} | 8.6191 |
- | 0.0056 | 6.0 | 1350 | 0.0117 | 83.0304 | {'rouge1': 0.9462984866440752, 'rouge2': 0.8715856458356457, 'rougeL': 0.9409380815263171, 'rougeLsum': 0.9412039578068991} | 8.4337 |
- | 0.0059 | 7.0 | 1575 | 0.0119 | 83.0528 | {'rouge1': 0.9496951399993196, 'rouge2': 0.8794314791961855, 'rougeL': 0.9442460554581298, 'rougeLsum': 0.9445281533052429} | 8.1557 |
- | 0.0055 | 8.0 | 1800 | 0.0119 | 82.0772 | {'rouge1': 0.9461830571389397, 'rouge2': 0.8765663040663043, 'rougeL': 0.9411501096942276, 'rougeLsum': 0.9412948766919356} | 8.9898 |
- | 0.0045 | 9.0 | 2025 | 0.0124 | 81.5960 | {'rouge1': 0.9417869646693177, 'rouge2': 0.8676474946622006, 'rougeL': 0.935177818769924, 'rougeLsum': 0.9352072609193355} | 8.8044 |
- | 0.0046 | 10.0 | 2250 | 0.0123 | 80.3279 | {'rouge1': 0.9406776813371245, 'rouge2': 0.8716705772441067, 'rougeL': 0.9355530020947977, 'rougeLsum': 0.9350984560102209} | 8.9898 |
- | 0.0041 | 11.0 | 2475 | 0.0118 | 81.1506 | {'rouge1': 0.9449248578219166, 'rouge2': 0.8736089204912735, 'rougeL': 0.9394655745870917, 'rougeLsum': 0.9396245046663005} | 8.7118 |
- | 0.004 | 12.0 | 2700 | 0.0121 | 82.4137 | {'rouge1': 0.9479337149778327, 'rouge2': 0.8761151903651905, 'rougeL': 0.9422136735813209, 'rougeLsum': 0.9425275606746195} | 8.1557 |
- | 0.0035 | 13.0 | 2925 | 0.0121 | 81.4307 | {'rouge1': 0.9436979997127055, 'rouge2': 0.8760786305198072, 'rougeL': 0.9377141821886406, 'rougeLsum': 0.9378051091495366} | 8.5264 |
- | 0.0034 | 14.0 | 3150 | 0.0118 | 81.7553 | {'rouge1': 0.9440729475203159, 'rouge2': 0.8736759498671265, 'rougeL': 0.9381740514387574, 'rougeLsum': 0.9378773147217421} | 8.9898 |
- | 0.0037 | 15.0 | 3375 | 0.0121 | 81.6178 | {'rouge1': 0.9456137800108388, 'rouge2': 0.873859483109483, 'rougeL': 0.9394098205715855, 'rougeLsum': 0.9399457127839486} | 8.7118 |
- | 0.0028 | 16.0 | 3600 | 0.0122 | 81.1136 | {'rouge1': 0.9433953611747732, 'rouge2': 0.8706926129426129, 'rougeL': 0.9381087131822429, 'rougeLsum': 0.9380460381122143} | 8.8971 |
- | 0.0029 | 17.0 | 3825 | 0.0123 | 81.8608 | {'rouge1': 0.941960338680927, 'rouge2': 0.8727757427757428, 'rougeL': 0.9359009315847551, 'rougeLsum': 0.936447181903064} | 8.8044 |
- | 0.0029 | 18.0 | 4050 | 0.0123 | 81.6538 | {'rouge1': 0.9453945182784191, 'rouge2': 0.8744925183748715, 'rougeL': 0.9399012401856829, 'rougeLsum': 0.9405087715002577} | 8.6191 |
- | 0.0027 | 19.0 | 4275 | 0.0123 | 81.9450 | {'rouge1': 0.9431345337668868, 'rouge2': 0.8752672512672515, 'rougeL': 0.937374804117451, 'rougeLsum': 0.937653051197169} | 8.6191 |
- | 0.0026 | 20.0 | 4500 | 0.0123 | 81.8551 | {'rouge1': 0.9422515866486456, 'rouge2': 0.8737405002405003, 'rougeL': 0.9363672046907345, 'rougeLsum': 0.9364728490463785} | 8.7118 |
 
 
  ### Framework versions
 
- - Transformers 4.41.2
- - Pytorch 2.3.0+cu121
  - Datasets 2.20.0
  - Tokenizers 0.19.1
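
The removed-side results table can be scanned for the epoch that is best on each metric, since the final epoch is not necessarily the strongest one. The following is a minimal sketch, assuming the rows are copied by hand from the table above; `OLD_RUN` and `best_epochs` are hypothetical names used only for illustration and are not part of the repository. Lower is better for validation loss and TER, higher is better for BLEU.

```python
# Rows copied from the removed-side table: (epoch, validation_loss, bleu, ter).
OLD_RUN = [
    (1, 0.0225, 72.7749, 16.4968), (2, 0.0155, 81.9344, 10.2873),
    (3, 0.0142, 79.3413, 10.8434), (4, 0.0123, 82.4245, 8.9898),
    (5, 0.0117, 80.3604, 8.6191), (6, 0.0117, 83.0304, 8.4337),
    (7, 0.0119, 83.0528, 8.1557), (8, 0.0119, 82.0772, 8.9898),
    (9, 0.0124, 81.5960, 8.8044), (10, 0.0123, 80.3279, 8.9898),
    (11, 0.0118, 81.1506, 8.7118), (12, 0.0121, 82.4137, 8.1557),
    (13, 0.0121, 81.4307, 8.5264), (14, 0.0118, 81.7553, 8.9898),
    (15, 0.0121, 81.6178, 8.7118), (16, 0.0122, 81.1136, 8.8971),
    (17, 0.0123, 81.8608, 8.8044), (18, 0.0123, 81.6538, 8.6191),
    (19, 0.0123, 81.9450, 8.6191), (20, 0.0123, 81.8551, 8.7118),
]


def best_epochs(rows, column, higher_is_better):
    """Return (best_value, [epochs reaching it]) for one metric column."""
    values = [row[column] for row in rows]
    best = max(values) if higher_is_better else min(values)
    return best, [row[0] for row in rows if row[column] == best]


loss_best = best_epochs(OLD_RUN, 1, higher_is_better=False)
bleu_best = best_epochs(OLD_RUN, 2, higher_is_better=True)
ter_best = best_epochs(OLD_RUN, 3, higher_is_better=False)
print("validation loss:", loss_best)
print("BLEU:", bleu_best)
print("TER:", ter_best)
```

Ties are reported as a list of epochs, so a reader can see when several checkpoints are indistinguishable at the table's four-decimal precision.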
 
 
  This model is a fine-tuned version of [vgaraujov/bart-base-spanish](https://huggingface.co/vgaraujov/bart-base-spanish) on an unknown dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 0.0118
+ - Bleu: 81.9815
+ - Rouge: {'rouge1': 0.9443279678697634, 'rouge2': 0.8734148591060358, 'rougeL': 0.9402155941323899, 'rougeLsum': 0.939641014300457}
+ - Ter: 8.2484
 
  ## Model description
 
 
 
  The following hyperparameters were used during training:
  - learning_rate: 1.5e-05
+ - train_batch_size: 32
+ - eval_batch_size: 16
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
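
To make `lr_scheduler_type: linear` concrete, here is a minimal sketch of a linear schedule that decays the learning rate from its starting value to zero. It rests on assumptions not shown in this hunk: zero warmup steps (the warmup setting is outside the visible diff) and 1700 total optimizer steps (the last step listed in the added-side results table). The function name `linear_lr` is hypothetical, and the sketch only illustrates the shape of the schedule rather than reproducing the training run.

```python
def linear_lr(step, base_lr=1.5e-05, total_steps=1700, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 and total_steps=1700 are assumptions for illustration.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 850, 1700):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved at the midpoint of training and reaches zero at the final step.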
 
 
  | Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge | Ter |
  |:-------------:|:-----:|:----:|:---------------:|:-------:|:---------------------------------------------------------------------------------------------------------------------------:|:-------:|
+ | 0.0843 | 1.0 | 85 | 0.0450 | 44.4102 | {'rouge1': 0.7730682242594011, 'rouge2': 0.5854813797313798, 'rougeL': 0.7516482322144085, 'rougeLsum': 0.7515933978875158} | 37.4421 |
+ | 0.0327 | 2.0 | 170 | 0.0211 | 74.7194 | {'rouge1': 0.8898686025017295, 'rouge2': 0.7856047476488655, 'rougeL': 0.8819296092337889, 'rougeLsum': 0.8808977661485402} | 14.7359 |
+ | 0.0208 | 3.0 | 255 | 0.0170 | 78.3049 | {'rouge1': 0.9214616309616313, 'rouge2': 0.8310873663373664, 'rougeL': 0.9134419132066192, 'rougeLsum': 0.912669522198934} | 11.4921 |
+ | 0.0157 | 4.0 | 340 | 0.0143 | 78.9394 | {'rouge1': 0.9265025192672252, 'rouge2': 0.8525591075591076, 'rougeL': 0.920511965159024, 'rougeLsum': 0.9201432630550277} | 11.1214 |
+ | 0.0141 | 5.0 | 425 | 0.0141 | 78.7360 | {'rouge1': 0.9380953893600955, 'rouge2': 0.8617384374884377, 'rougeL': 0.9308421148568208, 'rougeLsum': 0.9300975337298869} | 10.3800 |
+ | 0.0109 | 6.0 | 510 | 0.0135 | 83.9518 | {'rouge1': 0.935395465917525, 'rouge2': 0.8671870754517816, 'rougeL': 0.9302533751365021, 'rougeLsum': 0.92907347000265} | 9.1752 |
+ | 0.0099 | 7.0 | 595 | 0.0126 | 80.5408 | {'rouge1': 0.9415159377659379, 'rouge2': 0.8721845746845747, 'rougeL': 0.9354884592531653, 'rougeLsum': 0.9349897020191137} | 9.3605 |
+ | 0.0102 | 8.0 | 680 | 0.0125 | 82.4160 | {'rouge1': 0.9432206608750726, 'rouge2': 0.8694791042291042, 'rougeL': 0.9373894379041439, 'rougeLsum': 0.9365909324879915} | 8.7118 |
+ | 0.0089 | 9.0 | 765 | 0.0127 | 81.8874 | {'rouge1': 0.9440492903066432, 'rouge2': 0.8746245236245236, 'rougeL': 0.9401187072731193, 'rougeLsum': 0.9394614353511412} | 8.9898 |
+ | 0.0073 | 10.0 | 850 | 0.0120 | 81.7304 | {'rouge1': 0.9486932005902595, 'rouge2': 0.8767943445443445, 'rougeL': 0.94263671377642, 'rougeLsum': 0.9421138461211994} | 8.5264 |
+ | 0.0079 | 11.0 | 935 | 0.0122 | 82.8916 | {'rouge1': 0.9489730179950766, 'rouge2': 0.8816752136752137, 'rougeL': 0.9447908921144217, 'rougeLsum': 0.9441282361429422} | 7.8777 |
+ | 0.0071 | 12.0 | 1020 | 0.0120 | 81.8038 | {'rouge1': 0.9475330723198372, 'rouge2': 0.8747884060384061, 'rougeL': 0.9424402721461548, 'rougeLsum': 0.9419356876783349} | 8.2484 |
+ | 0.0068 | 13.0 | 1105 | 0.0116 | 81.4716 | {'rouge1': 0.9434723177149648, 'rouge2': 0.8697018722018725, 'rougeL': 0.9390087110601817, 'rougeLsum': 0.9380446906035144} | 8.6191 |
+ | 0.0066 | 14.0 | 1190 | 0.0116 | 83.0533 | {'rouge1': 0.9496244270435449, 'rouge2': 0.8807424704924707, 'rougeL': 0.9455004138018845, 'rougeLsum': 0.9448490578049402} | 7.8777 |
+ | 0.0055 | 15.0 | 1275 | 0.0120 | 81.9058 | {'rouge1': 0.9482457927349568, 'rouge2': 0.8757257834757834, 'rougeL': 0.9434175579322643, 'rougeLsum': 0.942753301480082} | 8.1557 |
+ | 0.0054 | 16.0 | 1360 | 0.0121 | 82.1548 | {'rouge1': 0.9455781873355404, 'rouge2': 0.8749565527065528, 'rougeL': 0.9421678762414054, 'rougeLsum': 0.9410527226041933} | 8.5264 |
+ | 0.0058 | 17.0 | 1445 | 0.0119 | 82.2341 | {'rouge1': 0.9475412104235637, 'rouge2': 0.8764932844932849, 'rougeL': 0.9435994822171294, 'rougeLsum': 0.9428020011034717} | 8.3411 |
+ | 0.0053 | 18.0 | 1530 | 0.0117 | 82.1956 | {'rouge1': 0.9448616840675663, 'rouge2': 0.8742621082621084, 'rougeL': 0.9408799656226128, 'rougeLsum': 0.9401594728800612} | 8.1557 |
+ | 0.005 | 19.0 | 1615 | 0.0117 | 81.8240 | {'rouge1': 0.9442392093200918, 'rouge2': 0.8730217745217748, 'rougeL': 0.9401292441218911, 'rougeLsum': 0.9394873525167642} | 8.3411 |
+ | 0.0047 | 20.0 | 1700 | 0.0118 | 81.9815 | {'rouge1': 0.9443279678697634, 'rouge2': 0.8734148591060358, 'rougeL': 0.9402155941323899, 'rougeLsum': 0.939641014300457} | 8.2484 |
 
 
  ### Framework versions
 
+ - Transformers 4.42.4
+ - Pytorch 2.3.1+cu121
  - Datasets 2.20.0
  - Tokenizers 0.19.1
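
The step counts in the two tables are consistent with the batch-size change and can be used to bound the size of the training set, which the README does not state. This is a worked sketch under loudly stated assumptions: `train_batch_size` is the effective batch per optimizer step (single device, no gradient accumulation) and the last partial batch of each epoch is kept. The helper `size_range` is a hypothetical name. The removed side shows 225 steps per epoch at batch size 12; the added side shows 85 steps per epoch at batch size 32.

```python
def size_range(steps_per_epoch, batch_size):
    """All dataset sizes N for which ceil(N / batch_size) == steps_per_epoch."""
    first = (steps_per_epoch - 1) * batch_size + 1
    last = steps_per_epoch * batch_size
    return range(first, last + 1)


old = size_range(225, 12)  # removed side: 225 steps per epoch, batch size 12
new = size_range(85, 32)   # added side: 85 steps per epoch, batch size 32

# Sizes compatible with both runs (assumes both used the same training set).
lo = max(old.start, new.start)
hi = min(old.stop, new.stop) - 1
print(f"training set size between {lo} and {hi} examples")
```

If the assumptions hold, both runs agree on a narrow interval, which suggests the two commits trained on the same data and differ only in batch size and library versions.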
generation_config.json CHANGED
@@ -7,5 +7,5 @@
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
- "transformers_version": "4.41.2"
  }
 
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
+ "transformers_version": "4.42.4"
  }
runs/Jul18_17-55-45_34beeb718974/events.out.tfevents.1721325432.34beeb718974.10921.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3d7e2e77a7e21bca4fc988a83c5c667a4ff4b185247ac61b8ea1d8207aec8e81
- size 30332
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:7466f7cf0258bbcef99fd85605880b716a4f575d23a92db75010b0a47c4f5837
+ size 31052
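
The TensorBoard event file is tracked with Git LFS, so this diff only shows its pointer: a `version` line, an `oid` with the SHA-256 of the real content, and a `size` in bytes. The sketch below parses such a pointer and checks a blob against it. The helper names `parse_pointer` and `matches_pointer` are hypothetical, and because the actual event file is not available here, the check is demonstrated on a stand-in byte string rather than on the real log.

```python
import hashlib


def parse_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """True if the raw bytes have the size and SHA-256 digest recorded in the pointer."""
    return (
        pointer["algo"] == "sha256"
        and len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


new_pointer = parse_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7466f7cf0258bbcef99fd85605880b716a4f575d23a92db75010b0a47c4f5837\n"
    "size 31052\n"
)
growth = new_pointer["size"] - 30332  # 30332 is the size in the removed pointer
print("event file grew by", growth, "bytes")

# Stand-in blob: build a pointer for it, then verify the blob against that pointer.
blob = b"example payload"
demo = parse_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(blob).hexdigest()}\n"
    f"size {len(blob)}\n"
)
print(matches_pointer(blob, demo))
```

Checking both size and digest mirrors what the pointer records; a size mismatch is a cheap early signal before hashing a large file.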