Model save
README.md
CHANGED
@@ -18,10 +18,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [vgaraujov/bart-base-spanish](https://huggingface.co/vgaraujov/bart-base-spanish) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Bleu: 81.
-- Rouge: {'rouge1': 0.
-- Ter: 8.
+- Loss: 0.0118
+- Bleu: 81.9815
+- Rouge: {'rouge1': 0.9443279678697634, 'rouge2': 0.8734148591060358, 'rougeL': 0.9402155941323899, 'rougeLsum': 0.939641014300457}
+- Ter: 8.2484
 
 ## Model description
 
@@ -41,8 +41,8 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 1.5e-05
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 32
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -53,31 +53,31 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge | Ter |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:---------------------------------------------------------------------------------------------------------------------------:|:-------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 0.0843 | 1.0 | 85 | 0.0450 | 44.4102 | {'rouge1': 0.7730682242594011, 'rouge2': 0.5854813797313798, 'rougeL': 0.7516482322144085, 'rougeLsum': 0.7515933978875158} | 37.4421 |
+| 0.0327 | 2.0 | 170 | 0.0211 | 74.7194 | {'rouge1': 0.8898686025017295, 'rouge2': 0.7856047476488655, 'rougeL': 0.8819296092337889, 'rougeLsum': 0.8808977661485402} | 14.7359 |
+| 0.0208 | 3.0 | 255 | 0.0170 | 78.3049 | {'rouge1': 0.9214616309616313, 'rouge2': 0.8310873663373664, 'rougeL': 0.9134419132066192, 'rougeLsum': 0.912669522198934} | 11.4921 |
+| 0.0157 | 4.0 | 340 | 0.0143 | 78.9394 | {'rouge1': 0.9265025192672252, 'rouge2': 0.8525591075591076, 'rougeL': 0.920511965159024, 'rougeLsum': 0.9201432630550277} | 11.1214 |
+| 0.0141 | 5.0 | 425 | 0.0141 | 78.7360 | {'rouge1': 0.9380953893600955, 'rouge2': 0.8617384374884377, 'rougeL': 0.9308421148568208, 'rougeLsum': 0.9300975337298869} | 10.3800 |
+| 0.0109 | 6.0 | 510 | 0.0135 | 83.9518 | {'rouge1': 0.935395465917525, 'rouge2': 0.8671870754517816, 'rougeL': 0.9302533751365021, 'rougeLsum': 0.92907347000265} | 9.1752 |
+| 0.0099 | 7.0 | 595 | 0.0126 | 80.5408 | {'rouge1': 0.9415159377659379, 'rouge2': 0.8721845746845747, 'rougeL': 0.9354884592531653, 'rougeLsum': 0.9349897020191137} | 9.3605 |
+| 0.0102 | 8.0 | 680 | 0.0125 | 82.4160 | {'rouge1': 0.9432206608750726, 'rouge2': 0.8694791042291042, 'rougeL': 0.9373894379041439, 'rougeLsum': 0.9365909324879915} | 8.7118 |
+| 0.0089 | 9.0 | 765 | 0.0127 | 81.8874 | {'rouge1': 0.9440492903066432, 'rouge2': 0.8746245236245236, 'rougeL': 0.9401187072731193, 'rougeLsum': 0.9394614353511412} | 8.9898 |
+| 0.0073 | 10.0 | 850 | 0.0120 | 81.7304 | {'rouge1': 0.9486932005902595, 'rouge2': 0.8767943445443445, 'rougeL': 0.94263671377642, 'rougeLsum': 0.9421138461211994} | 8.5264 |
+| 0.0079 | 11.0 | 935 | 0.0122 | 82.8916 | {'rouge1': 0.9489730179950766, 'rouge2': 0.8816752136752137, 'rougeL': 0.9447908921144217, 'rougeLsum': 0.9441282361429422} | 7.8777 |
+| 0.0071 | 12.0 | 1020 | 0.0120 | 81.8038 | {'rouge1': 0.9475330723198372, 'rouge2': 0.8747884060384061, 'rougeL': 0.9424402721461548, 'rougeLsum': 0.9419356876783349} | 8.2484 |
+| 0.0068 | 13.0 | 1105 | 0.0116 | 81.4716 | {'rouge1': 0.9434723177149648, 'rouge2': 0.8697018722018725, 'rougeL': 0.9390087110601817, 'rougeLsum': 0.9380446906035144} | 8.6191 |
+| 0.0066 | 14.0 | 1190 | 0.0116 | 83.0533 | {'rouge1': 0.9496244270435449, 'rouge2': 0.8807424704924707, 'rougeL': 0.9455004138018845, 'rougeLsum': 0.9448490578049402} | 7.8777 |
+| 0.0055 | 15.0 | 1275 | 0.0120 | 81.9058 | {'rouge1': 0.9482457927349568, 'rouge2': 0.8757257834757834, 'rougeL': 0.9434175579322643, 'rougeLsum': 0.942753301480082} | 8.1557 |
+| 0.0054 | 16.0 | 1360 | 0.0121 | 82.1548 | {'rouge1': 0.9455781873355404, 'rouge2': 0.8749565527065528, 'rougeL': 0.9421678762414054, 'rougeLsum': 0.9410527226041933} | 8.5264 |
+| 0.0058 | 17.0 | 1445 | 0.0119 | 82.2341 | {'rouge1': 0.9475412104235637, 'rouge2': 0.8764932844932849, 'rougeL': 0.9435994822171294, 'rougeLsum': 0.9428020011034717} | 8.3411 |
+| 0.0053 | 18.0 | 1530 | 0.0117 | 82.1956 | {'rouge1': 0.9448616840675663, 'rouge2': 0.8742621082621084, 'rougeL': 0.9408799656226128, 'rougeLsum': 0.9401594728800612} | 8.1557 |
+| 0.005 | 19.0 | 1615 | 0.0117 | 81.8240 | {'rouge1': 0.9442392093200918, 'rouge2': 0.8730217745217748, 'rougeL': 0.9401292441218911, 'rougeLsum': 0.9394873525167642} | 8.3411 |
+| 0.0047 | 20.0 | 1700 | 0.0118 | 81.9815 | {'rouge1': 0.9443279678697634, 'rouge2': 0.8734148591060358, 'rougeL': 0.9402155941323899, 'rougeLsum': 0.939641014300457} | 8.2484 |
 
 
 ### Framework versions
 
-- Transformers 4.
-- Pytorch 2.3.
+- Transformers 4.42.4
+- Pytorch 2.3.1+cu121
 - Datasets 2.20.0
 - Tokenizers 0.19.1
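Note on the README.md results: the headline metrics in the card (Loss 0.0118, Bleu 81.9815, Ter 8.2484) match the epoch-20 row of the training table, and that row is not the best one for validation loss, BLEU, or TER. The sketch below finds the best epoch per metric from values copied out of the table (ROUGE is left out for brevity). The diff does not say whether the saved weights come from the last epoch or from a best checkpoint, so this is only a reading aid; `RESULTS` and `best_epochs` are names made up for illustration.

```python
# (epoch, validation_loss, bleu, ter), copied from the training results table.
RESULTS = [
    (1, 0.0450, 44.4102, 37.4421), (2, 0.0211, 74.7194, 14.7359),
    (3, 0.0170, 78.3049, 11.4921), (4, 0.0143, 78.9394, 11.1214),
    (5, 0.0141, 78.7360, 10.3800), (6, 0.0135, 83.9518, 9.1752),
    (7, 0.0126, 80.5408, 9.3605), (8, 0.0125, 82.4160, 8.7118),
    (9, 0.0127, 81.8874, 8.9898), (10, 0.0120, 81.7304, 8.5264),
    (11, 0.0122, 82.8916, 7.8777), (12, 0.0120, 81.8038, 8.2484),
    (13, 0.0116, 81.4716, 8.6191), (14, 0.0116, 83.0533, 7.8777),
    (15, 0.0120, 81.9058, 8.1557), (16, 0.0121, 82.1548, 8.5264),
    (17, 0.0119, 82.2341, 8.3411), (18, 0.0117, 82.1956, 8.1557),
    (19, 0.0117, 81.8240, 8.3411), (20, 0.0118, 81.9815, 8.2484),
]


def best_epochs(rows, column, higher_is_better):
    """Return every epoch that ties for the best value in the given column."""
    values = [row[column] for row in rows]
    target = max(values) if higher_is_better else min(values)
    return [row[0] for row in rows if row[column] == target]


lowest_loss = best_epochs(RESULTS, 1, higher_is_better=False)
highest_bleu = best_epochs(RESULTS, 2, higher_is_better=True)
lowest_ter = best_epochs(RESULTS, 3, higher_is_better=False)  # TER: lower is better

print("lowest validation loss at epochs:", lowest_loss)
print("highest BLEU at epochs:", highest_bleu)
print("lowest TER at epochs:", lowest_ter)
```

The metrics flatten out after roughly epoch 10, so the gap between the last row and the best rows is small.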
generation_config.json
CHANGED
@@ -7,5 +7,5 @@
 "no_repeat_ngram_size": 3,
 "num_beams": 4,
 "pad_token_id": 1,
-"transformers_version": "4.
+"transformers_version": "4.42.4"
 }
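Note on generation_config.json: `"no_repeat_ngram_size": 3` asks the decoder never to emit a 3-gram that already occurs in the output so far. The sketch below illustrates that rule in plain Python on string tokens. It is a simplified illustration of the idea, not the Transformers implementation (which, as far as I know, applies it to token-ID scores through a logits processor); `banned_next_tokens` is a hypothetical helper name.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the tokens that would complete an n-gram already present in `generated`."""
    # With fewer than `ngram_size` tokens there is no complete n-gram to repeat yet.
    if ngram_size <= 0 or len(generated) < ngram_size:
        return set()
    # The last (n - 1) tokens form the prefix of the next candidate n-gram.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# Hypothetical partial output; "el gato" has already been followed by "come" once.
tokens = ["el", "gato", "come", "el", "gato"]
blocked = banned_next_tokens(tokens, ngram_size=3)
print(blocked)
```

With `"num_beams": 4`, a rule like this would be applied to each beam hypothesis separately.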
runs/Jul18_17-55-45_34beeb718974/events.out.tfevents.1721325432.34beeb718974.10921.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:7466f7cf0258bbcef99fd85605880b716a4f575d23a92db75010b0a47c4f5837
+size 31052
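Note on the events file: what changed here is a Git LFS pointer, not the TensorBoard log itself. The pointer is a small text file of `key value` lines giving the spec version, the content hash, and the byte size of the real object. A minimal sketch of reading one, using the new-side values from this diff; `parse_lfs_pointer` is a hypothetical helper and only handles the three keys shown here.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict (handles version, oid, size only)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:7466f7cf0258bbcef99fd85605880b716a4f575d23a92db75010b0a47c4f5837
size 31052
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

A downloaded copy of the log could be checked against `info["digest"]` with `hashlib.sha256` and against `info["size"]` with its byte length.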