ldos committed on
Commit
ac25796
1 Parent(s): 90ab4bc

End of training

Files changed (3)
  1. README.md +169 -0
  2. generation_config.json +6 -0
  3. pytorch_model.bin +1 -1
README.md ADDED
@@ -0,0 +1,169 @@
+ ---
+ license: apache-2.0
+ base_model: t5-small
+ tags:
+ - generated_from_trainer
+ metrics:
+ - rouge
+ model-index:
+ - name: text_shortening_model_v29
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # text_shortening_model_v29
+
+ This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.6052
+ - Rouge1: 0.5112
+ - Rouge2: 0.2802
+ - Rougel: 0.4539
+ - Rougelsum: 0.4538
+ - Bert precision: 0.8765
+ - Bert recall: 0.8742
+ - Average word count: 8.8438
+ - Max word count: 16
+ - Min word count: 4
+ - Average token count: 13.4174
+ - % shortened texts with length > 12: 8.7087
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0003
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 100
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bert precision | Bert recall | Average word count | Max word count | Min word count | Average token count | % shortened texts with length > 12 |
+ |:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:--------------:|:-----------:|:------------------:|:--------------:|:--------------:|:-------------------:|:----------------------------------:|
+ | 1.9361 | 1.0 | 145 | 1.4858 | 0.4996 | 0.2801 | 0.4497 | 0.4507 | 0.8753 | 0.8723 | 8.7808 | 16 | 3 | 13.2372 | 7.2072 |
+ | 1.4692 | 2.0 | 290 | 1.3868 | 0.5013 | 0.2812 | 0.4477 | 0.4485 | 0.8736 | 0.8731 | 9.0601 | 16 | 3 | 13.7147 | 13.2132 |
+ | 1.2301 | 3.0 | 435 | 1.3641 | 0.5294 | 0.307 | 0.4735 | 0.474 | 0.8785 | 0.8799 | 9.0961 | 16 | 4 | 13.7327 | 16.8168 |
+ | 1.049 | 4.0 | 580 | 1.3702 | 0.524 | 0.2979 | 0.4705 | 0.4706 | 0.8782 | 0.8788 | 9.1081 | 16 | 4 | 13.6066 | 13.8138 |
+ | 0.9261 | 5.0 | 725 | 1.3843 | 0.5424 | 0.3166 | 0.489 | 0.4886 | 0.8829 | 0.8833 | 8.9219 | 17 | 4 | 13.6907 | 8.4084 |
+ | 0.8067 | 6.0 | 870 | 1.4039 | 0.5269 | 0.3011 | 0.4682 | 0.4684 | 0.8777 | 0.878 | 9.2252 | 17 | 4 | 13.973 | 13.2132 |
+ | 0.7133 | 7.0 | 1015 | 1.5083 | 0.5168 | 0.3022 | 0.4618 | 0.4613 | 0.8791 | 0.8758 | 8.7447 | 17 | 4 | 13.4655 | 10.2102 |
+ | 0.6428 | 8.0 | 1160 | 1.4856 | 0.5184 | 0.2907 | 0.4624 | 0.4617 | 0.8804 | 0.8754 | 8.5976 | 16 | 3 | 13.0571 | 9.009 |
+ | 0.5741 | 9.0 | 1305 | 1.5332 | 0.5231 | 0.3003 | 0.4669 | 0.4673 | 0.8809 | 0.8791 | 8.8829 | 17 | 4 | 13.5706 | 7.5075 |
+ | 0.5231 | 10.0 | 1450 | 1.5603 | 0.53 | 0.3032 | 0.4725 | 0.4727 | 0.8843 | 0.8775 | 8.4625 | 17 | 4 | 13.033 | 5.7057 |
+ | 0.4607 | 11.0 | 1595 | 1.6079 | 0.5118 | 0.2821 | 0.4583 | 0.4577 | 0.8777 | 0.8715 | 8.3453 | 16 | 4 | 13.012 | 6.9069 |
+ | 0.4136 | 12.0 | 1740 | 1.7147 | 0.5136 | 0.2849 | 0.4558 | 0.4556 | 0.8776 | 0.8734 | 8.7297 | 16 | 3 | 13.3874 | 9.3093 |
+ | 0.3829 | 13.0 | 1885 | 1.7425 | 0.5182 | 0.287 | 0.459 | 0.4591 | 0.8792 | 0.8746 | 8.7207 | 17 | 4 | 13.3934 | 8.1081 |
+ | 0.3366 | 14.0 | 2030 | 1.7518 | 0.5171 | 0.2871 | 0.4564 | 0.4557 | 0.8796 | 0.8735 | 8.5195 | 16 | 4 | 13.0811 | 5.4054 |
+ | 0.3076 | 15.0 | 2175 | 1.8555 | 0.5139 | 0.2891 | 0.4581 | 0.4581 | 0.879 | 0.8754 | 8.7658 | 16 | 4 | 13.2973 | 9.9099 |
+ | 0.2908 | 16.0 | 2320 | 1.8983 | 0.5239 | 0.3011 | 0.4654 | 0.4651 | 0.8799 | 0.8794 | 8.979 | 16 | 4 | 13.6547 | 12.012 |
+ | 0.2606 | 17.0 | 2465 | 1.9211 | 0.5158 | 0.2875 | 0.4538 | 0.4542 | 0.8774 | 0.8739 | 8.7868 | 17 | 2 | 13.5736 | 12.012 |
+ | 0.2477 | 18.0 | 2610 | 1.9208 | 0.51 | 0.2872 | 0.4515 | 0.4517 | 0.8774 | 0.8733 | 8.6577 | 17 | 4 | 13.3093 | 10.8108 |
+ | 0.2195 | 19.0 | 2755 | 1.9720 | 0.5112 | 0.2838 | 0.456 | 0.4559 | 0.8775 | 0.8754 | 8.8799 | 17 | 3 | 13.4835 | 10.8108 |
+ | 0.1998 | 20.0 | 2900 | 1.9987 | 0.511 | 0.2817 | 0.4526 | 0.4525 | 0.8783 | 0.8751 | 8.7838 | 17 | 3 | 13.4955 | 9.9099 |
+ | 0.1936 | 21.0 | 3045 | 2.0389 | 0.5066 | 0.2818 | 0.4482 | 0.4485 | 0.8762 | 0.8722 | 8.6186 | 17 | 4 | 13.1231 | 9.009 |
+ | 0.1813 | 22.0 | 3190 | 2.0735 | 0.5078 | 0.29 | 0.4556 | 0.4562 | 0.8772 | 0.8754 | 8.8198 | 17 | 4 | 13.4895 | 9.3093 |
+ | 0.1726 | 23.0 | 3335 | 2.0743 | 0.5108 | 0.2901 | 0.458 | 0.4581 | 0.8795 | 0.8736 | 8.4775 | 17 | 2 | 13.0931 | 9.009 |
+ | 0.164 | 24.0 | 3480 | 2.1380 | 0.5077 | 0.2887 | 0.4578 | 0.4565 | 0.878 | 0.8727 | 8.4474 | 17 | 4 | 13.003 | 5.7057 |
+ | 0.1506 | 25.0 | 3625 | 2.1435 | 0.5005 | 0.2725 | 0.4456 | 0.4452 | 0.8748 | 0.8717 | 8.6637 | 17 | 4 | 13.2943 | 6.6066 |
+ | 0.1402 | 26.0 | 3770 | 2.1956 | 0.5114 | 0.2899 | 0.4577 | 0.4571 | 0.8769 | 0.8753 | 8.8709 | 17 | 4 | 13.3544 | 9.3093 |
+ | 0.138 | 27.0 | 3915 | 2.2175 | 0.5079 | 0.2824 | 0.4544 | 0.4548 | 0.8772 | 0.8739 | 8.6847 | 17 | 4 | 13.3423 | 8.4084 |
+ | 0.1313 | 28.0 | 4060 | 2.2267 | 0.5048 | 0.2793 | 0.4483 | 0.448 | 0.8747 | 0.8717 | 8.6817 | 17 | 4 | 13.2733 | 9.009 |
+ | 0.122 | 29.0 | 4205 | 2.2464 | 0.5105 | 0.2813 | 0.4544 | 0.4548 | 0.8746 | 0.8736 | 8.9099 | 18 | 4 | 13.4595 | 10.5105 |
+ | 0.1195 | 30.0 | 4350 | 2.2419 | 0.5124 | 0.2922 | 0.461 | 0.4609 | 0.8768 | 0.8733 | 8.6637 | 16 | 4 | 13.2883 | 7.5075 |
+ | 0.1131 | 31.0 | 4495 | 2.2243 | 0.5215 | 0.3025 | 0.4702 | 0.4698 | 0.8802 | 0.878 | 8.7117 | 16 | 4 | 13.3814 | 9.3093 |
+ | 0.1102 | 32.0 | 4640 | 2.2847 | 0.5078 | 0.2826 | 0.4567 | 0.4559 | 0.8788 | 0.8729 | 8.3904 | 18 | 4 | 12.9099 | 6.3063 |
+ | 0.1105 | 33.0 | 4785 | 2.2545 | 0.5049 | 0.2759 | 0.4489 | 0.4484 | 0.8762 | 0.8729 | 8.6667 | 18 | 4 | 13.1952 | 9.009 |
+ | 0.099 | 34.0 | 4930 | 2.2819 | 0.5207 | 0.296 | 0.4662 | 0.4665 | 0.8814 | 0.8775 | 8.6186 | 17 | 4 | 13.1952 | 8.1081 |
+ | 0.1018 | 35.0 | 5075 | 2.2901 | 0.5133 | 0.2812 | 0.4597 | 0.4597 | 0.8777 | 0.8743 | 8.7237 | 17 | 4 | 13.3243 | 10.8108 |
+ | 0.0992 | 36.0 | 5220 | 2.3349 | 0.5011 | 0.272 | 0.4442 | 0.4439 | 0.8738 | 0.8722 | 8.9129 | 16 | 2 | 13.5856 | 11.1111 |
+ | 0.0921 | 37.0 | 5365 | 2.3193 | 0.506 | 0.2816 | 0.4539 | 0.4539 | 0.8776 | 0.8739 | 8.7658 | 16 | 4 | 13.3093 | 8.7087 |
+ | 0.0936 | 38.0 | 5510 | 2.3404 | 0.5101 | 0.2815 | 0.4565 | 0.4566 | 0.8768 | 0.8754 | 8.8168 | 16 | 4 | 13.4535 | 10.5105 |
+ | 0.0833 | 39.0 | 5655 | 2.3583 | 0.5026 | 0.2818 | 0.4512 | 0.4509 | 0.8749 | 0.8743 | 8.8709 | 16 | 3 | 13.4955 | 9.3093 |
+ | 0.0869 | 40.0 | 5800 | 2.3443 | 0.5091 | 0.2855 | 0.4521 | 0.4521 | 0.8769 | 0.8743 | 8.8378 | 16 | 4 | 13.4474 | 11.4114 |
+ | 0.0783 | 41.0 | 5945 | 2.3609 | 0.5045 | 0.2851 | 0.4519 | 0.4513 | 0.8784 | 0.8738 | 8.5946 | 16 | 4 | 13.1261 | 7.8078 |
+ | 0.08 | 42.0 | 6090 | 2.4229 | 0.5053 | 0.2774 | 0.4508 | 0.4506 | 0.8769 | 0.8743 | 8.6667 | 16 | 4 | 13.2853 | 8.4084 |
+ | 0.0792 | 43.0 | 6235 | 2.3731 | 0.5156 | 0.2877 | 0.4618 | 0.4619 | 0.8775 | 0.8771 | 8.955 | 16 | 4 | 13.6937 | 8.7087 |
+ | 0.075 | 44.0 | 6380 | 2.4058 | 0.5119 | 0.286 | 0.453 | 0.4535 | 0.8761 | 0.8762 | 8.976 | 17 | 3 | 13.7387 | 12.012 |
+ | 0.0754 | 45.0 | 6525 | 2.3808 | 0.5142 | 0.2894 | 0.4584 | 0.4583 | 0.8772 | 0.8765 | 8.967 | 16 | 4 | 13.6096 | 12.3123 |
+ | 0.0713 | 46.0 | 6670 | 2.3949 | 0.5093 | 0.2841 | 0.4566 | 0.4568 | 0.8758 | 0.8748 | 8.8559 | 16 | 4 | 13.4775 | 9.9099 |
+ | 0.066 | 47.0 | 6815 | 2.4103 | 0.5094 | 0.2798 | 0.4551 | 0.4553 | 0.8763 | 0.8753 | 8.9009 | 16 | 4 | 13.4655 | 10.2102 |
+ | 0.0684 | 48.0 | 6960 | 2.4284 | 0.5021 | 0.2763 | 0.4476 | 0.4465 | 0.8754 | 0.8733 | 8.6727 | 16 | 4 | 13.2162 | 8.7087 |
+ | 0.0656 | 49.0 | 7105 | 2.4512 | 0.5137 | 0.289 | 0.4584 | 0.4583 | 0.8763 | 0.8748 | 8.8378 | 16 | 4 | 13.4174 | 9.6096 |
+ | 0.0664 | 50.0 | 7250 | 2.4427 | 0.5106 | 0.2789 | 0.4507 | 0.4501 | 0.8761 | 0.8747 | 8.7327 | 16 | 4 | 13.5255 | 8.4084 |
+ | 0.0628 | 51.0 | 7395 | 2.4792 | 0.5069 | 0.2802 | 0.4527 | 0.453 | 0.8775 | 0.8751 | 8.7417 | 16 | 2 | 13.3063 | 8.7087 |
+ | 0.0662 | 52.0 | 7540 | 2.4619 | 0.5103 | 0.281 | 0.4567 | 0.4567 | 0.8776 | 0.874 | 8.6216 | 16 | 3 | 13.1772 | 9.009 |
+ | 0.0633 | 53.0 | 7685 | 2.4705 | 0.5053 | 0.2785 | 0.4489 | 0.449 | 0.8761 | 0.8735 | 8.7447 | 16 | 4 | 13.3874 | 8.7087 |
+ | 0.0592 | 54.0 | 7830 | 2.4978 | 0.5133 | 0.2813 | 0.452 | 0.4528 | 0.8769 | 0.8746 | 8.8438 | 16 | 4 | 13.4354 | 9.6096 |
+ | 0.0577 | 55.0 | 7975 | 2.4823 | 0.5063 | 0.2793 | 0.448 | 0.4488 | 0.8758 | 0.8721 | 8.6036 | 16 | 4 | 13.1111 | 6.9069 |
+ | 0.0609 | 56.0 | 8120 | 2.4779 | 0.5133 | 0.2797 | 0.4539 | 0.4544 | 0.8764 | 0.8756 | 8.97 | 16 | 3 | 13.5976 | 10.5105 |
+ | 0.0539 | 57.0 | 8265 | 2.5132 | 0.5096 | 0.2778 | 0.453 | 0.4536 | 0.877 | 0.8734 | 8.7117 | 16 | 4 | 13.3003 | 7.2072 |
+ | 0.0564 | 58.0 | 8410 | 2.4783 | 0.517 | 0.2872 | 0.4622 | 0.4625 | 0.8778 | 0.8759 | 8.9159 | 16 | 4 | 13.5556 | 11.4114 |
+ | 0.0543 | 59.0 | 8555 | 2.5184 | 0.5071 | 0.2788 | 0.4515 | 0.4513 | 0.8766 | 0.8734 | 8.7177 | 16 | 4 | 13.2583 | 9.009 |
+ | 0.0518 | 60.0 | 8700 | 2.4945 | 0.5049 | 0.2754 | 0.4529 | 0.4529 | 0.8755 | 0.8749 | 8.9459 | 16 | 4 | 13.6787 | 10.8108 |
+ | 0.0541 | 61.0 | 8845 | 2.5282 | 0.4983 | 0.2693 | 0.4414 | 0.4403 | 0.8723 | 0.8726 | 8.973 | 16 | 4 | 13.6667 | 11.1111 |
+ | 0.0532 | 62.0 | 8990 | 2.5237 | 0.5007 | 0.2712 | 0.4464 | 0.4456 | 0.8741 | 0.8744 | 9.0541 | 16 | 4 | 13.7477 | 11.1111 |
+ | 0.0514 | 63.0 | 9135 | 2.5247 | 0.5041 | 0.2784 | 0.4525 | 0.452 | 0.8768 | 0.8735 | 8.7898 | 16 | 4 | 13.4144 | 8.7087 |
+ | 0.0516 | 64.0 | 9280 | 2.5289 | 0.5065 | 0.2826 | 0.4517 | 0.4515 | 0.8753 | 0.8745 | 9.042 | 16 | 4 | 13.6907 | 11.1111 |
+ | 0.0504 | 65.0 | 9425 | 2.5002 | 0.5055 | 0.2826 | 0.4565 | 0.4562 | 0.877 | 0.8724 | 8.6727 | 16 | 4 | 13.3123 | 7.5075 |
+ | 0.0479 | 66.0 | 9570 | 2.5361 | 0.503 | 0.2783 | 0.4529 | 0.4532 | 0.8756 | 0.874 | 8.8529 | 16 | 4 | 13.4865 | 8.1081 |
+ | 0.0515 | 67.0 | 9715 | 2.5260 | 0.5043 | 0.2758 | 0.451 | 0.4512 | 0.874 | 0.8748 | 9.0661 | 17 | 4 | 13.7808 | 10.5105 |
+ | 0.0544 | 68.0 | 9860 | 2.5213 | 0.5051 | 0.2846 | 0.4543 | 0.4545 | 0.8754 | 0.8739 | 8.9219 | 16 | 3 | 13.5586 | 10.5105 |
+ | 0.0445 | 69.0 | 10005 | 2.5543 | 0.5097 | 0.2859 | 0.4573 | 0.4577 | 0.878 | 0.8748 | 8.6937 | 16 | 3 | 13.3363 | 9.009 |
+ | 0.0484 | 70.0 | 10150 | 2.5472 | 0.5028 | 0.2791 | 0.4502 | 0.4503 | 0.8757 | 0.8736 | 8.8078 | 16 | 3 | 13.4264 | 7.5075 |
+ | 0.0437 | 71.0 | 10295 | 2.5621 | 0.5089 | 0.2851 | 0.4553 | 0.4556 | 0.8765 | 0.8742 | 8.8408 | 16 | 4 | 13.5105 | 8.7087 |
+ | 0.0473 | 72.0 | 10440 | 2.5503 | 0.5087 | 0.2818 | 0.4558 | 0.4555 | 0.8771 | 0.8743 | 8.8559 | 16 | 4 | 13.4204 | 8.7087 |
+ | 0.0472 | 73.0 | 10585 | 2.5726 | 0.5168 | 0.2866 | 0.4571 | 0.4577 | 0.8775 | 0.8761 | 8.9039 | 17 | 4 | 13.5285 | 9.6096 |
+ | 0.041 | 74.0 | 10730 | 2.5982 | 0.5137 | 0.2895 | 0.4594 | 0.4601 | 0.8769 | 0.8757 | 8.8709 | 16 | 4 | 13.4805 | 9.3093 |
+ | 0.0409 | 75.0 | 10875 | 2.5589 | 0.5058 | 0.2824 | 0.4553 | 0.4554 | 0.8766 | 0.8746 | 8.7898 | 16 | 4 | 13.3033 | 8.7087 |
+ | 0.0441 | 76.0 | 11020 | 2.5642 | 0.501 | 0.2791 | 0.452 | 0.4521 | 0.8763 | 0.8717 | 8.5225 | 16 | 4 | 13.048 | 6.006 |
+ | 0.0427 | 77.0 | 11165 | 2.5522 | 0.5102 | 0.2864 | 0.4573 | 0.4579 | 0.8784 | 0.8749 | 8.7207 | 17 | 4 | 13.3183 | 7.5075 |
+ | 0.0449 | 78.0 | 11310 | 2.5454 | 0.5071 | 0.2846 | 0.4567 | 0.4561 | 0.8775 | 0.875 | 8.7658 | 16 | 4 | 13.2523 | 7.5075 |
+ | 0.0397 | 79.0 | 11455 | 2.5598 | 0.5111 | 0.2863 | 0.4566 | 0.4569 | 0.8781 | 0.8752 | 8.7267 | 16 | 4 | 13.2973 | 7.2072 |
+ | 0.046 | 80.0 | 11600 | 2.5171 | 0.5063 | 0.2838 | 0.4541 | 0.4541 | 0.8768 | 0.8734 | 8.6456 | 16 | 4 | 13.2492 | 6.6066 |
+ | 0.0403 | 81.0 | 11745 | 2.5398 | 0.5154 | 0.2872 | 0.4584 | 0.4584 | 0.8774 | 0.876 | 8.9489 | 18 | 4 | 13.4955 | 8.7087 |
+ | 0.0407 | 82.0 | 11890 | 2.5526 | 0.5178 | 0.2904 | 0.4631 | 0.4632 | 0.8789 | 0.8769 | 8.8589 | 18 | 4 | 13.4354 | 7.5075 |
+ | 0.0414 | 83.0 | 12035 | 2.5718 | 0.5154 | 0.2876 | 0.4604 | 0.4609 | 0.8783 | 0.8749 | 8.7808 | 17 | 4 | 13.3303 | 7.5075 |
+ | 0.0406 | 84.0 | 12180 | 2.5673 | 0.5138 | 0.2861 | 0.4581 | 0.4587 | 0.8773 | 0.8758 | 8.8949 | 17 | 4 | 13.4895 | 8.1081 |
+ | 0.037 | 85.0 | 12325 | 2.5770 | 0.511 | 0.2873 | 0.4575 | 0.4573 | 0.8775 | 0.876 | 8.8559 | 16 | 4 | 13.4384 | 8.4084 |
+ | 0.0404 | 86.0 | 12470 | 2.5786 | 0.5145 | 0.2848 | 0.4578 | 0.4581 | 0.8774 | 0.8754 | 8.8649 | 16 | 4 | 13.4865 | 8.7087 |
+ | 0.0364 | 87.0 | 12615 | 2.5822 | 0.5089 | 0.2791 | 0.454 | 0.4539 | 0.8761 | 0.8743 | 8.8288 | 17 | 4 | 13.4174 | 7.8078 |
+ | 0.0365 | 88.0 | 12760 | 2.5821 | 0.5105 | 0.2806 | 0.4555 | 0.4559 | 0.8779 | 0.8752 | 8.7838 | 16 | 4 | 13.3634 | 7.8078 |
+ | 0.0359 | 89.0 | 12905 | 2.5798 | 0.5121 | 0.2787 | 0.4546 | 0.4549 | 0.8771 | 0.8753 | 8.8799 | 16 | 4 | 13.4835 | 8.4084 |
+ | 0.0349 | 90.0 | 13050 | 2.5960 | 0.5109 | 0.2788 | 0.4533 | 0.454 | 0.8775 | 0.8747 | 8.8108 | 16 | 4 | 13.3874 | 9.009 |
+ | 0.035 | 91.0 | 13195 | 2.5979 | 0.5072 | 0.2778 | 0.454 | 0.4539 | 0.8764 | 0.8743 | 8.8589 | 16 | 4 | 13.3964 | 9.6096 |
+ | 0.0355 | 92.0 | 13340 | 2.6016 | 0.5101 | 0.2795 | 0.4544 | 0.4548 | 0.8767 | 0.8743 | 8.8589 | 16 | 4 | 13.4505 | 9.009 |
+ | 0.0352 | 93.0 | 13485 | 2.6036 | 0.5107 | 0.2814 | 0.455 | 0.4554 | 0.8772 | 0.8747 | 8.8619 | 16 | 4 | 13.4294 | 9.009 |
+ | 0.0338 | 94.0 | 13630 | 2.6016 | 0.5065 | 0.2771 | 0.4512 | 0.4514 | 0.8758 | 0.8741 | 8.9249 | 16 | 4 | 13.5165 | 9.3093 |
+ | 0.0359 | 95.0 | 13775 | 2.6044 | 0.5071 | 0.2761 | 0.4496 | 0.4501 | 0.8755 | 0.8733 | 8.8559 | 16 | 4 | 13.4264 | 9.6096 |
+ | 0.0349 | 96.0 | 13920 | 2.5986 | 0.5072 | 0.277 | 0.4523 | 0.4524 | 0.8756 | 0.8736 | 8.8679 | 16 | 4 | 13.4655 | 9.6096 |
+ | 0.0358 | 97.0 | 14065 | 2.5994 | 0.5068 | 0.276 | 0.4498 | 0.4502 | 0.8749 | 0.8733 | 8.8589 | 16 | 4 | 13.4685 | 8.7087 |
+ | 0.0338 | 98.0 | 14210 | 2.6041 | 0.5105 | 0.2805 | 0.4536 | 0.4535 | 0.8761 | 0.8741 | 8.8498 | 16 | 4 | 13.4444 | 8.7087 |
+ | 0.0359 | 99.0 | 14355 | 2.6051 | 0.5095 | 0.2774 | 0.452 | 0.4522 | 0.876 | 0.8738 | 8.8529 | 16 | 4 | 13.4174 | 9.009 |
+ | 0.0357 | 100.0 | 14500 | 2.6052 | 0.5112 | 0.2802 | 0.4539 | 0.4538 | 0.8765 | 0.8742 | 8.8438 | 16 | 4 | 13.4174 | 8.7087 |
+
+
+ ### Framework versions
+
+ - Transformers 4.33.1
+ - Pytorch 2.0.1+cu118
+ - Datasets 2.14.5
+ - Tokenizers 0.13.3
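
The numbers in this model card allow two quick sanity checks, sketched below in plain Python. The first reproduces the step count and the linear learning-rate decay implied by the listed hyperparameters; it assumes zero warmup steps, since the card lists no warmup setting. The second picks the best epoch from a handful of rows copied from the results table. The helper names `linear_lr` and `best_epoch` are illustrative and do not come from the training script, which is not part of this commit.

```python
# Values copied from the model card.
LEARNING_RATE = 3e-4
NUM_EPOCHS = 100
STEPS_PER_EPOCH = 145  # step count logged at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 14500, the step of the last row


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    Assumption: warmup_steps=0, because the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# (epoch, validation loss, Rouge1) for a few rows of the results table.
LOGGED = [
    (1, 1.4858, 0.4996),
    (3, 1.3641, 0.5294),
    (5, 1.3843, 0.5424),
    (10, 1.5603, 0.5300),
    (50, 2.4427, 0.5106),
    (100, 2.6052, 0.5112),
]


def best_epoch(rows, column, higher_is_better):
    """Return the epoch whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


if __name__ == "__main__":
    print("total steps:", TOTAL_STEPS)
    print("lr at the halfway point:", linear_lr(TOTAL_STEPS // 2))
    print("lowest validation loss at epoch", best_epoch(LOGGED, 1, False))
    print("highest Rouge1 at epoch", best_epoch(LOGGED, 2, True))
```

In the full table, validation loss is lowest at epoch 3 (1.3641) and Rouge1 peaks at epoch 5 (0.5424), while the epoch-100 row that the card reports as its evaluation result has a validation loss of 2.6052 and Rouge1 of 0.5112. With a batch size of 8, 145 steps per epoch correspond to at most 1,160 training examples, assuming a single device and no gradient accumulation.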
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "decoder_start_token_id": 0,
+ "eos_token_id": 1,
+ "pad_token_id": 0,
+ "transformers_version": "4.33.1"
+ }
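
The generation config reuses token id 0 as both the padding token and the decoder start token, which is the usual T5 convention, and it ends generation at token id 1. A minimal sketch of how those three ids shape post-processing of a generated sequence follows. It uses only the standard library; `strip_special` is a hypothetical helper and the sample token ids are made up for illustration.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG = """{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.33.1"
}"""

config = json.loads(GENERATION_CONFIG)


def strip_special(token_ids, cfg):
    """Drop start/pad tokens and cut the sequence at the first EOS token."""
    kept = []
    for token in token_ids:
        if token == cfg["eos_token_id"]:
            break  # everything after EOS is padding
        if token in (cfg["pad_token_id"], cfg["decoder_start_token_id"]):
            continue
        kept.append(token)
    return kept


if __name__ == "__main__":
    # Hypothetical decoder output: start token, two content tokens, EOS, padding.
    print(strip_special([0, 363, 19, 1, 0, 0], config))
```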
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:10d852a669fe2c2779d0207057a5eda85d6bd6872857de0816c27de5efcd9ce5
+ oid sha256:3214abef3dfab1532061f6c8655d5cac9be20374b0ab9829b373d329ab026f6c
  size 242071641
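
Only the `oid` line of the Git LFS pointer changes in this commit; the size stays at 242071641 bytes. A downloaded `pytorch_model.bin` can be checked against the new pointer by hashing it locally. The following is a sketch under the assumption that the file has already been fetched to a local path; `parse_lfs_pointer` and `matches_pointer` are illustrative helper names, not part of any Git LFS tooling.

```python
import hashlib

# New pointer contents for pytorch_model.bin from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3214abef3dfab1532061f6c8655d5cac9be20374b0ab9829b373d329ab026f6c
size 242071641
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its hash algorithm, hex digest, and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Hash the file in chunks and compare size and digest with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    print(pointer["algo"], pointer["size"])
    # matches_pointer("pytorch_model.bin", pointer) would check a local download.
```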