ldos committed on
Commit 4e693e9
1 Parent(s): 0d3bb8b

End of training

Files changed (3)
  1. README.md +169 -0
  2. generation_config.json +6 -0
  3. pytorch_model.bin +1 -1
README.md ADDED
@@ -0,0 +1,169 @@
+ ---
+ license: apache-2.0
+ base_model: t5-small
+ tags:
+ - generated_from_trainer
+ metrics:
+ - rouge
+ model-index:
+ - name: text_shortening_model_v32
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # text_shortening_model_v32
+
+ This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.6385
+ - Rouge1: 0.527
+ - Rouge2: 0.3031
+ - Rougel: 0.4768
+ - Rougelsum: 0.4774
+ - Bert precision: 0.8854
+ - Bert recall: 0.8798
+ - Average word count: 8.4444
+ - Max word count: 17
+ - Min word count: 4
+ - Average token count: 12.7447
+ - % shortened texts with length > 12: 10.2102
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0003
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 100
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bert precision | Bert recall | Average word count | Max word count | Min word count | Average token count | % shortened texts with length > 12 |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:--------------:|:-----------:|:------------------:|:--------------:|:--------------:|:-------------------:|:----------------------------------:|
+ | 1.2885 | 1.0 | 73 | 1.5384 | 0.5094 | 0.2857 | 0.4536 | 0.4541 | 0.8746 | 0.8764 | 9.0631 | 17 | 3 | 13.9309 | 15.015 |
+ | 1.1256 | 2.0 | 146 | 1.4350 | 0.5228 | 0.3037 | 0.4677 | 0.4684 | 0.879 | 0.8771 | 8.7898 | 17 | 4 | 13.3423 | 14.1141 |
+ | 1.0169 | 3.0 | 219 | 1.3707 | 0.5356 | 0.3171 | 0.4797 | 0.4803 | 0.8806 | 0.8838 | 9.1141 | 17 | 4 | 13.7207 | 14.4144 |
+ | 0.9561 | 4.0 | 292 | 1.3611 | 0.5449 | 0.3213 | 0.4888 | 0.4896 | 0.8843 | 0.8862 | 8.8408 | 16 | 4 | 13.4865 | 9.009 |
+ | 0.8725 | 5.0 | 365 | 1.3343 | 0.5422 | 0.3199 | 0.4936 | 0.4943 | 0.8847 | 0.8866 | 9.0781 | 16 | 4 | 13.6547 | 11.7117 |
+ | 0.823 | 6.0 | 438 | 1.3632 | 0.5405 | 0.3183 | 0.4913 | 0.4923 | 0.8849 | 0.8824 | 8.5886 | 16 | 4 | 13.1021 | 12.9129 |
+ | 0.7673 | 7.0 | 511 | 1.3989 | 0.5425 | 0.3181 | 0.4856 | 0.4863 | 0.8815 | 0.8849 | 9.2342 | 16 | 5 | 13.7117 | 15.9159 |
+ | 0.7449 | 8.0 | 584 | 1.4205 | 0.5391 | 0.3196 | 0.4841 | 0.4845 | 0.8838 | 0.8821 | 8.7207 | 16 | 4 | 13.1201 | 12.9129 |
+ | 0.7134 | 9.0 | 657 | 1.4581 | 0.5441 | 0.3128 | 0.4884 | 0.4888 | 0.8853 | 0.885 | 8.6937 | 16 | 4 | 13.2913 | 7.8078 |
+ | 0.6875 | 10.0 | 730 | 1.4754 | 0.5434 | 0.3148 | 0.4886 | 0.4884 | 0.8865 | 0.8838 | 8.6727 | 16 | 3 | 13.1081 | 10.5105 |
+ | 0.6786 | 11.0 | 803 | 1.4771 | 0.5411 | 0.3107 | 0.4891 | 0.4895 | 0.8874 | 0.8836 | 8.5435 | 16 | 4 | 13.03 | 8.4084 |
+ | 0.6388 | 12.0 | 876 | 1.5743 | 0.5379 | 0.309 | 0.482 | 0.4829 | 0.8851 | 0.8807 | 8.5495 | 15 | 4 | 13.033 | 10.8108 |
+ | 0.6202 | 13.0 | 949 | 1.6033 | 0.5423 | 0.3078 | 0.4852 | 0.4858 | 0.8875 | 0.8834 | 8.4414 | 16 | 2 | 12.982 | 9.009 |
+ | 0.6046 | 14.0 | 1022 | 1.6242 | 0.5352 | 0.3073 | 0.4793 | 0.4795 | 0.8851 | 0.8812 | 8.5165 | 16 | 2 | 12.994 | 9.9099 |
+ | 0.6019 | 15.0 | 1095 | 1.6496 | 0.539 | 0.3001 | 0.4802 | 0.4805 | 0.8861 | 0.8832 | 8.7748 | 16 | 4 | 13.1111 | 9.9099 |
+ | 0.5717 | 16.0 | 1168 | 1.7001 | 0.5471 | 0.3198 | 0.4947 | 0.4954 | 0.8876 | 0.8869 | 8.8498 | 17 | 4 | 13.3153 | 14.4144 |
+ | 0.5567 | 17.0 | 1241 | 1.7371 | 0.5304 | 0.3012 | 0.4802 | 0.4805 | 0.8849 | 0.8809 | 8.4955 | 16 | 2 | 12.8919 | 9.9099 |
+ | 0.5458 | 18.0 | 1314 | 1.7639 | 0.5312 | 0.299 | 0.4782 | 0.4784 | 0.8858 | 0.8812 | 8.5045 | 15 | 4 | 13.012 | 9.009 |
+ | 0.528 | 19.0 | 1387 | 1.8120 | 0.5282 | 0.306 | 0.4791 | 0.4794 | 0.8857 | 0.8819 | 8.5886 | 16 | 3 | 12.9009 | 10.8108 |
+ | 0.5055 | 20.0 | 1460 | 1.8516 | 0.5357 | 0.3088 | 0.4793 | 0.4796 | 0.8863 | 0.8821 | 8.6366 | 16 | 4 | 13.1141 | 9.3093 |
+ | 0.5098 | 21.0 | 1533 | 1.8717 | 0.5304 | 0.2966 | 0.4746 | 0.4745 | 0.8843 | 0.8806 | 8.5946 | 16 | 4 | 13.039 | 9.9099 |
+ | 0.5143 | 22.0 | 1606 | 1.9507 | 0.533 | 0.3006 | 0.4813 | 0.4819 | 0.8855 | 0.8815 | 8.4895 | 18 | 2 | 12.967 | 8.4084 |
+ | 0.4923 | 23.0 | 1679 | 1.9452 | 0.5263 | 0.2936 | 0.4748 | 0.474 | 0.8845 | 0.8805 | 8.4985 | 16 | 2 | 12.9309 | 9.009 |
+ | 0.4891 | 24.0 | 1752 | 1.9700 | 0.5306 | 0.3027 | 0.48 | 0.4803 | 0.8862 | 0.881 | 8.4565 | 15 | 4 | 12.982 | 7.2072 |
+ | 0.4902 | 25.0 | 1825 | 2.0222 | 0.5336 | 0.3079 | 0.4833 | 0.4836 | 0.8867 | 0.8826 | 8.5465 | 16 | 4 | 12.9429 | 10.2102 |
+ | 0.4691 | 26.0 | 1898 | 2.0300 | 0.5332 | 0.3083 | 0.4831 | 0.4838 | 0.8862 | 0.8829 | 8.6036 | 15 | 4 | 13.1231 | 12.3123 |
+ | 0.4554 | 27.0 | 1971 | 2.0376 | 0.5345 | 0.3074 | 0.4802 | 0.4802 | 0.8877 | 0.8822 | 8.4354 | 16 | 4 | 12.8018 | 7.2072 |
+ | 0.4668 | 28.0 | 2044 | 2.0778 | 0.534 | 0.3056 | 0.4836 | 0.4839 | 0.8852 | 0.8816 | 8.5946 | 18 | 4 | 13.0691 | 10.5105 |
+ | 0.4637 | 29.0 | 2117 | 2.0837 | 0.5255 | 0.2986 | 0.4761 | 0.4769 | 0.8839 | 0.881 | 8.5105 | 16 | 4 | 13.048 | 9.6096 |
+ | 0.4568 | 30.0 | 2190 | 2.1224 | 0.5332 | 0.3045 | 0.4805 | 0.4801 | 0.8842 | 0.8833 | 8.8198 | 18 | 4 | 13.3483 | 13.8138 |
+ | 0.4602 | 31.0 | 2263 | 2.1452 | 0.5323 | 0.3019 | 0.4776 | 0.4775 | 0.8842 | 0.882 | 8.6637 | 18 | 4 | 13.1682 | 11.7117 |
+ | 0.4584 | 32.0 | 2336 | 2.1395 | 0.5379 | 0.3125 | 0.4873 | 0.4875 | 0.8839 | 0.883 | 8.7808 | 15 | 4 | 13.3754 | 10.8108 |
+ | 0.4495 | 33.0 | 2409 | 2.1839 | 0.5295 | 0.3002 | 0.4767 | 0.4763 | 0.882 | 0.8819 | 8.8979 | 17 | 4 | 13.4685 | 13.5135 |
+ | 0.4418 | 34.0 | 2482 | 2.2072 | 0.5266 | 0.3009 | 0.477 | 0.4769 | 0.8836 | 0.8791 | 8.5375 | 15 | 2 | 12.9459 | 10.5105 |
+ | 0.4378 | 35.0 | 2555 | 2.2251 | 0.5242 | 0.2946 | 0.4728 | 0.4729 | 0.883 | 0.8784 | 8.5255 | 17 | 4 | 12.8709 | 10.8108 |
+ | 0.4224 | 36.0 | 2628 | 2.2447 | 0.5296 | 0.3023 | 0.4774 | 0.4785 | 0.8843 | 0.88 | 8.5736 | 15 | 4 | 12.979 | 10.8108 |
+ | 0.4322 | 37.0 | 2701 | 2.2509 | 0.5187 | 0.2921 | 0.4694 | 0.4698 | 0.8824 | 0.877 | 8.4535 | 15 | 4 | 12.8949 | 12.3123 |
+ | 0.4367 | 38.0 | 2774 | 2.2949 | 0.5166 | 0.2887 | 0.4646 | 0.4653 | 0.8807 | 0.876 | 8.5465 | 17 | 4 | 12.9369 | 12.012 |
+ | 0.4301 | 39.0 | 2847 | 2.2866 | 0.5256 | 0.298 | 0.4693 | 0.4696 | 0.8825 | 0.8777 | 8.5255 | 16 | 4 | 13.0 | 9.9099 |
+ | 0.4219 | 40.0 | 2920 | 2.2993 | 0.5213 | 0.2908 | 0.4697 | 0.4699 | 0.8833 | 0.8788 | 8.5646 | 15 | 4 | 13.03 | 10.5105 |
+ | 0.4165 | 41.0 | 2993 | 2.3157 | 0.5226 | 0.2977 | 0.4697 | 0.4695 | 0.884 | 0.878 | 8.3964 | 15 | 4 | 12.7988 | 9.9099 |
+ | 0.4352 | 42.0 | 3066 | 2.3181 | 0.5199 | 0.2854 | 0.4641 | 0.4641 | 0.8822 | 0.8769 | 8.4925 | 17 | 4 | 12.7958 | 10.5105 |
+ | 0.4209 | 43.0 | 3139 | 2.3455 | 0.5247 | 0.2943 | 0.4743 | 0.4746 | 0.8833 | 0.8812 | 8.6757 | 17 | 4 | 13.1111 | 11.1111 |
+ | 0.4227 | 44.0 | 3212 | 2.3553 | 0.5146 | 0.2885 | 0.4631 | 0.4638 | 0.883 | 0.8765 | 8.3213 | 17 | 4 | 12.5736 | 9.9099 |
+ | 0.4205 | 45.0 | 3285 | 2.3684 | 0.5205 | 0.2925 | 0.4652 | 0.4658 | 0.8821 | 0.8779 | 8.4895 | 15 | 4 | 12.952 | 11.4114 |
+ | 0.4039 | 46.0 | 3358 | 2.3505 | 0.5254 | 0.3 | 0.4741 | 0.4742 | 0.8835 | 0.8792 | 8.5105 | 17 | 4 | 12.9339 | 10.5105 |
+ | 0.41 | 47.0 | 3431 | 2.3901 | 0.522 | 0.2994 | 0.4712 | 0.4718 | 0.8829 | 0.8792 | 8.5195 | 16 | 4 | 12.9339 | 11.1111 |
+ | 0.4104 | 48.0 | 3504 | 2.4093 | 0.5263 | 0.3 | 0.473 | 0.4736 | 0.8856 | 0.8791 | 8.3243 | 17 | 4 | 12.7207 | 9.009 |
+ | 0.412 | 49.0 | 3577 | 2.4144 | 0.523 | 0.2983 | 0.4702 | 0.4703 | 0.8828 | 0.8804 | 8.7688 | 17 | 4 | 13.1982 | 11.7117 |
+ | 0.4165 | 50.0 | 3650 | 2.4154 | 0.5206 | 0.2966 | 0.468 | 0.4679 | 0.8836 | 0.8798 | 8.6607 | 17 | 4 | 13.048 | 9.3093 |
+ | 0.4019 | 51.0 | 3723 | 2.4539 | 0.5242 | 0.3013 | 0.474 | 0.4751 | 0.8845 | 0.8806 | 8.6096 | 17 | 3 | 12.988 | 11.1111 |
+ | 0.3948 | 52.0 | 3796 | 2.4132 | 0.5267 | 0.2984 | 0.4741 | 0.4749 | 0.8834 | 0.8802 | 8.6847 | 17 | 3 | 13.1592 | 13.2132 |
+ | 0.4105 | 53.0 | 3869 | 2.4407 | 0.5214 | 0.2937 | 0.4676 | 0.4682 | 0.882 | 0.8799 | 8.7117 | 17 | 4 | 13.0901 | 12.9129 |
+ | 0.4115 | 54.0 | 3942 | 2.4676 | 0.5292 | 0.3007 | 0.4783 | 0.479 | 0.8865 | 0.8797 | 8.3243 | 17 | 3 | 12.6667 | 8.4084 |
+ | 0.3972 | 55.0 | 4015 | 2.4592 | 0.5273 | 0.3041 | 0.4777 | 0.4784 | 0.8864 | 0.8799 | 8.3784 | 17 | 3 | 12.7778 | 10.8108 |
+ | 0.3965 | 56.0 | 4088 | 2.4719 | 0.5157 | 0.293 | 0.4657 | 0.4656 | 0.8829 | 0.8777 | 8.4084 | 17 | 3 | 12.7598 | 10.2102 |
+ | 0.4106 | 57.0 | 4161 | 2.4792 | 0.52 | 0.2942 | 0.4685 | 0.4692 | 0.8839 | 0.8797 | 8.5165 | 17 | 4 | 12.9309 | 9.9099 |
+ | 0.3923 | 58.0 | 4234 | 2.5007 | 0.5229 | 0.2991 | 0.4733 | 0.4738 | 0.8852 | 0.88 | 8.5345 | 18 | 4 | 12.8739 | 11.4114 |
+ | 0.4065 | 59.0 | 4307 | 2.4745 | 0.5201 | 0.2921 | 0.4686 | 0.4693 | 0.8829 | 0.8788 | 8.5826 | 17 | 4 | 13.006 | 10.2102 |
+ | 0.4095 | 60.0 | 4380 | 2.4775 | 0.5187 | 0.2925 | 0.4683 | 0.4685 | 0.8826 | 0.8804 | 8.6817 | 15 | 4 | 13.1141 | 10.8108 |
+ | 0.4016 | 61.0 | 4453 | 2.4853 | 0.5178 | 0.2897 | 0.467 | 0.4675 | 0.8823 | 0.8786 | 8.5766 | 15 | 4 | 13.003 | 10.8108 |
+ | 0.4015 | 62.0 | 4526 | 2.4844 | 0.5255 | 0.2908 | 0.4704 | 0.4713 | 0.8839 | 0.8799 | 8.5616 | 16 | 4 | 13.03 | 9.6096 |
+ | 0.399 | 63.0 | 4599 | 2.5017 | 0.52 | 0.2909 | 0.4669 | 0.4674 | 0.8835 | 0.8793 | 8.5405 | 16 | 4 | 12.9159 | 9.009 |
+ | 0.4075 | 64.0 | 4672 | 2.5025 | 0.523 | 0.2976 | 0.4734 | 0.4741 | 0.885 | 0.88 | 8.5015 | 17 | 4 | 12.8709 | 9.9099 |
+ | 0.3977 | 65.0 | 4745 | 2.5306 | 0.5213 | 0.3006 | 0.4743 | 0.4747 | 0.8842 | 0.8799 | 8.4745 | 17 | 4 | 12.9279 | 10.5105 |
+ | 0.3978 | 66.0 | 4818 | 2.5439 | 0.5219 | 0.2982 | 0.4719 | 0.472 | 0.8842 | 0.8792 | 8.4414 | 17 | 4 | 12.7357 | 9.9099 |
+ | 0.3971 | 67.0 | 4891 | 2.5319 | 0.5293 | 0.2998 | 0.4762 | 0.4769 | 0.8856 | 0.8811 | 8.6156 | 17 | 4 | 12.9309 | 9.3093 |
+ | 0.3881 | 68.0 | 4964 | 2.5460 | 0.5216 | 0.2947 | 0.4714 | 0.4715 | 0.8848 | 0.879 | 8.3453 | 17 | 4 | 12.6847 | 8.4084 |
+ | 0.3947 | 69.0 | 5037 | 2.5447 | 0.527 | 0.2998 | 0.4741 | 0.4745 | 0.8844 | 0.8812 | 8.5856 | 17 | 4 | 13.015 | 10.8108 |
+ | 0.3862 | 70.0 | 5110 | 2.5670 | 0.5271 | 0.304 | 0.4766 | 0.4775 | 0.885 | 0.8811 | 8.5556 | 17 | 4 | 12.9249 | 9.9099 |
+ | 0.3947 | 71.0 | 5183 | 2.5535 | 0.5224 | 0.2984 | 0.4701 | 0.4703 | 0.8844 | 0.8795 | 8.5075 | 17 | 4 | 12.8559 | 10.8108 |
+ | 0.4056 | 72.0 | 5256 | 2.5729 | 0.5266 | 0.2987 | 0.4727 | 0.4737 | 0.8837 | 0.8812 | 8.6306 | 17 | 4 | 13.0601 | 11.1111 |
+ | 0.3906 | 73.0 | 5329 | 2.5667 | 0.5231 | 0.2982 | 0.4691 | 0.4699 | 0.8828 | 0.8802 | 8.6036 | 17 | 4 | 13.0571 | 10.2102 |
+ | 0.3875 | 74.0 | 5402 | 2.5688 | 0.5252 | 0.2972 | 0.4697 | 0.4709 | 0.8836 | 0.8804 | 8.5946 | 17 | 4 | 12.994 | 10.2102 |
+ | 0.3869 | 75.0 | 5475 | 2.5824 | 0.5283 | 0.3009 | 0.4741 | 0.4743 | 0.885 | 0.8823 | 8.6306 | 17 | 4 | 13.03 | 11.1111 |
+ | 0.3797 | 76.0 | 5548 | 2.5827 | 0.5242 | 0.2992 | 0.4717 | 0.4723 | 0.8838 | 0.882 | 8.6697 | 17 | 4 | 13.1021 | 11.4114 |
+ | 0.3716 | 77.0 | 5621 | 2.5992 | 0.5197 | 0.2971 | 0.4667 | 0.4681 | 0.8833 | 0.8803 | 8.5766 | 17 | 4 | 12.973 | 11.7117 |
+ | 0.3852 | 78.0 | 5694 | 2.5840 | 0.5226 | 0.3008 | 0.4703 | 0.4711 | 0.8839 | 0.8803 | 8.5616 | 17 | 3 | 12.979 | 11.1111 |
+ | 0.4031 | 79.0 | 5767 | 2.5853 | 0.5328 | 0.3096 | 0.4794 | 0.4798 | 0.887 | 0.882 | 8.4865 | 17 | 3 | 12.8679 | 8.7087 |
+ | 0.3849 | 80.0 | 5840 | 2.5943 | 0.5315 | 0.3101 | 0.4811 | 0.4818 | 0.8863 | 0.882 | 8.4925 | 17 | 3 | 12.8979 | 8.7087 |
+ | 0.3937 | 81.0 | 5913 | 2.5984 | 0.5278 | 0.3033 | 0.4763 | 0.4766 | 0.8851 | 0.8813 | 8.5646 | 17 | 3 | 12.9189 | 9.9099 |
+ | 0.402 | 82.0 | 5986 | 2.6003 | 0.5229 | 0.2993 | 0.4709 | 0.4717 | 0.8841 | 0.8793 | 8.5135 | 17 | 3 | 12.8889 | 10.5105 |
+ | 0.4004 | 83.0 | 6059 | 2.6012 | 0.5261 | 0.3025 | 0.4751 | 0.4756 | 0.8849 | 0.8805 | 8.4835 | 17 | 3 | 12.8138 | 11.1111 |
+ | 0.3968 | 84.0 | 6132 | 2.6119 | 0.5266 | 0.3042 | 0.4755 | 0.476 | 0.8858 | 0.8811 | 8.4835 | 17 | 3 | 12.8198 | 10.5105 |
+ | 0.393 | 85.0 | 6205 | 2.6203 | 0.5269 | 0.3026 | 0.4736 | 0.4745 | 0.8856 | 0.8811 | 8.5045 | 17 | 4 | 12.8228 | 10.5105 |
+ | 0.4003 | 86.0 | 6278 | 2.6245 | 0.5281 | 0.3035 | 0.4741 | 0.4752 | 0.8856 | 0.8808 | 8.4474 | 17 | 4 | 12.7598 | 9.9099 |
+ | 0.3923 | 87.0 | 6351 | 2.6331 | 0.5238 | 0.2992 | 0.4726 | 0.4729 | 0.8848 | 0.8799 | 8.4114 | 17 | 4 | 12.7658 | 9.9099 |
+ | 0.3958 | 88.0 | 6424 | 2.6281 | 0.5265 | 0.3015 | 0.4747 | 0.4751 | 0.8848 | 0.8806 | 8.4925 | 17 | 4 | 12.8258 | 10.5105 |
+ | 0.3938 | 89.0 | 6497 | 2.6312 | 0.5261 | 0.3034 | 0.4753 | 0.4759 | 0.8848 | 0.8805 | 8.4715 | 17 | 4 | 12.8348 | 10.8108 |
+ | 0.3698 | 90.0 | 6570 | 2.6221 | 0.5253 | 0.3018 | 0.4734 | 0.4744 | 0.8845 | 0.8803 | 8.4775 | 17 | 4 | 12.8228 | 10.5105 |
+ | 0.3946 | 91.0 | 6643 | 2.6173 | 0.5258 | 0.3025 | 0.4739 | 0.4748 | 0.8849 | 0.8806 | 8.4625 | 17 | 4 | 12.8378 | 10.2102 |
+ | 0.3933 | 92.0 | 6716 | 2.6259 | 0.5269 | 0.302 | 0.476 | 0.4764 | 0.8851 | 0.88 | 8.4444 | 17 | 4 | 12.7928 | 10.5105 |
+ | 0.3915 | 93.0 | 6789 | 2.6323 | 0.5314 | 0.306 | 0.4783 | 0.4789 | 0.8858 | 0.8814 | 8.5195 | 17 | 4 | 12.8739 | 11.1111 |
+ | 0.3936 | 94.0 | 6862 | 2.6365 | 0.5293 | 0.3039 | 0.4778 | 0.4785 | 0.8857 | 0.8807 | 8.4775 | 17 | 4 | 12.8048 | 10.5105 |
+ | 0.3853 | 95.0 | 6935 | 2.6385 | 0.5294 | 0.3042 | 0.4783 | 0.4788 | 0.8857 | 0.8808 | 8.4835 | 17 | 4 | 12.8198 | 10.5105 |
+ | 0.3871 | 96.0 | 7008 | 2.6379 | 0.5283 | 0.3059 | 0.4778 | 0.4786 | 0.8858 | 0.8806 | 8.4865 | 17 | 4 | 12.8198 | 9.6096 |
+ | 0.3769 | 97.0 | 7081 | 2.6410 | 0.5283 | 0.3057 | 0.4784 | 0.479 | 0.8857 | 0.8806 | 8.5015 | 17 | 4 | 12.8228 | 10.2102 |
+ | 0.3997 | 98.0 | 7154 | 2.6420 | 0.5279 | 0.3048 | 0.4777 | 0.4784 | 0.8852 | 0.8801 | 8.4655 | 17 | 4 | 12.7928 | 10.2102 |
+ | 0.3935 | 99.0 | 7227 | 2.6392 | 0.5267 | 0.3033 | 0.4763 | 0.4771 | 0.8852 | 0.8799 | 8.4444 | 17 | 4 | 12.7568 | 10.2102 |
+ | 0.3891 | 100.0 | 7300 | 2.6385 | 0.527 | 0.3031 | 0.4768 | 0.4774 | 0.8854 | 0.8798 | 8.4444 | 17 | 4 | 12.7447 | 10.2102 |
+
+
+ ### Framework versions
+
+ - Transformers 4.33.1
+ - Pytorch 2.0.1+cu118
+ - Datasets 2.14.5
+ - Tokenizers 0.13.3
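
Note on the training results in README.md: the summary metrics at the top of the card match the epoch 100 row, while the validation loss in the table bottoms out much earlier. The sketch below is only a reading aid, not part of the commit. It copies a handful of rows from the table (epoch, training loss, validation loss, Rouge1) and picks out the epoch with the lowest validation loss and the one with the highest Rouge1. The names `ROWS` and `best_by` are made up for this illustration, and whether the run kept or restored an earlier checkpoint is not stated anywhere in the card.

```python
# Rows copied from the "Training results" table in README.md.
# Columns: (epoch, training loss, validation loss, Rouge1).
ROWS = [
    (1.0, 1.2885, 1.5384, 0.5094),
    (4.0, 0.9561, 1.3611, 0.5449),
    (5.0, 0.8725, 1.3343, 0.5422),
    (6.0, 0.823, 1.3632, 0.5405),
    (16.0, 0.5717, 1.7001, 0.5471),
    (50.0, 0.4165, 2.4154, 0.5206),
    (100.0, 0.3891, 2.6385, 0.527),
]

EPOCH, TRAIN_LOSS, VAL_LOSS, ROUGE1 = range(4)


def best_by(rows, column, lowest=True):
    """Return the row with the lowest (or highest) value in `column`."""
    pick = min if lowest else max
    return pick(rows, key=lambda row: row[column])


best_loss = best_by(ROWS, VAL_LOSS, lowest=True)
best_rouge1 = best_by(ROWS, ROUGE1, lowest=False)
final = ROWS[-1]

print("lowest validation loss:", best_loss[VAL_LOSS], "at epoch", best_loss[EPOCH])
print("highest Rouge1:", best_rouge1[ROUGE1], "at epoch", best_rouge1[EPOCH])
print("final epoch:", final[EPOCH], "validation loss", final[VAL_LOSS])
```

In this excerpt the training loss keeps falling while the validation loss roughly doubles between epoch 5 and epoch 100, which is the usual signature of overfitting. Rouge1 moves far less over the same span, so which checkpoint is "best" depends on the metric chosen.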
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "decoder_start_token_id": 0,
+ "eos_token_id": 1,
+ "pad_token_id": 0,
+ "transformers_version": "4.33.1"
+ }
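
Note on generation_config.json: the three token IDs follow the usual T5 convention, where the padding token (ID 0) also serves as the decoder start token and ID 1 marks end of sequence. The following sketch, using only the standard library, parses the committed JSON and shows how those IDs would be stripped from a generated sequence. The helper `strip_special` and the sample IDs 100 and 200 are hypothetical and exist only for this illustration. In practice a tokenizer's decode step normally handles this.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG = """
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.33.1"
}
"""

config = json.loads(GENERATION_CONFIG)

# The decoder starts from the padding token, as is conventional for T5.
starts_with_pad = config["decoder_start_token_id"] == config["pad_token_id"]


def strip_special(token_ids, cfg):
    """Drop the leading start token, cut at the first EOS, remove padding."""
    ids = list(token_ids)
    if ids and ids[0] == cfg["decoder_start_token_id"]:
        ids = ids[1:]
    if cfg["eos_token_id"] in ids:
        ids = ids[: ids.index(cfg["eos_token_id"])]
    return [i for i in ids if i != cfg["pad_token_id"]]


# Hypothetical generated sequence: start token, two content tokens, EOS, padding.
cleaned = strip_special([0, 100, 200, 1, 0, 0], config)
print(starts_with_pad, cleaned)
```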
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4d89d4d1ff2a04ed02f41d04e99e50598a251ffdb8dc3ceca36ec39b87a19cc0
+ oid sha256:6cc577fee4bcf738756e931a8bf26017fb98fa27438c716c071dbd6270edf4f7
  size 242071641
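
Note on pytorch_model.bin: only the Git LFS pointer changes here. The size stays at 242071641 bytes while the SHA-256 object ID differs, which is consistent with the same architecture holding new weights. As a rough sketch, a downloaded file could be checked against the new pointer as follows. The function names `parse_pointer` and `matches_pointer` are invented for this example, and any local path passed to them is an assumption, not something the commit specifies.

```python
import hashlib
import os

# New pointer contents from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6cc577fee4bcf738756e931a8bf26017fb98fa27438c716c071dbd6270edf4f7
size 242071641
"""


def parse_pointer(text):
    """Parse a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer names."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a file of a few hundred MB is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Checking the size first is a cheap way to reject a truncated download before hashing the whole file.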