lyan62 committed on
Commit
79f082a
1 Parent(s): a91ee72

update model card README.md

Files changed (1)
  1. README.md +561 -0
README.md ADDED
@@ -0,0 +1,561 @@
1
+ ---
2
+ tags:
3
+ - generated_from_trainer
4
+ model-index:
5
+ - name: zh_wiki_small
6
+ results: []
7
+ ---
8
+
9
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
10
+ should probably proofread and complete it, then remove this comment. -->
11
+
12
+ # zh_wiki_small
13
+
14
+ This model is a fine-tuned version of an unspecified base model on an unknown dataset.
15
+ It achieves the following results on the evaluation set:
16
+ - Loss: 0.4177
17
+
18
+ ## Model description
19
+
20
+ More information needed
21
+
22
+ ## Intended uses & limitations
23
+
24
+ More information needed
25
+
26
+ ## Training and evaluation data
27
+
28
+ More information needed
29
+
30
+ ## Training procedure
31
+
32
+ ### Training hyperparameters
33
+
34
+ The following hyperparameters were used during training:
35
+ - learning_rate: 0.00015
36
+ - train_batch_size: 32
37
+ - eval_batch_size: 4
38
+ - seed: 42
39
+ - distributed_type: multi-GPU
40
+ - num_devices: 4
41
+ - gradient_accumulation_steps: 2
42
+ - total_train_batch_size: 256
43
+ - total_eval_batch_size: 16
44
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
45
+ - lr_scheduler_type: cosine
46
+ - lr_scheduler_warmup_ratio: 0.05
47
+ - training_steps: 500000
48
+ - mixed_precision_training: Native AMP
49
+
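The two `total_*` values and the warmup length in the list above can be derived from the per-device settings. The following is a minimal sketch, assuming the usual Trainer conventions: effective train batch size is per-device batch size times device count times accumulation steps, warmup steps are `ceil(ratio * training_steps)`, and the cosine scheduler uses linear warmup followed by a half-cosine decay to zero. The function names are illustrative and not part of any library.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.00015
TRAIN_BATCH_SIZE = 32   # per device
EVAL_BATCH_SIZE = 4     # per device
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.05
TRAINING_STEPS = 500_000


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    total_train = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
    total_eval = EVAL_BATCH_SIZE * NUM_DEVICES  # no gradient accumulation at eval time
    return total_train, total_eval


def warmup_steps():
    """Warmup length, assuming ceil(ratio * total steps)."""
    return math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at an optimizer step.

    Assumed shape: linear warmup from 0 to the peak rate, then a
    half-cosine decay from the peak rate down to 0 at the final step.
    """
    warmup = warmup_steps()
    if step < warmup:
        return LEARNING_RATE * step / warmup
    progress = (step - warmup) / (TRAINING_STEPS - warmup)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_sizes())  # (256, 16)
    print(warmup_steps())           # 25000
    print(lr_at(100_000))           # rate partway through the cosine decay
```

Under these assumptions the peak rate of 0.00015 is reached at step 25000, and the computed batch sizes match the 256 and 16 reported in the list.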
50
+ ### Training results
51
+
52
+ | Training Loss | Epoch | Step | Validation Loss |
53
+ |:-------------:|:-----:|:------:|:---------------:|
54
+ | 0.4793 | 0.14 | 1000 | 0.4540 |
55
+ | 0.4701 | 0.28 | 2000 | 0.4467 |
56
+ | 0.4673 | 0.42 | 3000 | 0.4304 |
57
+ | 0.4669 | 0.55 | 4000 | 0.4413 |
58
+ | 0.4668 | 0.69 | 5000 | 0.4368 |
59
+ | 0.4676 | 0.83 | 6000 | 0.4358 |
60
+ | 0.4691 | 0.97 | 7000 | 0.4367 |
61
+ | 0.4693 | 1.11 | 8000 | 0.4429 |
62
+ | 0.4709 | 1.25 | 9000 | 0.4388 |
63
+ | 0.4722 | 1.39 | 10000 | 0.4453 |
64
+ | 0.4729 | 1.53 | 11000 | 0.4415 |
65
+ | 0.4732 | 1.67 | 12000 | 0.4510 |
66
+ | 0.4751 | 1.8 | 13000 | 0.4461 |
67
+ | 0.4765 | 1.94 | 14000 | 0.4448 |
68
+ | 0.477 | 2.08 | 15000 | 0.4498 |
69
+ | 0.4779 | 2.22 | 16000 | 0.4447 |
70
+ | 0.4795 | 2.36 | 17000 | 0.4430 |
71
+ | 0.481 | 2.5 | 18000 | 0.4499 |
72
+ | 0.4821 | 2.64 | 19000 | 0.4551 |
73
+ | 0.4829 | 2.78 | 20000 | 0.4519 |
74
+ | 0.4838 | 2.91 | 21000 | 0.4520 |
75
+ | 0.4856 | 3.05 | 22000 | 0.4633 |
76
+ | 0.4857 | 3.19 | 23000 | 0.4576 |
77
+ | 0.4869 | 3.33 | 24000 | 0.4485 |
78
+ | 0.4882 | 3.47 | 25000 | 0.4591 |
79
+ | 0.4883 | 3.61 | 26000 | 0.4645 |
80
+ | 0.4889 | 3.75 | 27000 | 0.4570 |
81
+ | 0.4884 | 3.89 | 28000 | 0.4572 |
82
+ | 0.4897 | 4.02 | 29000 | 0.4553 |
83
+ | 0.4883 | 4.16 | 30000 | 0.4534 |
84
+ | 0.4881 | 4.3 | 31000 | 0.4587 |
85
+ | 0.4889 | 4.44 | 32000 | 0.4632 |
86
+ | 0.4886 | 4.58 | 33000 | 0.4587 |
87
+ | 0.4883 | 4.72 | 34000 | 0.4621 |
88
+ | 0.4876 | 4.86 | 35000 | 0.4522 |
89
+ | 0.4878 | 5.0 | 36000 | 0.4560 |
90
+ | 0.4883 | 5.13 | 37000 | 0.4579 |
91
+ | 0.4882 | 5.27 | 38000 | 0.4554 |
92
+ | 0.4883 | 5.41 | 39000 | 0.4588 |
93
+ | 0.4872 | 5.55 | 40000 | 0.4561 |
94
+ | 0.4868 | 5.69 | 41000 | 0.4614 |
95
+ | 0.4875 | 5.83 | 42000 | 0.4584 |
96
+ | 0.4868 | 5.97 | 43000 | 0.4619 |
97
+ | 0.4874 | 6.11 | 44000 | 0.4519 |
98
+ | 0.4874 | 6.24 | 45000 | 0.4625 |
99
+ | 0.487 | 6.38 | 46000 | 0.4579 |
100
+ | 0.4872 | 6.52 | 47000 | 0.4534 |
101
+ | 0.4872 | 6.66 | 48000 | 0.4516 |
102
+ | 0.4865 | 6.8 | 49000 | 0.4635 |
103
+ | 0.4865 | 6.94 | 50000 | 0.4610 |
104
+ | 0.4863 | 7.08 | 51000 | 0.4515 |
105
+ | 0.4861 | 7.22 | 52000 | 0.4584 |
106
+ | 0.4866 | 7.35 | 53000 | 0.4541 |
107
+ | 0.4862 | 7.49 | 54000 | 0.4508 |
108
+ | 0.4863 | 7.63 | 55000 | 0.4565 |
109
+ | 0.486 | 7.77 | 56000 | 0.4665 |
110
+ | 0.486 | 7.91 | 57000 | 0.4565 |
111
+ | 0.4861 | 8.05 | 58000 | 0.4580 |
112
+ | 0.4852 | 8.19 | 59000 | 0.4596 |
113
+ | 0.4846 | 8.33 | 60000 | 0.4527 |
114
+ | 0.4848 | 8.46 | 61000 | 0.4505 |
115
+ | 0.4849 | 8.6 | 62000 | 0.4407 |
116
+ | 0.4851 | 8.74 | 63000 | 0.4579 |
117
+ | 0.4848 | 8.88 | 64000 | 0.4559 |
118
+ | 0.4851 | 9.02 | 65000 | 0.4505 |
119
+ | 0.4846 | 9.16 | 66000 | 0.4615 |
120
+ | 0.4842 | 9.3 | 67000 | 0.4618 |
121
+ | 0.484 | 9.44 | 68000 | 0.4559 |
122
+ | 0.4841 | 9.57 | 69000 | 0.4613 |
123
+ | 0.484 | 9.71 | 70000 | 0.4527 |
124
+ | 0.4842 | 9.85 | 71000 | 0.4483 |
125
+ | 0.4842 | 9.99 | 72000 | 0.4585 |
126
+ | 0.4837 | 10.13 | 73000 | 0.4585 |
127
+ | 0.4833 | 10.27 | 74000 | 0.4541 |
128
+ | 0.4836 | 10.41 | 75000 | 0.4528 |
129
+ | 0.4832 | 10.55 | 76000 | 0.4475 |
130
+ | 0.4836 | 10.68 | 77000 | 0.4525 |
131
+ | 0.4826 | 10.82 | 78000 | 0.4562 |
132
+ | 0.4824 | 10.96 | 79000 | 0.4502 |
133
+ | 0.4828 | 11.1 | 80000 | 0.4529 |
134
+ | 0.4829 | 11.24 | 81000 | 0.4524 |
135
+ | 0.4823 | 11.38 | 82000 | 0.4506 |
136
+ | 0.4827 | 11.52 | 83000 | 0.4511 |
137
+ | 0.4823 | 11.66 | 84000 | 0.4506 |
138
+ | 0.4827 | 11.79 | 85000 | 0.4561 |
139
+ | 0.4832 | 11.93 | 86000 | 0.4471 |
140
+ | 0.482 | 12.07 | 87000 | 0.4479 |
141
+ | 0.4819 | 12.21 | 88000 | 0.4561 |
142
+ | 0.4816 | 12.35 | 89000 | 0.4590 |
143
+ | 0.4818 | 12.49 | 90000 | 0.4469 |
144
+ | 0.4815 | 12.63 | 91000 | 0.4633 |
145
+ | 0.4822 | 12.77 | 92000 | 0.4566 |
146
+ | 0.4816 | 12.9 | 93000 | 0.4548 |
147
+ | 0.4824 | 13.04 | 94000 | 0.4548 |
148
+ | 0.4812 | 13.18 | 95000 | 0.4533 |
149
+ | 0.4809 | 13.32 | 96000 | 0.4546 |
150
+ | 0.481 | 13.46 | 97000 | 0.4590 |
151
+ | 0.4807 | 13.6 | 98000 | 0.4465 |
152
+ | 0.4808 | 13.74 | 99000 | 0.4531 |
153
+ | 0.4806 | 13.88 | 100000 | 0.4459 |
154
+ | 0.4809 | 14.01 | 101000 | 0.4517 |
155
+ | 0.4801 | 14.15 | 102000 | 0.4519 |
156
+ | 0.4801 | 14.29 | 103000 | 0.4547 |
157
+ | 0.4805 | 14.43 | 104000 | 0.4517 |
158
+ | 0.4799 | 14.57 | 105000 | 0.4491 |
159
+ | 0.4805 | 14.71 | 106000 | 0.4559 |
160
+ | 0.48 | 14.85 | 107000 | 0.4551 |
161
+ | 0.4796 | 14.99 | 108000 | 0.4537 |
162
+ | 0.4801 | 15.12 | 109000 | 0.4509 |
163
+ | 0.4797 | 15.26 | 110000 | 0.4482 |
164
+ | 0.4798 | 15.4 | 111000 | 0.4466 |
165
+ | 0.4789 | 15.54 | 112000 | 0.4445 |
166
+ | 0.4808 | 15.68 | 113000 | 0.4493 |
167
+ | 0.4789 | 15.82 | 114000 | 0.4475 |
168
+ | 0.4792 | 15.96 | 115000 | 0.4543 |
169
+ | 0.4787 | 16.1 | 116000 | 0.4471 |
170
+ | 0.4796 | 16.23 | 117000 | 0.4565 |
171
+ | 0.4787 | 16.37 | 118000 | 0.4515 |
172
+ | 0.4788 | 16.51 | 119000 | 0.4449 |
173
+ | 0.4783 | 16.65 | 120000 | 0.4454 |
174
+ | 0.4787 | 16.79 | 121000 | 0.4486 |
175
+ | 0.4789 | 16.93 | 122000 | 0.4480 |
176
+ | 0.4782 | 17.07 | 123000 | 0.4529 |
177
+ | 0.4782 | 17.21 | 124000 | 0.4481 |
178
+ | 0.4777 | 17.34 | 125000 | 0.4528 |
179
+ | 0.4779 | 17.48 | 126000 | 0.4514 |
180
+ | 0.4781 | 17.62 | 127000 | 0.4520 |
181
+ | 0.4776 | 17.76 | 128000 | 0.4495 |
182
+ | 0.4777 | 17.9 | 129000 | 0.4501 |
183
+ | 0.4783 | 18.04 | 130000 | 0.4528 |
184
+ | 0.4771 | 18.18 | 131000 | 0.4498 |
185
+ | 0.4775 | 18.32 | 132000 | 0.4525 |
186
+ | 0.4772 | 18.45 | 133000 | 0.4482 |
187
+ | 0.4775 | 18.59 | 134000 | 0.4532 |
188
+ | 0.4769 | 18.73 | 135000 | 0.4537 |
189
+ | 0.4776 | 18.87 | 136000 | 0.4509 |
190
+ | 0.4775 | 19.01 | 137000 | 0.4464 |
191
+ | 0.4769 | 19.15 | 138000 | 0.4464 |
192
+ | 0.4772 | 19.29 | 139000 | 0.4499 |
193
+ | 0.4766 | 19.43 | 140000 | 0.4428 |
194
+ | 0.4764 | 19.56 | 141000 | 0.4536 |
195
+ | 0.477 | 19.7 | 142000 | 0.4444 |
196
+ | 0.4764 | 19.84 | 143000 | 0.4482 |
197
+ | 0.4764 | 19.98 | 144000 | 0.4510 |
198
+ | 0.4763 | 20.12 | 145000 | 0.4519 |
199
+ | 0.4761 | 20.26 | 146000 | 0.4452 |
200
+ | 0.4761 | 20.4 | 147000 | 0.4476 |
201
+ | 0.4756 | 20.54 | 148000 | 0.4494 |
202
+ | 0.4757 | 20.67 | 149000 | 0.4544 |
203
+ | 0.4762 | 20.81 | 150000 | 0.4412 |
204
+ | 0.4757 | 20.95 | 151000 | 0.4459 |
205
+ | 0.4749 | 21.09 | 152000 | 0.4532 |
206
+ | 0.4752 | 21.23 | 153000 | 0.4477 |
207
+ | 0.4749 | 21.37 | 154000 | 0.4396 |
208
+ | 0.4764 | 21.51 | 155000 | 0.4466 |
209
+ | 0.4753 | 21.65 | 156000 | 0.4523 |
210
+ | 0.4755 | 21.78 | 157000 | 0.4582 |
211
+ | 0.4749 | 21.92 | 158000 | 0.4539 |
212
+ | 0.475 | 22.06 | 159000 | 0.4539 |
213
+ | 0.4747 | 22.2 | 160000 | 0.4519 |
214
+ | 0.4745 | 22.34 | 161000 | 0.4370 |
215
+ | 0.4748 | 22.48 | 162000 | 0.4449 |
216
+ | 0.4743 | 22.62 | 163000 | 0.4484 |
217
+ | 0.4745 | 22.76 | 164000 | 0.4471 |
218
+ | 0.4739 | 22.89 | 165000 | 0.4480 |
219
+ | 0.4746 | 23.03 | 166000 | 0.4519 |
220
+ | 0.4739 | 23.17 | 167000 | 0.4478 |
221
+ | 0.4739 | 23.31 | 168000 | 0.4497 |
222
+ | 0.4738 | 23.45 | 169000 | 0.4462 |
223
+ | 0.474 | 23.59 | 170000 | 0.4430 |
224
+ | 0.4737 | 23.73 | 171000 | 0.4483 |
225
+ | 0.4737 | 23.87 | 172000 | 0.4508 |
226
+ | 0.474 | 24.0 | 173000 | 0.4439 |
227
+ | 0.4729 | 24.14 | 174000 | 0.4426 |
228
+ | 0.4735 | 24.28 | 175000 | 0.4433 |
229
+ | 0.4722 | 24.42 | 176000 | 0.4483 |
230
+ | 0.4728 | 24.56 | 177000 | 0.4496 |
231
+ | 0.4727 | 24.7 | 178000 | 0.4473 |
232
+ | 0.4729 | 24.84 | 179000 | 0.4404 |
233
+ | 0.4722 | 24.98 | 180000 | 0.4426 |
234
+ | 0.4724 | 25.11 | 181000 | 0.4479 |
235
+ | 0.4739 | 25.25 | 182000 | 0.4430 |
236
+ | 0.4723 | 25.39 | 183000 | 0.4418 |
237
+ | 0.4724 | 25.53 | 184000 | 0.4371 |
238
+ | 0.472 | 25.67 | 185000 | 0.4456 |
239
+ | 0.4726 | 25.81 | 186000 | 0.4419 |
240
+ | 0.4721 | 25.95 | 187000 | 0.4417 |
241
+ | 0.4722 | 26.09 | 188000 | 0.4475 |
242
+ | 0.4715 | 26.22 | 189000 | 0.4389 |
243
+ | 0.4717 | 26.36 | 190000 | 0.4451 |
244
+ | 0.4716 | 26.5 | 191000 | 0.4440 |
245
+ | 0.4714 | 26.64 | 192000 | 0.4399 |
246
+ | 0.4712 | 26.78 | 193000 | 0.4398 |
247
+ | 0.4709 | 26.92 | 194000 | 0.4424 |
248
+ | 0.4714 | 27.06 | 195000 | 0.4533 |
249
+ | 0.4706 | 27.2 | 196000 | 0.4394 |
250
+ | 0.471 | 27.33 | 197000 | 0.4436 |
251
+ | 0.4707 | 27.47 | 198000 | 0.4421 |
252
+ | 0.471 | 27.61 | 199000 | 0.4459 |
253
+ | 0.4707 | 27.75 | 200000 | 0.4439 |
254
+ | 0.471 | 27.89 | 201000 | 0.4467 |
255
+ | 0.471 | 28.03 | 202000 | 0.4439 |
256
+ | 0.4704 | 28.17 | 203000 | 0.4445 |
257
+ | 0.4705 | 28.31 | 204000 | 0.4429 |
258
+ | 0.4706 | 28.44 | 205000 | 0.4382 |
259
+ | 0.4703 | 28.58 | 206000 | 0.4425 |
260
+ | 0.4695 | 28.72 | 207000 | 0.4414 |
261
+ | 0.4696 | 28.86 | 208000 | 0.4405 |
262
+ | 0.4696 | 29.0 | 209000 | 0.4460 |
263
+ | 0.4701 | 29.14 | 210000 | 0.4460 |
264
+ | 0.4696 | 29.28 | 211000 | 0.4397 |
265
+ | 0.4693 | 29.42 | 212000 | 0.4439 |
266
+ | 0.4694 | 29.55 | 213000 | 0.4495 |
267
+ | 0.469 | 29.69 | 214000 | 0.4466 |
268
+ | 0.4691 | 29.83 | 215000 | 0.4336 |
269
+ | 0.4694 | 29.97 | 216000 | 0.4377 |
270
+ | 0.4698 | 30.11 | 217000 | 0.4356 |
271
+ | 0.4689 | 30.25 | 218000 | 0.4381 |
272
+ | 0.4685 | 30.39 | 219000 | 0.4431 |
273
+ | 0.4688 | 30.53 | 220000 | 0.4411 |
274
+ | 0.4687 | 30.66 | 221000 | 0.4445 |
275
+ | 0.4685 | 30.8 | 222000 | 0.4432 |
276
+ | 0.4687 | 30.94 | 223000 | 0.4383 |
277
+ | 0.4681 | 31.08 | 224000 | 0.4371 |
278
+ | 0.4683 | 31.22 | 225000 | 0.4384 |
279
+ | 0.4678 | 31.36 | 226000 | 0.4396 |
280
+ | 0.4682 | 31.5 | 227000 | 0.4387 |
281
+ | 0.4671 | 31.64 | 228000 | 0.4382 |
282
+ | 0.4676 | 31.77 | 229000 | 0.4410 |
283
+ | 0.4681 | 31.91 | 230000 | 0.4391 |
284
+ | 0.4676 | 32.05 | 231000 | 0.4429 |
285
+ | 0.4673 | 32.19 | 232000 | 0.4395 |
286
+ | 0.4669 | 32.33 | 233000 | 0.4389 |
287
+ | 0.4675 | 32.47 | 234000 | 0.4452 |
288
+ | 0.4667 | 32.61 | 235000 | 0.4395 |
289
+ | 0.4667 | 32.75 | 236000 | 0.4460 |
290
+ | 0.4672 | 32.88 | 237000 | 0.4404 |
291
+ | 0.4667 | 33.02 | 238000 | 0.4372 |
292
+ | 0.4663 | 33.16 | 239000 | 0.4362 |
293
+ | 0.4669 | 33.3 | 240000 | 0.4428 |
294
+ | 0.4662 | 33.44 | 241000 | 0.4370 |
295
+ | 0.4662 | 33.58 | 242000 | 0.4382 |
296
+ | 0.466 | 33.72 | 243000 | 0.4395 |
297
+ | 0.4661 | 33.86 | 244000 | 0.4418 |
298
+ | 0.4663 | 33.99 | 245000 | 0.4407 |
299
+ | 0.4661 | 34.13 | 246000 | 0.4346 |
300
+ | 0.4652 | 34.27 | 247000 | 0.4392 |
301
+ | 0.4662 | 34.41 | 248000 | 0.4396 |
302
+ | 0.4655 | 34.55 | 249000 | 0.4427 |
303
+ | 0.4657 | 34.69 | 250000 | 0.4484 |
304
+ | 0.4654 | 34.83 | 251000 | 0.4268 |
305
+ | 0.4655 | 34.97 | 252000 | 0.4384 |
306
+ | 0.4649 | 35.1 | 253000 | 0.4383 |
307
+ | 0.465 | 35.24 | 254000 | 0.4368 |
308
+ | 0.4648 | 35.38 | 255000 | 0.4327 |
309
+ | 0.4647 | 35.52 | 256000 | 0.4416 |
310
+ | 0.4652 | 35.66 | 257000 | 0.4390 |
311
+ | 0.4646 | 35.8 | 258000 | 0.4450 |
312
+ | 0.4651 | 35.94 | 259000 | 0.4354 |
313
+ | 0.4643 | 36.08 | 260000 | 0.4473 |
314
+ | 0.464 | 36.21 | 261000 | 0.4423 |
315
+ | 0.4638 | 36.35 | 262000 | 0.4339 |
316
+ | 0.464 | 36.49 | 263000 | 0.4438 |
317
+ | 0.464 | 36.63 | 264000 | 0.4398 |
318
+ | 0.4637 | 36.77 | 265000 | 0.4352 |
319
+ | 0.4641 | 36.91 | 266000 | 0.4352 |
320
+ | 0.4651 | 37.05 | 267000 | 0.4324 |
321
+ | 0.4637 | 37.19 | 268000 | 0.4341 |
322
+ | 0.4633 | 37.32 | 269000 | 0.4331 |
323
+ | 0.4639 | 37.46 | 270000 | 0.4391 |
324
+ | 0.463 | 37.6 | 271000 | 0.4380 |
325
+ | 0.4635 | 37.74 | 272000 | 0.4355 |
326
+ | 0.4631 | 37.88 | 273000 | 0.4397 |
327
+ | 0.464 | 38.02 | 274000 | 0.4336 |
328
+ | 0.4629 | 38.16 | 275000 | 0.4339 |
329
+ | 0.4634 | 38.3 | 276000 | 0.4355 |
330
+ | 0.4632 | 38.43 | 277000 | 0.4388 |
331
+ | 0.4628 | 38.57 | 278000 | 0.4341 |
332
+ | 0.4621 | 38.71 | 279000 | 0.4337 |
333
+ | 0.4626 | 38.85 | 280000 | 0.4340 |
334
+ | 0.462 | 38.99 | 281000 | 0.4306 |
335
+ | 0.8286 | 39.13 | 282000 | 0.4504 |
336
+ | 0.4624 | 39.27 | 283000 | 0.4399 |
337
+ | 0.4621 | 39.41 | 284000 | 0.4351 |
338
+ | 0.4622 | 39.54 | 285000 | 0.4304 |
339
+ | 0.4619 | 39.68 | 286000 | 0.4329 |
340
+ | 0.4618 | 39.82 | 287000 | 0.4208 |
341
+ | 0.462 | 39.96 | 288000 | 0.4414 |
342
+ | 0.4615 | 40.1 | 289000 | 0.4353 |
343
+ | 0.4614 | 40.24 | 290000 | 0.4398 |
344
+ | 0.4611 | 40.38 | 291000 | 0.4371 |
345
+ | 0.4608 | 40.52 | 292000 | 0.4326 |
346
+ | 0.4611 | 40.65 | 293000 | 0.4332 |
347
+ | 0.4614 | 40.79 | 294000 | 0.4343 |
348
+ | 0.4609 | 40.93 | 295000 | 0.4306 |
349
+ | 0.4608 | 41.07 | 296000 | 0.4323 |
350
+ | 0.4608 | 41.21 | 297000 | 0.4321 |
351
+ | 0.4601 | 41.35 | 298000 | 0.4330 |
352
+ | 0.4606 | 41.49 | 299000 | 0.4361 |
353
+ | 0.4606 | 41.63 | 300000 | 0.4367 |
354
+ | 0.46 | 41.76 | 301000 | 0.4327 |
355
+ | 0.4596 | 41.9 | 302000 | 0.4306 |
356
+ | 0.46 | 42.04 | 303000 | 0.4352 |
357
+ | 0.46 | 42.18 | 304000 | 0.4338 |
358
+ | 0.4597 | 42.32 | 305000 | 0.4333 |
359
+ | 0.4596 | 42.46 | 306000 | 0.4334 |
360
+ | 0.4591 | 42.6 | 307000 | 0.4334 |
361
+ | 0.4597 | 42.74 | 308000 | 0.4319 |
362
+ | 0.4586 | 42.87 | 309000 | 0.4268 |
363
+ | 0.4593 | 43.01 | 310000 | 0.4366 |
364
+ | 0.4591 | 43.15 | 311000 | 0.4283 |
365
+ | 0.4587 | 43.29 | 312000 | 0.4289 |
366
+ | 0.4594 | 43.43 | 313000 | 0.4332 |
367
+ | 0.459 | 43.57 | 314000 | 0.4326 |
368
+ | 0.4586 | 43.71 | 315000 | 0.4356 |
369
+ | 0.4581 | 43.85 | 316000 | 0.4271 |
370
+ | 0.4584 | 43.98 | 317000 | 0.4325 |
371
+ | 0.4586 | 44.12 | 318000 | 0.4350 |
372
+ | 0.4584 | 44.26 | 319000 | 0.4273 |
373
+ | 0.4576 | 44.4 | 320000 | 0.4284 |
374
+ | 0.458 | 44.54 | 321000 | 0.4331 |
375
+ | 0.4581 | 44.68 | 322000 | 0.4263 |
376
+ | 0.4579 | 44.82 | 323000 | 0.4283 |
377
+ | 0.4583 | 44.96 | 324000 | 0.4362 |
378
+ | 0.4571 | 45.1 | 325000 | 0.4330 |
379
+ | 0.4566 | 45.23 | 326000 | 0.4300 |
380
+ | 0.4572 | 45.37 | 327000 | 0.4258 |
381
+ | 0.4574 | 45.51 | 328000 | 0.4200 |
382
+ | 0.4573 | 45.65 | 329000 | 0.4299 |
383
+ | 0.4578 | 45.79 | 330000 | 0.4319 |
384
+ | 0.4576 | 45.93 | 331000 | 0.4352 |
385
+ | 0.4574 | 46.07 | 332000 | 0.4278 |
386
+ | 0.4572 | 46.21 | 333000 | 0.4326 |
387
+ | 0.4568 | 46.34 | 334000 | 0.4295 |
388
+ | 0.4569 | 46.48 | 335000 | 0.4300 |
389
+ | 0.4566 | 46.62 | 336000 | 0.4333 |
390
+ | 0.4567 | 46.76 | 337000 | 0.4262 |
391
+ | 0.4564 | 46.9 | 338000 | 0.4354 |
392
+ | 0.4574 | 47.04 | 339000 | 0.4357 |
393
+ | 0.4564 | 47.18 | 340000 | 0.4308 |
394
+ | 0.4554 | 47.32 | 341000 | 0.4350 |
395
+ | 0.456 | 47.45 | 342000 | 0.4400 |
396
+ | 0.456 | 47.59 | 343000 | 0.4237 |
397
+ | 0.4559 | 47.73 | 344000 | 0.4236 |
398
+ | 0.4559 | 47.87 | 345000 | 0.4305 |
399
+ | 0.4559 | 48.01 | 346000 | 0.4245 |
400
+ | 0.4549 | 48.15 | 347000 | 0.4182 |
401
+ | 0.4556 | 48.29 | 348000 | 0.4330 |
402
+ | 0.4551 | 48.43 | 349000 | 0.4397 |
403
+ | 0.455 | 48.56 | 350000 | 0.4252 |
404
+ | 0.4548 | 48.7 | 351000 | 0.4246 |
405
+ | 0.4551 | 48.84 | 352000 | 0.4291 |
406
+ | 0.4554 | 48.98 | 353000 | 0.4286 |
407
+ | 0.4547 | 49.12 | 354000 | 0.4336 |
408
+ | 0.4548 | 49.26 | 355000 | 0.4324 |
409
+ | 0.4545 | 49.4 | 356000 | 0.4236 |
410
+ | 0.4547 | 49.54 | 357000 | 0.4345 |
411
+ | 0.4542 | 49.67 | 358000 | 0.4329 |
412
+ | 0.4545 | 49.81 | 359000 | 0.4241 |
413
+ | 0.4541 | 49.95 | 360000 | 0.4177 |
414
+ | 0.454 | 50.09 | 361000 | 0.4244 |
415
+ | 0.4538 | 50.23 | 362000 | 0.4190 |
416
+ | 0.4535 | 50.37 | 363000 | 0.4331 |
417
+ | 0.4545 | 50.51 | 364000 | 0.4252 |
418
+ | 0.454 | 50.65 | 365000 | 0.4315 |
419
+ | 0.4536 | 50.78 | 366000 | 0.4301 |
420
+ | 0.4534 | 50.92 | 367000 | 0.4357 |
421
+ | 0.4537 | 51.06 | 368000 | 0.4334 |
422
+ | 0.4535 | 51.2 | 369000 | 0.4200 |
423
+ | 0.4538 | 51.34 | 370000 | 0.4274 |
424
+ | 0.4536 | 51.48 | 371000 | 0.4178 |
425
+ | 0.4534 | 51.62 | 372000 | 0.4181 |
426
+ | 0.4533 | 51.76 | 373000 | 0.4211 |
427
+ | 0.4535 | 51.89 | 374000 | 0.4290 |
428
+ | 0.4535 | 52.03 | 375000 | 0.4201 |
429
+ | 0.4526 | 52.17 | 376000 | 0.4263 |
430
+ | 0.4526 | 52.31 | 377000 | 0.4237 |
431
+ | 0.4524 | 52.45 | 378000 | 0.4254 |
432
+ | 0.4529 | 52.59 | 379000 | 0.4260 |
433
+ | 0.4531 | 52.73 | 380000 | 0.4202 |
434
+ | 0.4523 | 52.87 | 381000 | 0.4223 |
435
+ | 0.4523 | 53.0 | 382000 | 0.4271 |
436
+ | 0.4522 | 53.14 | 383000 | 0.4286 |
437
+ | 0.4524 | 53.28 | 384000 | 0.4256 |
438
+ | 0.4515 | 53.42 | 385000 | 0.4221 |
439
+ | 0.4513 | 53.56 | 386000 | 0.4255 |
440
+ | 0.452 | 53.7 | 387000 | 0.4270 |
441
+ | 0.4519 | 53.84 | 388000 | 0.4222 |
442
+ | 0.4518 | 53.98 | 389000 | 0.4233 |
443
+ | 0.4513 | 54.11 | 390000 | 0.4233 |
444
+ | 0.4517 | 54.25 | 391000 | 0.4239 |
445
+ | 0.4518 | 54.39 | 392000 | 0.4273 |
446
+ | 0.4508 | 54.53 | 393000 | 0.4200 |
447
+ | 0.4511 | 54.67 | 394000 | 0.4236 |
448
+ | 0.4508 | 54.81 | 395000 | 0.4193 |
449
+ | 0.4507 | 54.95 | 396000 | 0.4293 |
450
+ | 0.4508 | 55.09 | 397000 | 0.4187 |
451
+ | 0.4504 | 55.22 | 398000 | 0.4283 |
452
+ | 0.4512 | 55.36 | 399000 | 0.4239 |
453
+ | 0.4504 | 55.5 | 400000 | 0.4269 |
454
+ | 0.4506 | 55.64 | 401000 | 0.4291 |
455
+ | 0.4504 | 55.78 | 402000 | 0.4238 |
456
+ | 0.4503 | 55.92 | 403000 | 0.4200 |
457
+ | 0.4506 | 56.06 | 404000 | 0.4186 |
458
+ | 0.4507 | 56.2 | 405000 | 0.4260 |
459
+ | 0.4504 | 56.33 | 406000 | 0.4188 |
460
+ | 0.4503 | 56.47 | 407000 | 0.4231 |
461
+ | 0.4498 | 56.61 | 408000 | 0.4148 |
462
+ | 0.4499 | 56.75 | 409000 | 0.4182 |
463
+ | 0.4498 | 56.89 | 410000 | 0.4229 |
464
+ | 0.4501 | 57.03 | 411000 | 0.4252 |
465
+ | 0.4497 | 57.17 | 412000 | 0.4220 |
466
+ | 0.45 | 57.31 | 413000 | 0.4181 |
467
+ | 0.4497 | 57.44 | 414000 | 0.4270 |
468
+ | 0.4497 | 57.58 | 415000 | 0.4208 |
469
+ | 0.4499 | 57.72 | 416000 | 0.4224 |
470
+ | 0.4496 | 57.86 | 417000 | 0.4207 |
471
+ | 0.4494 | 58.0 | 418000 | 0.4268 |
472
+ | 0.4499 | 58.14 | 419000 | 0.4240 |
473
+ | 0.4495 | 58.28 | 420000 | 0.4294 |
474
+ | 0.4487 | 58.42 | 421000 | 0.4207 |
475
+ | 0.4495 | 58.55 | 422000 | 0.4246 |
476
+ | 0.4491 | 58.69 | 423000 | 0.4213 |
477
+ | 0.4492 | 58.83 | 424000 | 0.4241 |
478
+ | 0.4486 | 58.97 | 425000 | 0.4247 |
479
+ | 0.4485 | 59.11 | 426000 | 0.4163 |
480
+ | 0.4489 | 59.25 | 427000 | 0.4239 |
481
+ | 0.4483 | 59.39 | 428000 | 0.4240 |
482
+ | 0.4491 | 59.53 | 429000 | 0.4214 |
483
+ | 0.4485 | 59.66 | 430000 | 0.4285 |
484
+ | 0.449 | 59.8 | 431000 | 0.4265 |
485
+ | 0.4484 | 59.94 | 432000 | 0.4188 |
486
+ | 0.4484 | 60.08 | 433000 | 0.4176 |
487
+ | 0.4488 | 60.22 | 434000 | 0.4200 |
488
+ | 0.448 | 60.36 | 435000 | 0.4116 |
489
+ | 0.4477 | 60.5 | 436000 | 0.4215 |
490
+ | 0.4484 | 60.64 | 437000 | 0.4204 |
491
+ | 0.448 | 60.77 | 438000 | 0.4093 |
492
+ | 0.4479 | 60.91 | 439000 | 0.4181 |
493
+ | 0.4481 | 61.05 | 440000 | 0.4232 |
494
+ | 0.4477 | 61.19 | 441000 | 0.4202 |
495
+ | 0.4478 | 61.33 | 442000 | 0.4167 |
496
+ | 0.4481 | 61.47 | 443000 | 0.4173 |
497
+ | 0.4483 | 61.61 | 444000 | 0.4158 |
498
+ | 0.4473 | 61.75 | 445000 | 0.4174 |
499
+ | 0.4474 | 61.88 | 446000 | 0.4266 |
500
+ | 0.4477 | 62.02 | 447000 | 0.4242 |
501
+ | 0.4476 | 62.16 | 448000 | 0.4240 |
502
+ | 0.4478 | 62.3 | 449000 | 0.4286 |
503
+ | 0.4474 | 62.44 | 450000 | 0.4294 |
504
+ | 0.4482 | 62.58 | 451000 | 0.4144 |
505
+ | 0.4471 | 62.72 | 452000 | 0.4316 |
506
+ | 0.448 | 62.86 | 453000 | 0.4228 |
507
+ | 0.4474 | 62.99 | 454000 | 0.4242 |
508
+ | 0.447 | 63.13 | 455000 | 0.4231 |
509
+ | 0.4475 | 63.27 | 456000 | 0.4235 |
510
+ | 0.4475 | 63.41 | 457000 | 0.4279 |
511
+ | 0.4476 | 63.55 | 458000 | 0.4230 |
512
+ | 0.4464 | 63.69 | 459000 | 0.4145 |
513
+ | 0.4467 | 63.83 | 460000 | 0.4230 |
514
+ | 0.4465 | 63.97 | 461000 | 0.4208 |
515
+ | 0.4466 | 64.1 | 462000 | 0.4243 |
516
+ | 0.447 | 64.24 | 463000 | 0.4220 |
517
+ | 0.4473 | 64.38 | 464000 | 0.4253 |
518
+ | 0.4471 | 64.52 | 465000 | 0.4194 |
519
+ | 0.447 | 64.66 | 466000 | 0.4262 |
520
+ | 0.447 | 64.8 | 467000 | 0.4245 |
521
+ | 0.4468 | 64.94 | 468000 | 0.4143 |
522
+ | 0.4463 | 65.08 | 469000 | 0.4187 |
523
+ | 0.4465 | 65.21 | 470000 | 0.4185 |
524
+ | 0.4465 | 65.35 | 471000 | 0.4244 |
525
+ | 0.4467 | 65.49 | 472000 | 0.4201 |
526
+ | 0.4465 | 65.63 | 473000 | 0.4160 |
527
+ | 0.4467 | 65.77 | 474000 | 0.4273 |
528
+ | 0.4465 | 65.91 | 475000 | 0.4183 |
529
+ | 0.4467 | 66.05 | 476000 | 0.4227 |
530
+ | 0.4469 | 66.19 | 477000 | 0.4166 |
531
+ | 0.4467 | 66.32 | 478000 | 0.4199 |
532
+ | 0.4464 | 66.46 | 479000 | 0.4181 |
533
+ | 0.4463 | 66.6 | 480000 | 0.4217 |
534
+ | 0.4464 | 66.74 | 481000 | 0.4158 |
535
+ | 0.4468 | 66.88 | 482000 | 0.4191 |
536
+ | 0.447 | 67.02 | 483000 | 0.4248 |
537
+ | 0.4465 | 67.16 | 484000 | 0.4234 |
538
+ | 0.4463 | 67.3 | 485000 | 0.4238 |
539
+ | 0.446 | 67.43 | 486000 | 0.4162 |
540
+ | 0.4462 | 67.57 | 487000 | 0.4202 |
541
+ | 0.4462 | 67.71 | 488000 | 0.4177 |
542
+ | 0.4455 | 67.85 | 489000 | 0.4228 |
543
+ | 0.4463 | 67.99 | 490000 | 0.4146 |
544
+ | 0.4454 | 68.13 | 491000 | 0.4190 |
545
+ | 0.446 | 68.27 | 492000 | 0.4219 |
546
+ | 0.4461 | 68.41 | 493000 | 0.4250 |
547
+ | 0.4462 | 68.54 | 494000 | 0.4172 |
548
+ | 0.4464 | 68.68 | 495000 | 0.4122 |
549
+ | 0.4459 | 68.82 | 496000 | 0.4178 |
550
+ | 0.4459 | 68.96 | 497000 | 0.4095 |
551
+ | 0.4458 | 69.1 | 498000 | 0.4124 |
552
+ | 0.4458 | 69.24 | 499000 | 0.4182 |
553
+ | 0.4458 | 69.38 | 500000 | 0.4177 |
554
+
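A log of this length is easier to inspect programmatically. The sketch below parses rows of the table above, picks the row with the lowest validation loss, and estimates the number of optimizer steps per epoch from the last row. It is illustrative only: the helper names are hypothetical, only a five-row excerpt is embedded, and the steps-per-epoch figure is rough because the Epoch column is rounded to two decimals.

```python
# A few rows copied verbatim from the table above.
# Paste the full table here to analyse the whole run.
TABLE_EXCERPT = """
| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 0.4793        | 0.14  | 1000   | 0.4540          |
| 0.4484        | 60.64 | 437000 | 0.4204          |
| 0.448         | 60.77 | 438000 | 0.4093          |
| 0.4479        | 60.91 | 439000 | 0.4181          |
| 0.4458        | 69.38 | 500000 | 0.4177          |
"""


def parse_rows(markdown):
    """Turn markdown table rows into dicts.

    Header, separator, and blank lines are skipped. A leading diff
    marker ("+") is tolerated so rows can be pasted from a diff view.
    """
    rows = []
    for line in markdown.splitlines():
        cleaned = line.strip().lstrip("+").strip().strip("|")
        cells = [cell.strip() for cell in cleaned.split("|")]
        if len(cells) != 4:
            continue
        try:
            row = {
                "train_loss": float(cells[0]),
                "epoch": float(cells[1]),
                "step": int(cells[2]),
                "val_loss": float(cells[3]),
            }
        except ValueError:
            continue  # header or separator row
        rows.append(row)
    return rows


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    last = max(rows, key=lambda row: row["step"])
    return last["step"] / last["epoch"]


if __name__ == "__main__":
    rows = parse_rows(TABLE_EXCERPT)
    print(best_checkpoint(rows))
    print(round(steps_per_epoch(rows)))
```

On this excerpt the lowest validation loss is at step 438000, not at the final step. That kind of gap is a reason to compare intermediate checkpoints, if any were saved, rather than assume the last one is best.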
555
+
556
+ ### Framework versions
557
+
558
+ - Transformers 4.17.0
559
+ - PyTorch 1.12.0
560
+ - Datasets 2.0.0
561
+ - Tokenizers 0.13.2
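
To reproduce the run, it may help to check a local environment against the versions listed above. The sketch below is one way to do that with the standard library. The mapping from the listed names to PyPI distribution names (for example `torch` for PyTorch) is an assumption, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed above, keyed by the assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.17.0",
    "torch": "1.12.0",
    "datasets": "2.0.0",
    "tokenizers": "0.13.2",
}


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (installed_version_or_None, expected_version)} for each differing package."""
    return {
        name: (installed.get(name), want)
        for name, want in expected.items()
        if installed.get(name) != want
    }


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    mismatches = find_mismatches(installed_versions(EXPECTED))
    for name, (have, want) in mismatches.items():
        print(f"{name}: installed {have}, model card lists {want}")
```

The comparison is an exact string match, so a build with a local suffix such as `1.12.0+cu113` is reported as a mismatch even though it is the same release.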