aryaadhi committed on
Commit
48fb8a8
1 Parent(s): ca013d8

End of training

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.2002
 
  ## Model description
 
@@ -47,16 +47,487 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 2.447 | 0.4202 | 100 | 2.4592 |
- | 2.3965 | 0.8403 | 200 | 2.3260 |
- | 2.2171 | 1.2605 | 300 | 2.2326 |
- | 2.1162 | 1.6807 | 400 | 2.2002 |
 
 
  ### Framework versions
 
  - PEFT 0.11.1
- - Transformers 4.41.2
- - Pytorch 2.3.0+cu121
- - Datasets 2.20.0
  - Tokenizers 0.19.1
 
  This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 1.7469
 
  ## Model description
 
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
+ | No log | 0.0042 | 10 | 3.8019 |
+ | No log | 0.0084 | 20 | 3.4249 |
+ | No log | 0.0126 | 30 | 3.2262 |
+ | No log | 0.0168 | 40 | 3.1067 |
+ | No log | 0.0211 | 50 | 3.0400 |
+ | No log | 0.0253 | 60 | 2.9943 |
+ | No log | 0.0295 | 70 | 2.9605 |
+ | No log | 0.0337 | 80 | 2.9280 |
+ | No log | 0.0379 | 90 | 2.8981 |
+ | 3.1867 | 0.0421 | 100 | 2.8728 |
+ | 3.1867 | 0.0463 | 110 | 2.8370 |
+ | 3.1867 | 0.0505 | 120 | 2.8006 |
+ | 3.1867 | 0.0547 | 130 | 2.8022 |
+ | 3.1867 | 0.0589 | 140 | 2.7554 |
+ | 3.1867 | 0.0632 | 150 | 2.7308 |
+ | 3.1867 | 0.0674 | 160 | 2.7201 |
+ | 3.1867 | 0.0716 | 170 | 2.6914 |
+ | 3.1867 | 0.0758 | 180 | 2.6740 |
+ | 3.1867 | 0.08 | 190 | 2.6529 |
+ | 2.7281 | 0.0842 | 200 | 2.6375 |
+ | 2.7281 | 0.0884 | 210 | 2.6309 |
+ | 2.7281 | 0.0926 | 220 | 2.6019 |
+ | 2.7281 | 0.0968 | 230 | 2.6017 |
+ | 2.7281 | 0.1011 | 240 | 2.5798 |
+ | 2.7281 | 0.1053 | 250 | 2.5643 |
+ | 2.7281 | 0.1095 | 260 | 2.5479 |
+ | 2.7281 | 0.1137 | 270 | 2.5434 |
+ | 2.7281 | 0.1179 | 280 | 2.5248 |
+ | 2.7281 | 0.1221 | 290 | 2.5156 |
+ | 2.5594 | 0.1263 | 300 | 2.4896 |
+ | 2.5594 | 0.1305 | 310 | 2.5017 |
+ | 2.5594 | 0.1347 | 320 | 2.4770 |
+ | 2.5594 | 0.1389 | 330 | 2.4505 |
+ | 2.5594 | 0.1432 | 340 | 2.4653 |
+ | 2.5594 | 0.1474 | 350 | 2.4346 |
+ | 2.5594 | 0.1516 | 360 | 2.4276 |
+ | 2.5594 | 0.1558 | 370 | 2.4290 |
+ | 2.5594 | 0.16 | 380 | 2.4138 |
+ | 2.5594 | 0.1642 | 390 | 2.4161 |
+ | 2.479 | 0.1684 | 400 | 2.4047 |
+ | 2.479 | 0.1726 | 410 | 2.3960 |
+ | 2.479 | 0.1768 | 420 | 2.3859 |
+ | 2.479 | 0.1811 | 430 | 2.3767 |
+ | 2.479 | 0.1853 | 440 | 2.3586 |
+ | 2.479 | 0.1895 | 450 | 2.3607 |
+ | 2.479 | 0.1937 | 460 | 2.3640 |
+ | 2.479 | 0.1979 | 470 | 2.3740 |
+ | 2.479 | 0.2021 | 480 | 2.3381 |
+ | 2.479 | 0.2063 | 490 | 2.3369 |
+ | 2.3791 | 0.2105 | 500 | 2.3474 |
+ | 2.3791 | 0.2147 | 510 | 2.3366 |
+ | 2.3791 | 0.2189 | 520 | 2.3072 |
+ | 2.3791 | 0.2232 | 530 | 2.3107 |
+ | 2.3791 | 0.2274 | 540 | 2.2946 |
+ | 2.3791 | 0.2316 | 550 | 2.2940 |
+ | 2.3791 | 0.2358 | 560 | 2.2997 |
+ | 2.3791 | 0.24 | 570 | 2.2951 |
+ | 2.3791 | 0.2442 | 580 | 2.2775 |
+ | 2.3791 | 0.2484 | 590 | 2.2682 |
+ | 2.3271 | 0.2526 | 600 | 2.2648 |
+ | 2.3271 | 0.2568 | 610 | 2.2546 |
+ | 2.3271 | 0.2611 | 620 | 2.2553 |
+ | 2.3271 | 0.2653 | 630 | 2.2721 |
+ | 2.3271 | 0.2695 | 640 | 2.2441 |
+ | 2.3271 | 0.2737 | 650 | 2.2442 |
+ | 2.3271 | 0.2779 | 660 | 2.2481 |
+ | 2.3271 | 0.2821 | 670 | 2.2399 |
+ | 2.3271 | 0.2863 | 680 | 2.2300 |
+ | 2.3271 | 0.2905 | 690 | 2.2306 |
+ | 2.2042 | 0.2947 | 700 | 2.2152 |
+ | 2.2042 | 0.2989 | 710 | 2.2136 |
+ | 2.2042 | 0.3032 | 720 | 2.2069 |
+ | 2.2042 | 0.3074 | 730 | 2.2075 |
+ | 2.2042 | 0.3116 | 740 | 2.1964 |
+ | 2.2042 | 0.3158 | 750 | 2.1996 |
+ | 2.2042 | 0.32 | 760 | 2.1993 |
+ | 2.2042 | 0.3242 | 770 | 2.1808 |
+ | 2.2042 | 0.3284 | 780 | 2.1800 |
+ | 2.2042 | 0.3326 | 790 | 2.1723 |
+ | 2.1913 | 0.3368 | 800 | 2.1780 |
+ | 2.1913 | 0.3411 | 810 | 2.1986 |
+ | 2.1913 | 0.3453 | 820 | 2.1846 |
+ | 2.1913 | 0.3495 | 830 | 2.1629 |
+ | 2.1913 | 0.3537 | 840 | 2.1632 |
+ | 2.1913 | 0.3579 | 850 | 2.1667 |
+ | 2.1913 | 0.3621 | 860 | 2.1515 |
+ | 2.1913 | 0.3663 | 870 | 2.1521 |
+ | 2.1913 | 0.3705 | 880 | 2.1530 |
+ | 2.1913 | 0.3747 | 890 | 2.1422 |
+ | 2.1206 | 0.3789 | 900 | 2.1381 |
+ | 2.1206 | 0.3832 | 910 | 2.1520 |
+ | 2.1206 | 0.3874 | 920 | 2.1386 |
+ | 2.1206 | 0.3916 | 930 | 2.1290 |
+ | 2.1206 | 0.3958 | 940 | 2.1335 |
+ | 2.1206 | 0.4 | 950 | 2.1263 |
+ | 2.1206 | 0.4042 | 960 | 2.1153 |
+ | 2.1206 | 0.4084 | 970 | 2.1146 |
+ | 2.1206 | 0.4126 | 980 | 2.1127 |
+ | 2.1206 | 0.4168 | 990 | 2.1036 |
+ | 2.1323 | 0.4211 | 1000 | 2.1044 |
+ | 2.1323 | 0.4253 | 1010 | 2.1090 |
+ | 2.1323 | 0.4295 | 1020 | 2.1018 |
+ | 2.1323 | 0.4337 | 1030 | 2.0884 |
+ | 2.1323 | 0.4379 | 1040 | 2.1015 |
+ | 2.1323 | 0.4421 | 1050 | 2.0900 |
+ | 2.1323 | 0.4463 | 1060 | 2.0938 |
+ | 2.1323 | 0.4505 | 1070 | 2.0948 |
+ | 2.1323 | 0.4547 | 1080 | 2.0897 |
+ | 2.1323 | 0.4589 | 1090 | 2.0988 |
+ | 2.0922 | 0.4632 | 1100 | 2.0851 |
+ | 2.0922 | 0.4674 | 1110 | 2.0895 |
+ | 2.0922 | 0.4716 | 1120 | 2.0726 |
+ | 2.0922 | 0.4758 | 1130 | 2.0783 |
+ | 2.0922 | 0.48 | 1140 | 2.0718 |
+ | 2.0922 | 0.4842 | 1150 | 2.0685 |
+ | 2.0922 | 0.4884 | 1160 | 2.0839 |
+ | 2.0922 | 0.4926 | 1170 | 2.0627 |
+ | 2.0922 | 0.4968 | 1180 | 2.0651 |
+ | 2.0922 | 0.5011 | 1190 | 2.0706 |
+ | 2.0856 | 0.5053 | 1200 | 2.0570 |
+ | 2.0856 | 0.5095 | 1210 | 2.0602 |
+ | 2.0856 | 0.5137 | 1220 | 2.0573 |
+ | 2.0856 | 0.5179 | 1230 | 2.0576 |
+ | 2.0856 | 0.5221 | 1240 | 2.0464 |
+ | 2.0856 | 0.5263 | 1250 | 2.0399 |
+ | 2.0856 | 0.5305 | 1260 | 2.0511 |
+ | 2.0856 | 0.5347 | 1270 | 2.0358 |
+ | 2.0856 | 0.5389 | 1280 | 2.0363 |
+ | 2.0856 | 0.5432 | 1290 | 2.0448 |
+ | 2.0382 | 0.5474 | 1300 | 2.0321 |
+ | 2.0382 | 0.5516 | 1310 | 2.0312 |
+ | 2.0382 | 0.5558 | 1320 | 2.0251 |
+ | 2.0382 | 0.56 | 1330 | 2.0121 |
+ | 2.0382 | 0.5642 | 1340 | 2.0221 |
+ | 2.0382 | 0.5684 | 1350 | 2.0304 |
+ | 2.0382 | 0.5726 | 1360 | 2.0245 |
+ | 2.0382 | 0.5768 | 1370 | 2.0111 |
+ | 2.0382 | 0.5811 | 1380 | 2.0110 |
+ | 2.0382 | 0.5853 | 1390 | 2.0088 |
+ | 2.02 | 0.5895 | 1400 | 2.0078 |
+ | 2.02 | 0.5937 | 1410 | 2.0117 |
+ | 2.02 | 0.5979 | 1420 | 2.0071 |
+ | 2.02 | 0.6021 | 1430 | 2.0056 |
+ | 2.02 | 0.6063 | 1440 | 1.9949 |
+ | 2.02 | 0.6105 | 1450 | 1.9994 |
+ | 2.02 | 0.6147 | 1460 | 2.0030 |
+ | 2.02 | 0.6189 | 1470 | 1.9959 |
+ | 2.02 | 0.6232 | 1480 | 1.9896 |
+ | 2.02 | 0.6274 | 1490 | 1.9942 |
+ | 1.9807 | 0.6316 | 1500 | 1.9874 |
+ | 1.9807 | 0.6358 | 1510 | 1.9839 |
+ | 1.9807 | 0.64 | 1520 | 1.9780 |
+ | 1.9807 | 0.6442 | 1530 | 1.9741 |
+ | 1.9807 | 0.6484 | 1540 | 1.9846 |
+ | 1.9807 | 0.6526 | 1550 | 1.9765 |
+ | 1.9807 | 0.6568 | 1560 | 1.9707 |
+ | 1.9807 | 0.6611 | 1570 | 1.9685 |
+ | 1.9807 | 0.6653 | 1580 | 1.9633 |
+ | 1.9807 | 0.6695 | 1590 | 1.9818 |
+ | 1.9506 | 0.6737 | 1600 | 1.9634 |
+ | 1.9506 | 0.6779 | 1610 | 1.9707 |
+ | 1.9506 | 0.6821 | 1620 | 1.9697 |
+ | 1.9506 | 0.6863 | 1630 | 1.9601 |
+ | 1.9506 | 0.6905 | 1640 | 1.9645 |
+ | 1.9506 | 0.6947 | 1650 | 1.9559 |
+ | 1.9506 | 0.6989 | 1660 | 1.9598 |
+ | 1.9506 | 0.7032 | 1670 | 1.9658 |
+ | 1.9506 | 0.7074 | 1680 | 1.9567 |
+ | 1.9506 | 0.7116 | 1690 | 1.9612 |
+ | 1.9529 | 0.7158 | 1700 | 1.9584 |
+ | 1.9529 | 0.72 | 1710 | 1.9585 |
+ | 1.9529 | 0.7242 | 1720 | 1.9515 |
+ | 1.9529 | 0.7284 | 1730 | 1.9405 |
+ | 1.9529 | 0.7326 | 1740 | 1.9512 |
+ | 1.9529 | 0.7368 | 1750 | 1.9485 |
+ | 1.9529 | 0.7411 | 1760 | 1.9464 |
+ | 1.9529 | 0.7453 | 1770 | 1.9448 |
+ | 1.9529 | 0.7495 | 1780 | 1.9398 |
+ | 1.9529 | 0.7537 | 1790 | 1.9426 |
+ | 1.9258 | 0.7579 | 1800 | 1.9392 |
+ | 1.9258 | 0.7621 | 1810 | 1.9359 |
+ | 1.9258 | 0.7663 | 1820 | 1.9376 |
+ | 1.9258 | 0.7705 | 1830 | 1.9292 |
+ | 1.9258 | 0.7747 | 1840 | 1.9454 |
+ | 1.9258 | 0.7789 | 1850 | 1.9392 |
+ | 1.9258 | 0.7832 | 1860 | 1.9393 |
+ | 1.9258 | 0.7874 | 1870 | 1.9332 |
+ | 1.9258 | 0.7916 | 1880 | 1.9346 |
+ | 1.9258 | 0.7958 | 1890 | 1.9266 |
+ | 1.9351 | 0.8 | 1900 | 1.9306 |
+ | 1.9351 | 0.8042 | 1910 | 1.9278 |
+ | 1.9351 | 0.8084 | 1920 | 1.9212 |
+ | 1.9351 | 0.8126 | 1930 | 1.9129 |
+ | 1.9351 | 0.8168 | 1940 | 1.9212 |
+ | 1.9351 | 0.8211 | 1950 | 1.9114 |
+ | 1.9351 | 0.8253 | 1960 | 1.9168 |
+ | 1.9351 | 0.8295 | 1970 | 1.9125 |
+ | 1.9351 | 0.8337 | 1980 | 1.9075 |
+ | 1.9351 | 0.8379 | 1990 | 1.9114 |
+ | 1.9478 | 0.8421 | 2000 | 1.9043 |
+ | 1.9478 | 0.8463 | 2010 | 1.9068 |
+ | 1.9478 | 0.8505 | 2020 | 1.9073 |
+ | 1.9478 | 0.8547 | 2030 | 1.9001 |
+ | 1.9478 | 0.8589 | 2040 | 1.8955 |
+ | 1.9478 | 0.8632 | 2050 | 1.8919 |
+ | 1.9478 | 0.8674 | 2060 | 1.8945 |
+ | 1.9478 | 0.8716 | 2070 | 1.9009 |
+ | 1.9478 | 0.8758 | 2080 | 1.8951 |
+ | 1.9478 | 0.88 | 2090 | 1.8977 |
+ | 1.9038 | 0.8842 | 2100 | 1.8922 |
+ | 1.9038 | 0.8884 | 2110 | 1.8944 |
+ | 1.9038 | 0.8926 | 2120 | 1.8846 |
+ | 1.9038 | 0.8968 | 2130 | 1.8843 |
+ | 1.9038 | 0.9011 | 2140 | 1.8929 |
+ | 1.9038 | 0.9053 | 2150 | 1.8847 |
+ | 1.9038 | 0.9095 | 2160 | 1.8828 |
+ | 1.9038 | 0.9137 | 2170 | 1.8813 |
+ | 1.9038 | 0.9179 | 2180 | 1.8791 |
+ | 1.9038 | 0.9221 | 2190 | 1.8779 |
+ | 1.889 | 0.9263 | 2200 | 1.8855 |
+ | 1.889 | 0.9305 | 2210 | 1.8831 |
+ | 1.889 | 0.9347 | 2220 | 1.8763 |
+ | 1.889 | 0.9389 | 2230 | 1.8722 |
+ | 1.889 | 0.9432 | 2240 | 1.8756 |
+ | 1.889 | 0.9474 | 2250 | 1.8682 |
+ | 1.889 | 0.9516 | 2260 | 1.8722 |
+ | 1.889 | 0.9558 | 2270 | 1.8776 |
+ | 1.889 | 0.96 | 2280 | 1.8702 |
+ | 1.889 | 0.9642 | 2290 | 1.8705 |
+ | 1.8407 | 0.9684 | 2300 | 1.8739 |
+ | 1.8407 | 0.9726 | 2310 | 1.8697 |
+ | 1.8407 | 0.9768 | 2320 | 1.8693 |
+ | 1.8407 | 0.9811 | 2330 | 1.8649 |
+ | 1.8407 | 0.9853 | 2340 | 1.8654 |
+ | 1.8407 | 0.9895 | 2350 | 1.8674 |
+ | 1.8407 | 0.9937 | 2360 | 1.8691 |
+ | 1.8407 | 0.9979 | 2370 | 1.8616 |
+ | 1.8407 | 1.0021 | 2380 | 1.8589 |
+ | 1.8407 | 1.0063 | 2390 | 1.8600 |
+ | 1.8883 | 1.0105 | 2400 | 1.8562 |
+ | 1.8883 | 1.0147 | 2410 | 1.8567 |
+ | 1.8883 | 1.0189 | 2420 | 1.8575 |
+ | 1.8883 | 1.0232 | 2430 | 1.8599 |
+ | 1.8883 | 1.0274 | 2440 | 1.8600 |
+ | 1.8883 | 1.0316 | 2450 | 1.8532 |
+ | 1.8883 | 1.0358 | 2460 | 1.8494 |
+ | 1.8883 | 1.04 | 2470 | 1.8529 |
+ | 1.8883 | 1.0442 | 2480 | 1.8489 |
+ | 1.8883 | 1.0484 | 2490 | 1.8523 |
+ | 1.795 | 1.0526 | 2500 | 1.8443 |
+ | 1.795 | 1.0568 | 2510 | 1.8441 |
+ | 1.795 | 1.0611 | 2520 | 1.8420 |
+ | 1.795 | 1.0653 | 2530 | 1.8433 |
+ | 1.795 | 1.0695 | 2540 | 1.8470 |
+ | 1.795 | 1.0737 | 2550 | 1.8434 |
+ | 1.795 | 1.0779 | 2560 | 1.8439 |
+ | 1.795 | 1.0821 | 2570 | 1.8428 |
+ | 1.795 | 1.0863 | 2580 | 1.8422 |
+ | 1.795 | 1.0905 | 2590 | 1.8406 |
+ | 1.8274 | 1.0947 | 2600 | 1.8381 |
+ | 1.8274 | 1.0989 | 2610 | 1.8362 |
+ | 1.8274 | 1.1032 | 2620 | 1.8332 |
+ | 1.8274 | 1.1074 | 2630 | 1.8332 |
+ | 1.8274 | 1.1116 | 2640 | 1.8342 |
+ | 1.8274 | 1.1158 | 2650 | 1.8316 |
+ | 1.8274 | 1.12 | 2660 | 1.8306 |
+ | 1.8274 | 1.1242 | 2670 | 1.8335 |
+ | 1.8274 | 1.1284 | 2680 | 1.8329 |
+ | 1.8274 | 1.1326 | 2690 | 1.8346 |
+ | 1.7947 | 1.1368 | 2700 | 1.8340 |
+ | 1.7947 | 1.1411 | 2710 | 1.8295 |
+ | 1.7947 | 1.1453 | 2720 | 1.8249 |
+ | 1.7947 | 1.1495 | 2730 | 1.8286 |
+ | 1.7947 | 1.1537 | 2740 | 1.8299 |
+ | 1.7947 | 1.1579 | 2750 | 1.8261 |
+ | 1.7947 | 1.1621 | 2760 | 1.8261 |
+ | 1.7947 | 1.1663 | 2770 | 1.8315 |
+ | 1.7947 | 1.1705 | 2780 | 1.8241 |
+ | 1.7947 | 1.1747 | 2790 | 1.8232 |
+ | 1.729 | 1.1789 | 2800 | 1.8239 |
+ | 1.729 | 1.1832 | 2810 | 1.8192 |
+ | 1.729 | 1.1874 | 2820 | 1.8177 |
+ | 1.729 | 1.1916 | 2830 | 1.8215 |
+ | 1.729 | 1.1958 | 2840 | 1.8187 |
+ | 1.729 | 1.2 | 2850 | 1.8136 |
+ | 1.729 | 1.2042 | 2860 | 1.8134 |
+ | 1.729 | 1.2084 | 2870 | 1.8135 |
+ | 1.729 | 1.2126 | 2880 | 1.8166 |
+ | 1.729 | 1.2168 | 2890 | 1.8165 |
+ | 1.7822 | 1.2211 | 2900 | 1.8134 |
+ | 1.7822 | 1.2253 | 2910 | 1.8117 |
+ | 1.7822 | 1.2295 | 2920 | 1.8126 |
+ | 1.7822 | 1.2337 | 2930 | 1.8104 |
+ | 1.7822 | 1.2379 | 2940 | 1.8116 |
+ | 1.7822 | 1.2421 | 2950 | 1.8130 |
+ | 1.7822 | 1.2463 | 2960 | 1.8075 |
+ | 1.7822 | 1.2505 | 2970 | 1.8074 |
+ | 1.7822 | 1.2547 | 2980 | 1.8094 |
+ | 1.7822 | 1.2589 | 2990 | 1.8088 |
+ | 1.7872 | 1.2632 | 3000 | 1.8054 |
+ | 1.7872 | 1.2674 | 3010 | 1.8072 |
+ | 1.7872 | 1.2716 | 3020 | 1.8064 |
+ | 1.7872 | 1.2758 | 3030 | 1.8070 |
+ | 1.7872 | 1.28 | 3040 | 1.8037 |
+ | 1.7872 | 1.2842 | 3050 | 1.8001 |
+ | 1.7872 | 1.2884 | 3060 | 1.8036 |
+ | 1.7872 | 1.2926 | 3070 | 1.7994 |
+ | 1.7872 | 1.2968 | 3080 | 1.7983 |
+ | 1.7872 | 1.3011 | 3090 | 1.7974 |
+ | 1.7635 | 1.3053 | 3100 | 1.7962 |
+ | 1.7635 | 1.3095 | 3110 | 1.7930 |
+ | 1.7635 | 1.3137 | 3120 | 1.7957 |
+ | 1.7635 | 1.3179 | 3130 | 1.7958 |
+ | 1.7635 | 1.3221 | 3140 | 1.7937 |
+ | 1.7635 | 1.3263 | 3150 | 1.7975 |
+ | 1.7635 | 1.3305 | 3160 | 1.7970 |
+ | 1.7635 | 1.3347 | 3170 | 1.7909 |
+ | 1.7635 | 1.3389 | 3180 | 1.7902 |
+ | 1.7635 | 1.3432 | 3190 | 1.7894 |
+ | 1.7545 | 1.3474 | 3200 | 1.7908 |
+ | 1.7545 | 1.3516 | 3210 | 1.7890 |
+ | 1.7545 | 1.3558 | 3220 | 1.7892 |
+ | 1.7545 | 1.3600 | 3230 | 1.7866 |
+ | 1.7545 | 1.3642 | 3240 | 1.7887 |
+ | 1.7545 | 1.3684 | 3250 | 1.7885 |
+ | 1.7545 | 1.3726 | 3260 | 1.7880 |
+ | 1.7545 | 1.3768 | 3270 | 1.7880 |
+ | 1.7545 | 1.3811 | 3280 | 1.7864 |
+ | 1.7545 | 1.3853 | 3290 | 1.7831 |
+ | 1.7565 | 1.3895 | 3300 | 1.7822 |
+ | 1.7565 | 1.3937 | 3310 | 1.7842 |
+ | 1.7565 | 1.3979 | 3320 | 1.7818 |
+ | 1.7565 | 1.4021 | 3330 | 1.7814 |
+ | 1.7565 | 1.4063 | 3340 | 1.7805 |
+ | 1.7565 | 1.4105 | 3350 | 1.7813 |
+ | 1.7565 | 1.4147 | 3360 | 1.7826 |
+ | 1.7565 | 1.4189 | 3370 | 1.7787 |
+ | 1.7565 | 1.4232 | 3380 | 1.7769 |
+ | 1.7565 | 1.4274 | 3390 | 1.7775 |
+ | 1.7175 | 1.4316 | 3400 | 1.7791 |
+ | 1.7175 | 1.4358 | 3410 | 1.7761 |
+ | 1.7175 | 1.44 | 3420 | 1.7763 |
+ | 1.7175 | 1.4442 | 3430 | 1.7757 |
+ | 1.7175 | 1.4484 | 3440 | 1.7750 |
+ | 1.7175 | 1.4526 | 3450 | 1.7757 |
+ | 1.7175 | 1.4568 | 3460 | 1.7739 |
+ | 1.7175 | 1.4611 | 3470 | 1.7727 |
+ | 1.7175 | 1.4653 | 3480 | 1.7729 |
+ | 1.7175 | 1.4695 | 3490 | 1.7731 |
+ | 1.7427 | 1.4737 | 3500 | 1.7698 |
+ | 1.7427 | 1.4779 | 3510 | 1.7701 |
+ | 1.7427 | 1.4821 | 3520 | 1.7708 |
+ | 1.7427 | 1.4863 | 3530 | 1.7728 |
+ | 1.7427 | 1.4905 | 3540 | 1.7727 |
+ | 1.7427 | 1.4947 | 3550 | 1.7691 |
+ | 1.7427 | 1.4989 | 3560 | 1.7687 |
+ | 1.7427 | 1.5032 | 3570 | 1.7684 |
+ | 1.7427 | 1.5074 | 3580 | 1.7695 |
+ | 1.7427 | 1.5116 | 3590 | 1.7695 |
+ | 1.7435 | 1.5158 | 3600 | 1.7702 |
+ | 1.7435 | 1.52 | 3610 | 1.7702 |
+ | 1.7435 | 1.5242 | 3620 | 1.7681 |
+ | 1.7435 | 1.5284 | 3630 | 1.7656 |
+ | 1.7435 | 1.5326 | 3640 | 1.7656 |
+ | 1.7435 | 1.5368 | 3650 | 1.7654 |
+ | 1.7435 | 1.5411 | 3660 | 1.7651 |
+ | 1.7435 | 1.5453 | 3670 | 1.7642 |
+ | 1.7435 | 1.5495 | 3680 | 1.7627 |
+ | 1.7435 | 1.5537 | 3690 | 1.7621 |
+ | 1.702 | 1.5579 | 3700 | 1.7623 |
+ | 1.702 | 1.5621 | 3710 | 1.7626 |
+ | 1.702 | 1.5663 | 3720 | 1.7621 |
+ | 1.702 | 1.5705 | 3730 | 1.7615 |
+ | 1.702 | 1.5747 | 3740 | 1.7615 |
+ | 1.702 | 1.5789 | 3750 | 1.7615 |
+ | 1.702 | 1.5832 | 3760 | 1.7613 |
+ | 1.702 | 1.5874 | 3770 | 1.7617 |
+ | 1.702 | 1.5916 | 3780 | 1.7619 |
+ | 1.702 | 1.5958 | 3790 | 1.7612 |
+ | 1.69 | 1.6 | 3800 | 1.7609 |
+ | 1.69 | 1.6042 | 3810 | 1.7605 |
+ | 1.69 | 1.6084 | 3820 | 1.7604 |
+ | 1.69 | 1.6126 | 3830 | 1.7603 |
+ | 1.69 | 1.6168 | 3840 | 1.7598 |
+ | 1.69 | 1.6211 | 3850 | 1.7586 |
+ | 1.69 | 1.6253 | 3860 | 1.7586 |
+ | 1.69 | 1.6295 | 3870 | 1.7592 |
+ | 1.69 | 1.6337 | 3880 | 1.7584 |
+ | 1.69 | 1.6379 | 3890 | 1.7573 |
+ | 1.7157 | 1.6421 | 3900 | 1.7567 |
+ | 1.7157 | 1.6463 | 3910 | 1.7565 |
+ | 1.7157 | 1.6505 | 3920 | 1.7564 |
+ | 1.7157 | 1.6547 | 3930 | 1.7564 |
+ | 1.7157 | 1.6589 | 3940 | 1.7551 |
+ | 1.7157 | 1.6632 | 3950 | 1.7546 |
+ | 1.7157 | 1.6674 | 3960 | 1.7547 |
+ | 1.7157 | 1.6716 | 3970 | 1.7541 |
+ | 1.7157 | 1.6758 | 3980 | 1.7544 |
+ | 1.7157 | 1.6800 | 3990 | 1.7546 |
+ | 1.7061 | 1.6842 | 4000 | 1.7545 |
+ | 1.7061 | 1.6884 | 4010 | 1.7543 |
+ | 1.7061 | 1.6926 | 4020 | 1.7543 |
+ | 1.7061 | 1.6968 | 4030 | 1.7542 |
+ | 1.7061 | 1.7011 | 4040 | 1.7532 |
+ | 1.7061 | 1.7053 | 4050 | 1.7529 |
+ | 1.7061 | 1.7095 | 4060 | 1.7529 |
+ | 1.7061 | 1.7137 | 4070 | 1.7528 |
+ | 1.7061 | 1.7179 | 4080 | 1.7523 |
+ | 1.7061 | 1.7221 | 4090 | 1.7520 |
+ | 1.6821 | 1.7263 | 4100 | 1.7517 |
+ | 1.6821 | 1.7305 | 4110 | 1.7515 |
+ | 1.6821 | 1.7347 | 4120 | 1.7514 |
+ | 1.6821 | 1.7389 | 4130 | 1.7519 |
+ | 1.6821 | 1.7432 | 4140 | 1.7519 |
+ | 1.6821 | 1.7474 | 4150 | 1.7513 |
+ | 1.6821 | 1.7516 | 4160 | 1.7507 |
+ | 1.6821 | 1.7558 | 4170 | 1.7505 |
+ | 1.6821 | 1.76 | 4180 | 1.7503 |
+ | 1.6821 | 1.7642 | 4190 | 1.7501 |
+ | 1.7025 | 1.7684 | 4200 | 1.7498 |
+ | 1.7025 | 1.7726 | 4210 | 1.7498 |
+ | 1.7025 | 1.7768 | 4220 | 1.7499 |
+ | 1.7025 | 1.7811 | 4230 | 1.7499 |
+ | 1.7025 | 1.7853 | 4240 | 1.7497 |
+ | 1.7025 | 1.7895 | 4250 | 1.7496 |
+ | 1.7025 | 1.7937 | 4260 | 1.7494 |
+ | 1.7025 | 1.7979 | 4270 | 1.7492 |
+ | 1.7025 | 1.8021 | 4280 | 1.7491 |
+ | 1.7025 | 1.8063 | 4290 | 1.7489 |
+ | 1.7204 | 1.8105 | 4300 | 1.7489 |
+ | 1.7204 | 1.8147 | 4310 | 1.7490 |
+ | 1.7204 | 1.8189 | 4320 | 1.7490 |
+ | 1.7204 | 1.8232 | 4330 | 1.7488 |
+ | 1.7204 | 1.8274 | 4340 | 1.7487 |
+ | 1.7204 | 1.8316 | 4350 | 1.7485 |
+ | 1.7204 | 1.8358 | 4360 | 1.7485 |
+ | 1.7204 | 1.8400 | 4370 | 1.7484 |
+ | 1.7204 | 1.8442 | 4380 | 1.7483 |
+ | 1.7204 | 1.8484 | 4390 | 1.7481 |
+ | 1.707 | 1.8526 | 4400 | 1.7480 |
+ | 1.707 | 1.8568 | 4410 | 1.7480 |
+ | 1.707 | 1.8611 | 4420 | 1.7478 |
+ | 1.707 | 1.8653 | 4430 | 1.7477 |
+ | 1.707 | 1.8695 | 4440 | 1.7476 |
+ | 1.707 | 1.8737 | 4450 | 1.7475 |
+ | 1.707 | 1.8779 | 4460 | 1.7474 |
+ | 1.707 | 1.8821 | 4470 | 1.7473 |
+ | 1.707 | 1.8863 | 4480 | 1.7473 |
+ | 1.707 | 1.8905 | 4490 | 1.7473 |
+ | 1.7346 | 1.8947 | 4500 | 1.7472 |
+ | 1.7346 | 1.8989 | 4510 | 1.7473 |
+ | 1.7346 | 1.9032 | 4520 | 1.7471 |
+ | 1.7346 | 1.9074 | 4530 | 1.7471 |
+ | 1.7346 | 1.9116 | 4540 | 1.7471 |
+ | 1.7346 | 1.9158 | 4550 | 1.7471 |
+ | 1.7346 | 1.92 | 4560 | 1.7471 |
+ | 1.7346 | 1.9242 | 4570 | 1.7471 |
+ | 1.7346 | 1.9284 | 4580 | 1.7471 |
+ | 1.7346 | 1.9326 | 4590 | 1.7471 |
+ | 1.7311 | 1.9368 | 4600 | 1.7470 |
+ | 1.7311 | 1.9411 | 4610 | 1.7470 |
+ | 1.7311 | 1.9453 | 4620 | 1.7470 |
+ | 1.7311 | 1.9495 | 4630 | 1.7470 |
+ | 1.7311 | 1.9537 | 4640 | 1.7470 |
+ | 1.7311 | 1.9579 | 4650 | 1.7469 |
+ | 1.7311 | 1.9621 | 4660 | 1.7470 |
+ | 1.7311 | 1.9663 | 4670 | 1.7469 |
+ | 1.7311 | 1.9705 | 4680 | 1.7470 |
+ | 1.7311 | 1.9747 | 4690 | 1.7470 |
+ | 1.7096 | 1.9789 | 4700 | 1.7469 |
+ | 1.7096 | 1.9832 | 4710 | 1.7469 |
+ | 1.7096 | 1.9874 | 4720 | 1.7469 |
+ | 1.7096 | 1.9916 | 4730 | 1.7469 |
+ | 1.7096 | 1.9958 | 4740 | 1.7470 |
+ | 1.7096 | 2.0 | 4750 | 1.7469 |
 
 
  ### Framework versions
 
  - PEFT 0.11.1
+ - Transformers 4.42.4
+ - Pytorch 1.13.1+cu117
+ - Datasets 2.19.2
  - Tokenizers 0.19.1
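
To put the two headline numbers in this README diff side by side, the evaluation losses can be converted to perplexity and the step/epoch columns to steps per epoch. This is a minimal Python sketch, not part of the commit. It assumes the reported loss is a mean token-level cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in the README. The helper names `perplexity` and `steps_per_epoch` are illustrative.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one (step, epoch) table row."""
    return step / epoch


# Final evaluation losses taken from the removed and added "Loss" lines.
old_loss, new_loss = 2.2002, 1.7469
print(f"old perplexity: {perplexity(old_loss):.2f}")  # about 9.03
print(f"new perplexity: {perplexity(new_loss):.2f}")  # about 5.74

# Last rows of the old and new results tables.
old_rate = steps_per_epoch(400, 1.6807)
new_rate = steps_per_epoch(4750, 2.0)
print(f"old steps/epoch: {old_rate:.0f}, new steps/epoch: {new_rate:.0f}")
```

The new run takes roughly ten times as many steps per epoch as the old one. The diff does not show the hyperparameters, so whether that comes from a smaller effective batch size, a larger dataset, or both cannot be determined from this commit alone.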
runs/Jul20_02-34-31_cmle-training-16561430891479085200/events.out.tfevents.1721442872.cmle-training-16561430891479085200 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:97a8d4503f6c5891833cb08c97a784da5f0d91d9958926e5e2495fd54b9e4e23
- size 143959
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:33d3c99354f9368e354c9dfd991cff660f4bbf4c58f4bbc3e4d1df35f35e11ca
+ size 144313
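
The Git LFS pointer above records the SHA-256 digest (`oid`) and byte size of the TensorBoard event file rather than its contents. A minimal sketch of how a downloaded copy could be checked against those two fields, using only the Python standard library, follows. The function name `matches_lfs_pointer` and the local path in the usage note are hypothetical and not part of Git LFS or this repository.

```python
import hashlib
import os


def matches_lfs_pointer(path: str, expected_oid: str, expected_size: int) -> bool:
    """Return True if the file's byte size and SHA-256 digest match the pointer."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid
```

For the updated event file, a call might look like `matches_lfs_pointer("events.out.tfevents", "33d3c99354f9368e354c9dfd991cff660f4bbf4c58f4bbc3e4d1df35f35e11ca", 144313)`, where the first argument stands in for wherever the file was saved locally.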