hazemessam committed
Commit 2729feb
1 Parent(s): 7509a05

Upload trainer_state.json with huggingface_hub
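
A minimal sketch of how an upload like this is typically done with the huggingface_hub client; the local path and repo_id below are assumptions (the actual repository name is not shown on this page), not the author's exact command.

from huggingface_hub import HfApi

api = HfApi()  # expects a token from `huggingface-cli login` or the HF_TOKEN env var
api.upload_file(
    path_or_fileobj="stability_weights_dora_norm_swap_v2/trainer_state.json",  # assumed local path
    path_in_repo="trainer_state.json",
    repo_id="hazemessam/stability_weights_dora_norm_swap_v2",  # hypothetical repo id
    commit_message="Upload trainer_state.json with huggingface_hub",
)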

Files changed (1)
  1. trainer_state.json +2022 -0
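
The diff below adds the full Trainer state. As a quick way to read it back, here is a small sketch (assuming a local copy saved as trainer_state.json) that loads the JSON and prints the per-epoch validation Spearman values recorded in log_history:

import json

with open("trainer_state.json") as f:  # local copy of the file added below
    state = json.load(f)

print(state["best_metric"], state["best_model_checkpoint"])

# each per-epoch evaluation block in log_history carries eval_validation_* keys
for entry in state["log_history"]:
    if "eval_validation_spearman" in entry:
        print(f'step {entry["step"]:>4}: '
              f'spearman={entry["eval_validation_spearman"]:.4f}, '
              f'loss={entry["eval_validation_loss"]:.4f}')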
trainer_state.json ADDED
@@ -0,0 +1,2022 @@
1
+ {
2
+ "best_metric": 0.8526459553649132,
3
+ "best_model_checkpoint": "stability_weights_dora_norm_swap_v2/checkpoint-1820",
4
+ "epoch": 10.996978851963746,
5
+ "eval_steps": 500,
6
+ "global_step": 1820,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 0.06042296072507553,
13
+ "grad_norm": 42.27948760986328,
14
+ "learning_rate": 2.5e-06,
15
+ "loss": 3.9615,
16
+ "step": 10
17
+ },
18
+ {
19
+ "epoch": 0.12084592145015106,
20
+ "grad_norm": 45.654972076416016,
21
+ "learning_rate": 5e-06,
22
+ "loss": 2.8167,
23
+ "step": 20
24
+ },
25
+ {
26
+ "epoch": 0.18126888217522658,
27
+ "grad_norm": 33.64224624633789,
28
+ "learning_rate": 7.5e-06,
29
+ "loss": 3.9832,
30
+ "step": 30
31
+ },
32
+ {
33
+ "epoch": 0.24169184290030213,
34
+ "grad_norm": 38.766605377197266,
35
+ "learning_rate": 1e-05,
36
+ "loss": 3.483,
37
+ "step": 40
38
+ },
39
+ {
40
+ "epoch": 0.3021148036253776,
41
+ "grad_norm": 36.45051574707031,
42
+ "learning_rate": 1.25e-05,
43
+ "loss": 2.6488,
44
+ "step": 50
45
+ },
46
+ {
47
+ "epoch": 0.36253776435045315,
48
+ "grad_norm": 31.120370864868164,
49
+ "learning_rate": 1.5e-05,
50
+ "loss": 2.7824,
51
+ "step": 60
52
+ },
53
+ {
54
+ "epoch": 0.4229607250755287,
55
+ "grad_norm": 40.57133865356445,
56
+ "learning_rate": 1.75e-05,
57
+ "loss": 3.4959,
58
+ "step": 70
59
+ },
60
+ {
61
+ "epoch": 0.48338368580060426,
62
+ "grad_norm": 39.07503890991211,
63
+ "learning_rate": 2e-05,
64
+ "loss": 2.7043,
65
+ "step": 80
66
+ },
67
+ {
68
+ "epoch": 0.5438066465256798,
69
+ "grad_norm": 27.892555236816406,
70
+ "learning_rate": 2.25e-05,
71
+ "loss": 2.3916,
72
+ "step": 90
73
+ },
74
+ {
75
+ "epoch": 0.6042296072507553,
76
+ "grad_norm": 35.324249267578125,
77
+ "learning_rate": 2.5e-05,
78
+ "loss": 3.5084,
79
+ "step": 100
80
+ },
81
+ {
82
+ "epoch": 0.6646525679758308,
83
+ "grad_norm": 34.17033386230469,
84
+ "learning_rate": 2.7500000000000004e-05,
85
+ "loss": 2.15,
86
+ "step": 110
87
+ },
88
+ {
89
+ "epoch": 0.7250755287009063,
90
+ "grad_norm": 28.871646881103516,
91
+ "learning_rate": 3e-05,
92
+ "loss": 2.4973,
93
+ "step": 120
94
+ },
95
+ {
96
+ "epoch": 0.7854984894259819,
97
+ "grad_norm": 34.328651428222656,
98
+ "learning_rate": 3.2500000000000004e-05,
99
+ "loss": 2.2425,
100
+ "step": 130
101
+ },
102
+ {
103
+ "epoch": 0.8459214501510574,
104
+ "grad_norm": 27.76066017150879,
105
+ "learning_rate": 3.5e-05,
106
+ "loss": 2.4867,
107
+ "step": 140
108
+ },
109
+ {
110
+ "epoch": 0.9063444108761329,
111
+ "grad_norm": 46.40857696533203,
112
+ "learning_rate": 3.7500000000000003e-05,
113
+ "loss": 2.3429,
114
+ "step": 150
115
+ },
116
+ {
117
+ "epoch": 0.9667673716012085,
118
+ "grad_norm": 24.273012161254883,
119
+ "learning_rate": 4e-05,
120
+ "loss": 1.8543,
121
+ "step": 160
122
+ },
123
+ {
124
+ "epoch": 0.9969788519637462,
125
+ "eval_validation_loss": 2.0505075454711914,
126
+ "eval_validation_mae": 1.01314377784729,
127
+ "eval_validation_mse": 2.0505075454711914,
128
+ "eval_validation_pearson": 0.46795711140864427,
129
+ "eval_validation_rmse": 1.4319593906402588,
130
+ "eval_validation_runtime": 129.1022,
131
+ "eval_validation_samples_per_second": 2.649,
132
+ "eval_validation_spearman": 0.5000603914960909,
133
+ "eval_validation_steps_per_second": 2.649,
134
+ "step": 165
135
+ },
136
+ {
137
+ "epoch": 0.9969788519637462,
138
+ "eval_test_loss": 2.0498456954956055,
139
+ "eval_test_mae": 1.0129708051681519,
140
+ "eval_test_mse": 2.0498456954956055,
141
+ "eval_test_pearson": 0.4678271674806638,
142
+ "eval_test_rmse": 1.4317282438278198,
143
+ "eval_test_runtime": 128.8868,
144
+ "eval_test_samples_per_second": 2.653,
145
+ "eval_test_spearman": 0.49953701661422967,
146
+ "eval_test_steps_per_second": 2.653,
147
+ "step": 165
148
+ },
149
+ {
150
+ "epoch": 0.9969788519637462,
151
+ "eval_myoglobin_loss": 1.1084952354431152,
152
+ "eval_myoglobin_mae": 0.7779242396354675,
153
+ "eval_myoglobin_mse": 1.1084952354431152,
154
+ "eval_myoglobin_pearson": 0.3155295627311716,
155
+ "eval_myoglobin_rmse": 1.0528509616851807,
156
+ "eval_myoglobin_runtime": 50.3681,
157
+ "eval_myoglobin_samples_per_second": 2.66,
158
+ "eval_myoglobin_spearman": 0.3150316299942073,
159
+ "eval_myoglobin_steps_per_second": 2.66,
160
+ "step": 165
161
+ },
162
+ {
163
+ "epoch": 0.9969788519637462,
164
+ "eval_myoglobin_r_loss": 1.1083894968032837,
165
+ "eval_myoglobin_r_mae": 0.778092622756958,
166
+ "eval_myoglobin_r_mse": 1.1083894968032837,
167
+ "eval_myoglobin_r_pearson": 0.315758311809077,
168
+ "eval_myoglobin_r_rmse": 1.0528007745742798,
169
+ "eval_myoglobin_r_runtime": 50.3077,
170
+ "eval_myoglobin_r_samples_per_second": 2.664,
171
+ "eval_myoglobin_r_spearman": 0.3167998663343996,
172
+ "eval_myoglobin_r_steps_per_second": 2.664,
173
+ "step": 165
174
+ },
175
+ {
176
+ "epoch": 0.9969788519637462,
177
+ "eval_p53_loss": 4.773194789886475,
178
+ "eval_p53_mae": 1.6836440563201904,
179
+ "eval_p53_mse": 4.773194789886475,
180
+ "eval_p53_pearson": 0.260329486693081,
181
+ "eval_p53_rmse": 2.1847641468048096,
182
+ "eval_p53_runtime": 17.5157,
183
+ "eval_p53_samples_per_second": 2.398,
184
+ "eval_p53_spearman": 0.1950488231413393,
185
+ "eval_p53_steps_per_second": 2.398,
186
+ "step": 165
187
+ },
188
+ {
189
+ "epoch": 1.027190332326284,
190
+ "grad_norm": 25.500028610229492,
191
+ "learning_rate": 4.25e-05,
192
+ "loss": 2.1163,
193
+ "step": 170
194
+ },
195
+ {
196
+ "epoch": 1.0876132930513596,
197
+ "grad_norm": 31.47431755065918,
198
+ "learning_rate": 4.5e-05,
199
+ "loss": 1.7539,
200
+ "step": 180
201
+ },
202
+ {
203
+ "epoch": 1.148036253776435,
204
+ "grad_norm": 22.619182586669922,
205
+ "learning_rate": 4.75e-05,
206
+ "loss": 1.4633,
207
+ "step": 190
208
+ },
209
+ {
210
+ "epoch": 1.2084592145015105,
211
+ "grad_norm": 38.80492401123047,
212
+ "learning_rate": 5e-05,
213
+ "loss": 1.4391,
214
+ "step": 200
215
+ },
216
+ {
217
+ "epoch": 1.2688821752265862,
218
+ "grad_norm": 22.946945190429688,
219
+ "learning_rate": 5.25e-05,
220
+ "loss": 1.3334,
221
+ "step": 210
222
+ },
223
+ {
224
+ "epoch": 1.3293051359516617,
225
+ "grad_norm": 14.94211483001709,
226
+ "learning_rate": 5.500000000000001e-05,
227
+ "loss": 1.1311,
228
+ "step": 220
229
+ },
230
+ {
231
+ "epoch": 1.3897280966767371,
232
+ "grad_norm": 27.56308364868164,
233
+ "learning_rate": 5.7499999999999995e-05,
234
+ "loss": 1.6216,
235
+ "step": 230
236
+ },
237
+ {
238
+ "epoch": 1.4501510574018126,
239
+ "grad_norm": 19.250564575195312,
240
+ "learning_rate": 6e-05,
241
+ "loss": 1.273,
242
+ "step": 240
243
+ },
244
+ {
245
+ "epoch": 1.510574018126888,
246
+ "grad_norm": 19.437816619873047,
247
+ "learning_rate": 6.25e-05,
248
+ "loss": 0.9644,
249
+ "step": 250
250
+ },
251
+ {
252
+ "epoch": 1.5709969788519638,
253
+ "grad_norm": 31.529216766357422,
254
+ "learning_rate": 6.500000000000001e-05,
255
+ "loss": 1.7949,
256
+ "step": 260
257
+ },
258
+ {
259
+ "epoch": 1.6314199395770392,
260
+ "grad_norm": 32.55619430541992,
261
+ "learning_rate": 6.750000000000001e-05,
262
+ "loss": 1.241,
263
+ "step": 270
264
+ },
265
+ {
266
+ "epoch": 1.691842900302115,
267
+ "grad_norm": 26.349998474121094,
268
+ "learning_rate": 7e-05,
269
+ "loss": 1.3735,
270
+ "step": 280
271
+ },
272
+ {
273
+ "epoch": 1.7522658610271904,
274
+ "grad_norm": 29.46245002746582,
275
+ "learning_rate": 7.25e-05,
276
+ "loss": 1.4623,
277
+ "step": 290
278
+ },
279
+ {
280
+ "epoch": 1.8126888217522659,
281
+ "grad_norm": 14.214956283569336,
282
+ "learning_rate": 7.500000000000001e-05,
283
+ "loss": 0.8406,
284
+ "step": 300
285
+ },
286
+ {
287
+ "epoch": 1.8731117824773413,
288
+ "grad_norm": 16.597463607788086,
289
+ "learning_rate": 7.75e-05,
290
+ "loss": 0.9432,
291
+ "step": 310
292
+ },
293
+ {
294
+ "epoch": 1.9335347432024168,
295
+ "grad_norm": 17.1890869140625,
296
+ "learning_rate": 8e-05,
297
+ "loss": 1.195,
298
+ "step": 320
299
+ },
300
+ {
301
+ "epoch": 1.9939577039274925,
302
+ "grad_norm": 17.546321868896484,
303
+ "learning_rate": 8.25e-05,
304
+ "loss": 1.0398,
305
+ "step": 330
306
+ },
307
+ {
308
+ "epoch": 2.0,
309
+ "eval_validation_loss": 0.9950127005577087,
310
+ "eval_validation_mae": 0.6999762654304504,
311
+ "eval_validation_mse": 0.9950127005577087,
312
+ "eval_validation_pearson": 0.7727618943702055,
313
+ "eval_validation_rmse": 0.9975032210350037,
314
+ "eval_validation_runtime": 129.1954,
315
+ "eval_validation_samples_per_second": 2.647,
316
+ "eval_validation_spearman": 0.7928617789227591,
317
+ "eval_validation_steps_per_second": 2.647,
318
+ "step": 331
319
+ },
320
+ {
321
+ "epoch": 2.0,
322
+ "eval_test_loss": 0.9929149150848389,
323
+ "eval_test_mae": 0.7004790306091309,
324
+ "eval_test_mse": 0.9929149150848389,
325
+ "eval_test_pearson": 0.7726932487807971,
326
+ "eval_test_rmse": 0.9964511394500732,
327
+ "eval_test_runtime": 129.0193,
328
+ "eval_test_samples_per_second": 2.651,
329
+ "eval_test_spearman": 0.7933636966348652,
330
+ "eval_test_steps_per_second": 2.651,
331
+ "step": 331
332
+ },
333
+ {
334
+ "epoch": 2.0,
335
+ "eval_myoglobin_loss": 0.934596061706543,
336
+ "eval_myoglobin_mae": 0.7355605959892273,
337
+ "eval_myoglobin_mse": 0.934596061706543,
338
+ "eval_myoglobin_pearson": 0.5740624022673728,
339
+ "eval_myoglobin_rmse": 0.9667450785636902,
340
+ "eval_myoglobin_runtime": 50.363,
341
+ "eval_myoglobin_samples_per_second": 2.661,
342
+ "eval_myoglobin_spearman": 0.5940538377065167,
343
+ "eval_myoglobin_steps_per_second": 2.661,
344
+ "step": 331
345
+ },
346
+ {
347
+ "epoch": 2.0,
348
+ "eval_myoglobin_r_loss": 0.9417540431022644,
349
+ "eval_myoglobin_r_mae": 0.7385311722755432,
350
+ "eval_myoglobin_r_mse": 0.9417540431022644,
351
+ "eval_myoglobin_r_pearson": 0.5734044991540941,
352
+ "eval_myoglobin_r_rmse": 0.970440149307251,
353
+ "eval_myoglobin_r_runtime": 50.3941,
354
+ "eval_myoglobin_r_samples_per_second": 2.659,
355
+ "eval_myoglobin_r_spearman": 0.5919040214508811,
356
+ "eval_myoglobin_r_steps_per_second": 2.659,
357
+ "step": 331
358
+ },
359
+ {
360
+ "epoch": 2.0,
361
+ "eval_p53_loss": 4.290196895599365,
362
+ "eval_p53_mae": 1.622589349746704,
363
+ "eval_p53_mse": 4.290196895599365,
364
+ "eval_p53_pearson": 0.29006131482310593,
365
+ "eval_p53_rmse": 2.0712790489196777,
366
+ "eval_p53_runtime": 17.523,
367
+ "eval_p53_samples_per_second": 2.397,
368
+ "eval_p53_spearman": 0.26692597566579634,
369
+ "eval_p53_steps_per_second": 2.397,
370
+ "step": 331
371
+ },
372
+ {
373
+ "epoch": 2.054380664652568,
374
+ "grad_norm": 13.338666915893555,
375
+ "learning_rate": 8.5e-05,
376
+ "loss": 0.6917,
377
+ "step": 340
378
+ },
379
+ {
380
+ "epoch": 2.1148036253776437,
381
+ "grad_norm": 21.178325653076172,
382
+ "learning_rate": 8.75e-05,
383
+ "loss": 0.8003,
384
+ "step": 350
385
+ },
386
+ {
387
+ "epoch": 2.175226586102719,
388
+ "grad_norm": 14.95538330078125,
389
+ "learning_rate": 9e-05,
390
+ "loss": 0.6155,
391
+ "step": 360
392
+ },
393
+ {
394
+ "epoch": 2.2356495468277946,
395
+ "grad_norm": 10.555205345153809,
396
+ "learning_rate": 9.250000000000001e-05,
397
+ "loss": 0.6111,
398
+ "step": 370
399
+ },
400
+ {
401
+ "epoch": 2.29607250755287,
402
+ "grad_norm": 11.569594383239746,
403
+ "learning_rate": 9.5e-05,
404
+ "loss": 0.8681,
405
+ "step": 380
406
+ },
407
+ {
408
+ "epoch": 2.3564954682779455,
409
+ "grad_norm": 18.517148971557617,
410
+ "learning_rate": 9.75e-05,
411
+ "loss": 0.6754,
412
+ "step": 390
413
+ },
414
+ {
415
+ "epoch": 2.416918429003021,
416
+ "grad_norm": 21.002437591552734,
417
+ "learning_rate": 0.0001,
418
+ "loss": 0.5359,
419
+ "step": 400
420
+ },
421
+ {
422
+ "epoch": 2.477341389728097,
423
+ "grad_norm": 17.249919891357422,
424
+ "learning_rate": 9.999706613915566e-05,
425
+ "loss": 0.8332,
426
+ "step": 410
427
+ },
428
+ {
429
+ "epoch": 2.5377643504531724,
430
+ "grad_norm": 10.25106143951416,
431
+ "learning_rate": 9.998826490092421e-05,
432
+ "loss": 0.496,
433
+ "step": 420
434
+ },
435
+ {
436
+ "epoch": 2.598187311178248,
437
+ "grad_norm": 18.843320846557617,
438
+ "learning_rate": 9.997359731816998e-05,
439
+ "loss": 0.7244,
440
+ "step": 430
441
+ },
442
+ {
443
+ "epoch": 2.6586102719033233,
444
+ "grad_norm": 11.991676330566406,
445
+ "learning_rate": 9.995306511219885e-05,
446
+ "loss": 0.5112,
447
+ "step": 440
448
+ },
449
+ {
450
+ "epoch": 2.719033232628399,
451
+ "grad_norm": 21.79170036315918,
452
+ "learning_rate": 9.992667069255619e-05,
453
+ "loss": 0.5307,
454
+ "step": 450
455
+ },
456
+ {
457
+ "epoch": 2.7794561933534743,
458
+ "grad_norm": 9.106186866760254,
459
+ "learning_rate": 9.989441715674422e-05,
460
+ "loss": 0.4371,
461
+ "step": 460
462
+ },
463
+ {
464
+ "epoch": 2.8398791540785497,
465
+ "grad_norm": 10.968169212341309,
466
+ "learning_rate": 9.985630828985835e-05,
467
+ "loss": 0.5922,
468
+ "step": 470
469
+ },
470
+ {
471
+ "epoch": 2.900302114803625,
472
+ "grad_norm": 14.699440002441406,
473
+ "learning_rate": 9.981234856414307e-05,
474
+ "loss": 0.5858,
475
+ "step": 480
476
+ },
477
+ {
478
+ "epoch": 2.9607250755287007,
479
+ "grad_norm": 13.304734230041504,
480
+ "learning_rate": 9.97625431384671e-05,
481
+ "loss": 0.6854,
482
+ "step": 490
483
+ },
484
+ {
485
+ "epoch": 2.996978851963746,
486
+ "eval_validation_loss": 0.8451957702636719,
487
+ "eval_validation_mae": 0.6374915242195129,
488
+ "eval_validation_mse": 0.8451957702636719,
489
+ "eval_validation_pearson": 0.8118986310850005,
490
+ "eval_validation_rmse": 0.9193453192710876,
491
+ "eval_validation_runtime": 128.9657,
492
+ "eval_validation_samples_per_second": 2.652,
493
+ "eval_validation_spearman": 0.8191585157836645,
494
+ "eval_validation_steps_per_second": 2.652,
495
+ "step": 496
496
+ },
497
+ {
498
+ "epoch": 2.996978851963746,
499
+ "eval_test_loss": 0.8352708220481873,
500
+ "eval_test_mae": 0.6319109201431274,
501
+ "eval_test_mse": 0.8352708220481873,
502
+ "eval_test_pearson": 0.8117305964358366,
503
+ "eval_test_rmse": 0.9139315485954285,
504
+ "eval_test_runtime": 129.0007,
505
+ "eval_test_samples_per_second": 2.651,
506
+ "eval_test_spearman": 0.8192730040390717,
507
+ "eval_test_steps_per_second": 2.651,
508
+ "step": 496
509
+ },
510
+ {
511
+ "epoch": 2.996978851963746,
512
+ "eval_myoglobin_loss": 0.8053471446037292,
513
+ "eval_myoglobin_mae": 0.6675698757171631,
514
+ "eval_myoglobin_mse": 0.8053471446037292,
515
+ "eval_myoglobin_pearson": 0.6029393659750895,
516
+ "eval_myoglobin_rmse": 0.8974113464355469,
517
+ "eval_myoglobin_runtime": 50.3201,
518
+ "eval_myoglobin_samples_per_second": 2.663,
519
+ "eval_myoglobin_spearman": 0.6248470874841663,
520
+ "eval_myoglobin_steps_per_second": 2.663,
521
+ "step": 496
522
+ },
523
+ {
524
+ "epoch": 2.996978851963746,
525
+ "eval_myoglobin_r_loss": 0.8195508122444153,
526
+ "eval_myoglobin_r_mae": 0.6755225658416748,
527
+ "eval_myoglobin_r_mse": 0.8195508122444153,
528
+ "eval_myoglobin_r_pearson": 0.6011248022884956,
529
+ "eval_myoglobin_r_rmse": 0.9052904844284058,
530
+ "eval_myoglobin_r_runtime": 50.3351,
531
+ "eval_myoglobin_r_samples_per_second": 2.662,
532
+ "eval_myoglobin_r_spearman": 0.6238021071928538,
533
+ "eval_myoglobin_r_steps_per_second": 2.662,
534
+ "step": 496
535
+ },
536
+ {
537
+ "epoch": 2.996978851963746,
538
+ "eval_p53_loss": 4.1966938972473145,
539
+ "eval_p53_mae": 1.6126927137374878,
540
+ "eval_p53_mse": 4.1966938972473145,
541
+ "eval_p53_pearson": 0.3033285839483142,
542
+ "eval_p53_rmse": 2.0485832691192627,
543
+ "eval_p53_runtime": 17.5229,
544
+ "eval_p53_samples_per_second": 2.397,
545
+ "eval_p53_spearman": 0.2764069529435432,
546
+ "eval_p53_steps_per_second": 2.397,
547
+ "step": 496
548
+ },
549
+ {
550
+ "epoch": 3.0211480362537766,
551
+ "grad_norm": 7.5384063720703125,
552
+ "learning_rate": 9.970689785771798e-05,
553
+ "loss": 0.4402,
554
+ "step": 500
555
+ },
556
+ {
557
+ "epoch": 3.081570996978852,
558
+ "grad_norm": 13.131170272827148,
559
+ "learning_rate": 9.964541925211612e-05,
560
+ "loss": 0.3957,
561
+ "step": 510
562
+ },
563
+ {
564
+ "epoch": 3.1419939577039275,
565
+ "grad_norm": 5.908936977386475,
566
+ "learning_rate": 9.957811453644847e-05,
567
+ "loss": 0.2294,
568
+ "step": 520
569
+ },
570
+ {
571
+ "epoch": 3.202416918429003,
572
+ "grad_norm": 7.813998699188232,
573
+ "learning_rate": 9.950499160922183e-05,
574
+ "loss": 0.2424,
575
+ "step": 530
576
+ },
577
+ {
578
+ "epoch": 3.2628398791540785,
579
+ "grad_norm": 9.31921100616455,
580
+ "learning_rate": 9.942605905173592e-05,
581
+ "loss": 0.3864,
582
+ "step": 540
583
+ },
584
+ {
585
+ "epoch": 3.323262839879154,
586
+ "grad_norm": 6.702413558959961,
587
+ "learning_rate": 9.934132612707632e-05,
588
+ "loss": 0.3761,
589
+ "step": 550
590
+ },
591
+ {
592
+ "epoch": 3.38368580060423,
593
+ "grad_norm": 9.235360145568848,
594
+ "learning_rate": 9.925080277902743e-05,
595
+ "loss": 0.2577,
596
+ "step": 560
597
+ },
598
+ {
599
+ "epoch": 3.4441087613293053,
600
+ "grad_norm": 10.142770767211914,
601
+ "learning_rate": 9.91544996309055e-05,
602
+ "loss": 0.322,
603
+ "step": 570
604
+ },
605
+ {
606
+ "epoch": 3.504531722054381,
607
+ "grad_norm": 12.021828651428223,
608
+ "learning_rate": 9.905242798431196e-05,
609
+ "loss": 0.2585,
610
+ "step": 580
611
+ },
612
+ {
613
+ "epoch": 3.5649546827794563,
614
+ "grad_norm": 7.120593070983887,
615
+ "learning_rate": 9.894459981780711e-05,
616
+ "loss": 0.3731,
617
+ "step": 590
618
+ },
619
+ {
620
+ "epoch": 3.6253776435045317,
621
+ "grad_norm": 9.525044441223145,
622
+ "learning_rate": 9.883102778550434e-05,
623
+ "loss": 0.3116,
624
+ "step": 600
625
+ },
626
+ {
627
+ "epoch": 3.685800604229607,
628
+ "grad_norm": 13.72534465789795,
629
+ "learning_rate": 9.871172521558523e-05,
630
+ "loss": 0.3361,
631
+ "step": 610
632
+ },
633
+ {
634
+ "epoch": 3.7462235649546827,
635
+ "grad_norm": 12.432317733764648,
636
+ "learning_rate": 9.858670610873528e-05,
637
+ "loss": 0.2497,
638
+ "step": 620
639
+ },
640
+ {
641
+ "epoch": 3.806646525679758,
642
+ "grad_norm": 7.066479206085205,
643
+ "learning_rate": 9.845598513650103e-05,
644
+ "loss": 0.3698,
645
+ "step": 630
646
+ },
647
+ {
648
+ "epoch": 3.8670694864048336,
649
+ "grad_norm": 9.628741264343262,
650
+ "learning_rate": 9.831957763956813e-05,
651
+ "loss": 0.3154,
652
+ "step": 640
653
+ },
654
+ {
655
+ "epoch": 3.9274924471299095,
656
+ "grad_norm": 11.99616813659668,
657
+ "learning_rate": 9.817749962596115e-05,
658
+ "loss": 0.3181,
659
+ "step": 650
660
+ },
661
+ {
662
+ "epoch": 3.987915407854985,
663
+ "grad_norm": 9.52004623413086,
664
+ "learning_rate": 9.802976776916494e-05,
665
+ "loss": 0.2454,
666
+ "step": 660
667
+ },
668
+ {
669
+ "epoch": 4.0,
670
+ "eval_validation_loss": 0.7796794772148132,
671
+ "eval_validation_mae": 0.6025121808052063,
672
+ "eval_validation_mse": 0.7796794772148132,
673
+ "eval_validation_pearson": 0.8275993492481604,
674
+ "eval_validation_rmse": 0.8829945921897888,
675
+ "eval_validation_runtime": 128.9701,
676
+ "eval_validation_samples_per_second": 2.652,
677
+ "eval_validation_spearman": 0.8297559568405268,
678
+ "eval_validation_steps_per_second": 2.652,
679
+ "step": 662
680
+ },
681
+ {
682
+ "epoch": 4.0,
683
+ "eval_test_loss": 0.7812950015068054,
684
+ "eval_test_mae": 0.6039361953735352,
685
+ "eval_test_mse": 0.7812950015068054,
686
+ "eval_test_pearson": 0.8269227759203943,
687
+ "eval_test_rmse": 0.8839089274406433,
688
+ "eval_test_runtime": 128.9749,
689
+ "eval_test_samples_per_second": 2.652,
690
+ "eval_test_spearman": 0.8285234450057767,
691
+ "eval_test_steps_per_second": 2.652,
692
+ "step": 662
693
+ },
694
+ {
695
+ "epoch": 4.0,
696
+ "eval_myoglobin_loss": 0.8277855515480042,
697
+ "eval_myoglobin_mae": 0.670007050037384,
698
+ "eval_myoglobin_mse": 0.8277855515480042,
699
+ "eval_myoglobin_pearson": 0.6006203565841588,
700
+ "eval_myoglobin_rmse": 0.9098272323608398,
701
+ "eval_myoglobin_runtime": 50.3879,
702
+ "eval_myoglobin_samples_per_second": 2.659,
703
+ "eval_myoglobin_spearman": 0.6452977757628354,
704
+ "eval_myoglobin_steps_per_second": 2.659,
705
+ "step": 662
706
+ },
707
+ {
708
+ "epoch": 4.0,
709
+ "eval_myoglobin_r_loss": 0.8345361948013306,
710
+ "eval_myoglobin_r_mae": 0.6731795072555542,
711
+ "eval_myoglobin_r_mse": 0.8345361948013306,
712
+ "eval_myoglobin_r_pearson": 0.5986034577274495,
713
+ "eval_myoglobin_r_rmse": 0.9135295152664185,
714
+ "eval_myoglobin_r_runtime": 50.4043,
715
+ "eval_myoglobin_r_samples_per_second": 2.659,
716
+ "eval_myoglobin_r_spearman": 0.6437390342781807,
717
+ "eval_myoglobin_r_steps_per_second": 2.659,
718
+ "step": 662
719
+ },
720
+ {
721
+ "epoch": 4.0,
722
+ "eval_p53_loss": 3.8773045539855957,
723
+ "eval_p53_mae": 1.5222440958023071,
724
+ "eval_p53_mse": 3.8773045539855957,
725
+ "eval_p53_pearson": 0.3431499426951632,
726
+ "eval_p53_rmse": 1.9690872430801392,
727
+ "eval_p53_runtime": 17.5208,
728
+ "eval_p53_samples_per_second": 2.397,
729
+ "eval_p53_spearman": 0.3029050689249383,
730
+ "eval_p53_steps_per_second": 2.397,
731
+ "step": 662
732
+ },
733
+ {
734
+ "epoch": 4.04833836858006,
735
+ "grad_norm": 7.301741123199463,
736
+ "learning_rate": 9.787639940616788e-05,
737
+ "loss": 0.1719,
738
+ "step": 670
739
+ },
740
+ {
741
+ "epoch": 4.108761329305136,
742
+ "grad_norm": 7.330774784088135,
743
+ "learning_rate": 9.771741253542741e-05,
744
+ "loss": 0.2217,
745
+ "step": 680
746
+ },
747
+ {
748
+ "epoch": 4.169184290030212,
749
+ "grad_norm": 4.552008152008057,
750
+ "learning_rate": 9.755282581475769e-05,
751
+ "loss": 0.2123,
752
+ "step": 690
753
+ },
754
+ {
755
+ "epoch": 4.229607250755287,
756
+ "grad_norm": 4.030465126037598,
757
+ "learning_rate": 9.738265855914013e-05,
758
+ "loss": 0.1504,
759
+ "step": 700
760
+ },
761
+ {
762
+ "epoch": 4.290030211480363,
763
+ "grad_norm": 13.193011283874512,
764
+ "learning_rate": 9.720693073845667e-05,
765
+ "loss": 0.1922,
766
+ "step": 710
767
+ },
768
+ {
769
+ "epoch": 4.350453172205438,
770
+ "grad_norm": 7.253832817077637,
771
+ "learning_rate": 9.70256629751462e-05,
772
+ "loss": 0.1163,
773
+ "step": 720
774
+ },
775
+ {
776
+ "epoch": 4.410876132930514,
777
+ "grad_norm": 4.736353874206543,
778
+ "learning_rate": 9.683887654178445e-05,
779
+ "loss": 0.1391,
780
+ "step": 730
781
+ },
782
+ {
783
+ "epoch": 4.471299093655589,
784
+ "grad_norm": 8.901742935180664,
785
+ "learning_rate": 9.664659335858755e-05,
786
+ "loss": 0.1764,
787
+ "step": 740
788
+ },
789
+ {
790
+ "epoch": 4.531722054380665,
791
+ "grad_norm": 6.86702299118042,
792
+ "learning_rate": 9.644883599083958e-05,
793
+ "loss": 0.2754,
794
+ "step": 750
795
+ },
796
+ {
797
+ "epoch": 4.59214501510574,
798
+ "grad_norm": 7.574690818786621,
799
+ "learning_rate": 9.624562764624445e-05,
800
+ "loss": 0.1909,
801
+ "step": 760
802
+ },
803
+ {
804
+ "epoch": 4.652567975830816,
805
+ "grad_norm": 7.513609409332275,
806
+ "learning_rate": 9.603699217220239e-05,
807
+ "loss": 0.1752,
808
+ "step": 770
809
+ },
810
+ {
811
+ "epoch": 4.712990936555891,
812
+ "grad_norm": 6.482759475708008,
813
+ "learning_rate": 9.582295405301131e-05,
814
+ "loss": 0.1733,
815
+ "step": 780
816
+ },
817
+ {
818
+ "epoch": 4.7734138972809665,
819
+ "grad_norm": 12.048788070678711,
820
+ "learning_rate": 9.56035384069935e-05,
821
+ "loss": 0.1986,
822
+ "step": 790
823
+ },
824
+ {
825
+ "epoch": 4.833836858006042,
826
+ "grad_norm": 10.056185722351074,
827
+ "learning_rate": 9.537877098354786e-05,
828
+ "loss": 0.1798,
829
+ "step": 800
830
+ },
831
+ {
832
+ "epoch": 4.8942598187311175,
833
+ "grad_norm": 11.082551956176758,
834
+ "learning_rate": 9.514867816012809e-05,
835
+ "loss": 0.2064,
836
+ "step": 810
837
+ },
838
+ {
839
+ "epoch": 4.954682779456194,
840
+ "grad_norm": 11.846305847167969,
841
+ "learning_rate": 9.491328693914722e-05,
842
+ "loss": 0.1868,
843
+ "step": 820
844
+ },
845
+ {
846
+ "epoch": 4.996978851963746,
847
+ "eval_validation_loss": 0.7762272953987122,
848
+ "eval_validation_mae": 0.5832256078720093,
849
+ "eval_validation_mse": 0.7762272953987122,
850
+ "eval_validation_pearson": 0.8296356628068153,
851
+ "eval_validation_rmse": 0.8810376524925232,
852
+ "eval_validation_runtime": 129.148,
853
+ "eval_validation_samples_per_second": 2.648,
854
+ "eval_validation_spearman": 0.8313329837924658,
855
+ "eval_validation_steps_per_second": 2.648,
856
+ "step": 827
857
+ },
858
+ {
859
+ "epoch": 4.996978851963746,
860
+ "eval_test_loss": 0.7746825814247131,
861
+ "eval_test_mae": 0.5824557542800903,
862
+ "eval_test_mse": 0.7746825814247131,
863
+ "eval_test_pearson": 0.8289941215708723,
864
+ "eval_test_rmse": 0.8801605701446533,
865
+ "eval_test_runtime": 129.0007,
866
+ "eval_test_samples_per_second": 2.651,
867
+ "eval_test_spearman": 0.83056037563114,
868
+ "eval_test_steps_per_second": 2.651,
869
+ "step": 827
870
+ },
871
+ {
872
+ "epoch": 4.996978851963746,
873
+ "eval_myoglobin_loss": 0.7784543037414551,
874
+ "eval_myoglobin_mae": 0.6274142265319824,
875
+ "eval_myoglobin_mse": 0.7784543037414551,
876
+ "eval_myoglobin_pearson": 0.6119697266512958,
877
+ "eval_myoglobin_rmse": 0.8823005557060242,
878
+ "eval_myoglobin_runtime": 50.3439,
879
+ "eval_myoglobin_samples_per_second": 2.662,
880
+ "eval_myoglobin_spearman": 0.6659828987607965,
881
+ "eval_myoglobin_steps_per_second": 2.662,
882
+ "step": 827
883
+ },
884
+ {
885
+ "epoch": 4.996978851963746,
886
+ "eval_myoglobin_r_loss": 0.7818257212638855,
887
+ "eval_myoglobin_r_mae": 0.6293807625770569,
888
+ "eval_myoglobin_r_mse": 0.7818257212638855,
889
+ "eval_myoglobin_r_pearson": 0.6113029738762634,
890
+ "eval_myoglobin_r_rmse": 0.8842090964317322,
891
+ "eval_myoglobin_r_runtime": 50.3816,
892
+ "eval_myoglobin_r_samples_per_second": 2.66,
893
+ "eval_myoglobin_r_spearman": 0.6645264107175352,
894
+ "eval_myoglobin_r_steps_per_second": 2.66,
895
+ "step": 827
896
+ },
897
+ {
898
+ "epoch": 4.996978851963746,
899
+ "eval_p53_loss": 3.965853452682495,
900
+ "eval_p53_mae": 1.5425353050231934,
901
+ "eval_p53_mse": 3.965853452682495,
902
+ "eval_p53_pearson": 0.33295329636894166,
903
+ "eval_p53_rmse": 1.9914450645446777,
904
+ "eval_p53_runtime": 17.5238,
905
+ "eval_p53_samples_per_second": 2.397,
906
+ "eval_p53_spearman": 0.2773793608694659,
907
+ "eval_p53_steps_per_second": 2.397,
908
+ "step": 827
909
+ },
910
+ {
911
+ "epoch": 5.015105740181269,
912
+ "grad_norm": 6.842067241668701,
913
+ "learning_rate": 9.467262494480869e-05,
914
+ "loss": 0.1314,
915
+ "step": 830
916
+ },
917
+ {
918
+ "epoch": 5.075528700906345,
919
+ "grad_norm": 3.0206170082092285,
920
+ "learning_rate": 9.442672041986457e-05,
921
+ "loss": 0.077,
922
+ "step": 840
923
+ },
924
+ {
925
+ "epoch": 5.13595166163142,
926
+ "grad_norm": 4.4932355880737305,
927
+ "learning_rate": 9.417560222230115e-05,
928
+ "loss": 0.124,
929
+ "step": 850
930
+ },
931
+ {
932
+ "epoch": 5.196374622356496,
933
+ "grad_norm": 6.419702053070068,
934
+ "learning_rate": 9.391929982195232e-05,
935
+ "loss": 0.0903,
936
+ "step": 860
937
+ },
938
+ {
939
+ "epoch": 5.256797583081571,
940
+ "grad_norm": 5.199291229248047,
941
+ "learning_rate": 9.365784329704115e-05,
942
+ "loss": 0.117,
943
+ "step": 870
944
+ },
945
+ {
946
+ "epoch": 5.317220543806647,
947
+ "grad_norm": 6.824092864990234,
948
+ "learning_rate": 9.339126333065007e-05,
949
+ "loss": 0.1264,
950
+ "step": 880
951
+ },
952
+ {
953
+ "epoch": 5.377643504531722,
954
+ "grad_norm": 7.177854061126709,
955
+ "learning_rate": 9.31195912071201e-05,
956
+ "loss": 0.1153,
957
+ "step": 890
958
+ },
959
+ {
960
+ "epoch": 5.438066465256798,
961
+ "grad_norm": 6.354623317718506,
962
+ "learning_rate": 9.284285880837946e-05,
963
+ "loss": 0.1371,
964
+ "step": 900
965
+ },
966
+ {
967
+ "epoch": 5.498489425981873,
968
+ "grad_norm": 7.266890525817871,
969
+ "learning_rate": 9.256109861020213e-05,
970
+ "loss": 0.1253,
971
+ "step": 910
972
+ },
973
+ {
974
+ "epoch": 5.5589123867069485,
975
+ "grad_norm": 5.081086158752441,
976
+ "learning_rate": 9.22743436783966e-05,
977
+ "loss": 0.0986,
978
+ "step": 920
979
+ },
980
+ {
981
+ "epoch": 5.619335347432024,
982
+ "grad_norm": 4.2226762771606445,
983
+ "learning_rate": 9.198262766492554e-05,
984
+ "loss": 0.0786,
985
+ "step": 930
986
+ },
987
+ {
988
+ "epoch": 5.6797583081570995,
989
+ "grad_norm": 5.185201168060303,
990
+ "learning_rate": 9.168598480395651e-05,
991
+ "loss": 0.1014,
992
+ "step": 940
993
+ },
994
+ {
995
+ "epoch": 5.740181268882175,
996
+ "grad_norm": 8.60638427734375,
997
+ "learning_rate": 9.138444990784453e-05,
998
+ "loss": 0.1176,
999
+ "step": 950
1000
+ },
1001
+ {
1002
+ "epoch": 5.80060422960725,
1003
+ "grad_norm": 7.529942989349365,
1004
+ "learning_rate": 9.107805836304658e-05,
1005
+ "loss": 0.1253,
1006
+ "step": 960
1007
+ },
1008
+ {
1009
+ "epoch": 5.861027190332326,
1010
+ "grad_norm": 13.042976379394531,
1011
+ "learning_rate": 9.076684612596891e-05,
1012
+ "loss": 0.1317,
1013
+ "step": 970
1014
+ },
1015
+ {
1016
+ "epoch": 5.921450151057401,
1017
+ "grad_norm": 6.046395778656006,
1018
+ "learning_rate": 9.045084971874738e-05,
1019
+ "loss": 0.137,
1020
+ "step": 980
1021
+ },
1022
+ {
1023
+ "epoch": 5.981873111782478,
1024
+ "grad_norm": 7.686984062194824,
1025
+ "learning_rate": 9.013010622496144e-05,
1026
+ "loss": 0.116,
1027
+ "step": 990
1028
+ },
1029
+ {
1030
+ "epoch": 6.0,
1031
+ "eval_validation_loss": 0.7437558770179749,
1032
+ "eval_validation_mae": 0.560340940952301,
1033
+ "eval_validation_mse": 0.7437558770179749,
1034
+ "eval_validation_pearson": 0.8386764926773793,
1035
+ "eval_validation_rmse": 0.8624128103256226,
1036
+ "eval_validation_runtime": 128.9359,
1037
+ "eval_validation_samples_per_second": 2.652,
1038
+ "eval_validation_spearman": 0.8401724374307825,
1039
+ "eval_validation_steps_per_second": 2.652,
1040
+ "step": 993
1041
+ },
1042
+ {
1043
+ "epoch": 6.0,
1044
+ "eval_test_loss": 0.7406001091003418,
1045
+ "eval_test_mae": 0.558407723903656,
1046
+ "eval_test_mse": 0.7406001091003418,
1047
+ "eval_test_pearson": 0.8379222565005193,
1048
+ "eval_test_rmse": 0.8605812788009644,
1049
+ "eval_test_runtime": 128.8915,
1050
+ "eval_test_samples_per_second": 2.653,
1051
+ "eval_test_spearman": 0.8392344740172174,
1052
+ "eval_test_steps_per_second": 2.653,
1053
+ "step": 993
1054
+ },
1055
+ {
1056
+ "epoch": 6.0,
1057
+ "eval_myoglobin_loss": 0.7151281833648682,
1058
+ "eval_myoglobin_mae": 0.6059358716011047,
1059
+ "eval_myoglobin_mse": 0.7151281833648682,
1060
+ "eval_myoglobin_pearson": 0.6404604400648282,
1061
+ "eval_myoglobin_rmse": 0.8456525206565857,
1062
+ "eval_myoglobin_runtime": 50.3332,
1063
+ "eval_myoglobin_samples_per_second": 2.662,
1064
+ "eval_myoglobin_spearman": 0.695120141585149,
1065
+ "eval_myoglobin_steps_per_second": 2.662,
1066
+ "step": 993
1067
+ },
1068
+ {
1069
+ "epoch": 6.0,
1070
+ "eval_myoglobin_r_loss": 0.718505322933197,
1071
+ "eval_myoglobin_r_mae": 0.6069437861442566,
1072
+ "eval_myoglobin_r_mse": 0.718505322933197,
1073
+ "eval_myoglobin_r_pearson": 0.6396679973590835,
1074
+ "eval_myoglobin_r_rmse": 0.847646951675415,
1075
+ "eval_myoglobin_r_runtime": 50.3333,
1076
+ "eval_myoglobin_r_samples_per_second": 2.662,
1077
+ "eval_myoglobin_r_spearman": 0.6941175590622191,
1078
+ "eval_myoglobin_r_steps_per_second": 2.662,
1079
+ "step": 993
1080
+ },
1081
+ {
1082
+ "epoch": 6.0,
1083
+ "eval_p53_loss": 3.9644815921783447,
1084
+ "eval_p53_mae": 1.5399214029312134,
1085
+ "eval_p53_mse": 3.9644815921783447,
1086
+ "eval_p53_pearson": 0.3275408804183826,
1087
+ "eval_p53_rmse": 1.991100549697876,
1088
+ "eval_p53_runtime": 17.5252,
1089
+ "eval_p53_samples_per_second": 2.397,
1090
+ "eval_p53_spearman": 0.2655483977707391,
1091
+ "eval_p53_steps_per_second": 2.397,
1092
+ "step": 993
1093
+ },
1094
+ {
1095
+ "epoch": 6.042296072507553,
1096
+ "grad_norm": 4.583357334136963,
1097
+ "learning_rate": 8.980465328528219e-05,
1098
+ "loss": 0.0736,
1099
+ "step": 1000
1100
+ },
1101
+ {
1102
+ "epoch": 6.102719033232629,
1103
+ "grad_norm": 4.502609729766846,
1104
+ "learning_rate": 8.94745290930551e-05,
1105
+ "loss": 0.0453,
1106
+ "step": 1010
1107
+ },
1108
+ {
1109
+ "epoch": 6.163141993957704,
1110
+ "grad_norm": 4.533963203430176,
1111
+ "learning_rate": 8.913977238981778e-05,
1112
+ "loss": 0.0728,
1113
+ "step": 1020
1114
+ },
1115
+ {
1116
+ "epoch": 6.22356495468278,
1117
+ "grad_norm": 3.250356912612915,
1118
+ "learning_rate": 8.880042246075365e-05,
1119
+ "loss": 0.0644,
1120
+ "step": 1030
1121
+ },
1122
+ {
1123
+ "epoch": 6.283987915407855,
1124
+ "grad_norm": 5.3714470863342285,
1125
+ "learning_rate": 8.845651913008145e-05,
1126
+ "loss": 0.0654,
1127
+ "step": 1040
1128
+ },
1129
+ {
1130
+ "epoch": 6.3444108761329305,
1131
+ "grad_norm": 4.90172815322876,
1132
+ "learning_rate": 8.810810275638183e-05,
1133
+ "loss": 0.0976,
1134
+ "step": 1050
1135
+ },
1136
+ {
1137
+ "epoch": 6.404833836858006,
1138
+ "grad_norm": 4.008552551269531,
1139
+ "learning_rate": 8.775521422786104e-05,
1140
+ "loss": 0.0821,
1141
+ "step": 1060
1142
+ },
1143
+ {
1144
+ "epoch": 6.4652567975830815,
1145
+ "grad_norm": 6.362269878387451,
1146
+ "learning_rate": 8.739789495755253e-05,
1147
+ "loss": 0.0668,
1148
+ "step": 1070
1149
+ },
1150
+ {
1151
+ "epoch": 6.525679758308157,
1152
+ "grad_norm": 3.8847382068634033,
1153
+ "learning_rate": 8.703618687845696e-05,
1154
+ "loss": 0.0964,
1155
+ "step": 1080
1156
+ },
1157
+ {
1158
+ "epoch": 6.586102719033232,
1159
+ "grad_norm": 6.632302761077881,
1160
+ "learning_rate": 8.667013243862113e-05,
1161
+ "loss": 0.0588,
1162
+ "step": 1090
1163
+ },
1164
+ {
1165
+ "epoch": 6.646525679758308,
1166
+ "grad_norm": 4.578704833984375,
1167
+ "learning_rate": 8.629977459615655e-05,
1168
+ "loss": 0.1196,
1169
+ "step": 1100
1170
+ },
1171
+ {
1172
+ "epoch": 6.706948640483383,
1173
+ "grad_norm": 3.4470386505126953,
1174
+ "learning_rate": 8.592515681419813e-05,
1175
+ "loss": 0.099,
1176
+ "step": 1110
1177
+ },
1178
+ {
1179
+ "epoch": 6.76737160120846,
1180
+ "grad_norm": 5.420199871063232,
1181
+ "learning_rate": 8.554632305580354e-05,
1182
+ "loss": 0.08,
1183
+ "step": 1120
1184
+ },
1185
+ {
1186
+ "epoch": 6.827794561933535,
1187
+ "grad_norm": 2.7681386470794678,
1188
+ "learning_rate": 8.5163317778794e-05,
1189
+ "loss": 0.0618,
1190
+ "step": 1130
1191
+ },
1192
+ {
1193
+ "epoch": 6.888217522658611,
1194
+ "grad_norm": 6.834138870239258,
1195
+ "learning_rate": 8.477618593053693e-05,
1196
+ "loss": 0.1016,
1197
+ "step": 1140
1198
+ },
1199
+ {
1200
+ "epoch": 6.948640483383686,
1201
+ "grad_norm": 3.5948920249938965,
1202
+ "learning_rate": 8.438497294267117e-05,
1203
+ "loss": 0.0986,
1204
+ "step": 1150
1205
+ },
1206
+ {
1207
+ "epoch": 6.996978851963746,
1208
+ "eval_validation_loss": 0.7183637619018555,
1209
+ "eval_validation_mae": 0.546740710735321,
1210
+ "eval_validation_mse": 0.7183637619018555,
1211
+ "eval_validation_pearson": 0.8431698872855691,
1212
+ "eval_validation_rmse": 0.8475634455680847,
1213
+ "eval_validation_runtime": 129.0329,
1214
+ "eval_validation_samples_per_second": 2.65,
1215
+ "eval_validation_spearman": 0.8431269246509225,
1216
+ "eval_validation_steps_per_second": 2.65,
1217
+ "step": 1158
1218
+ },
1219
+ {
1220
+ "epoch": 6.996978851963746,
1221
+ "eval_test_loss": 0.7154301404953003,
1222
+ "eval_test_mae": 0.5448740720748901,
1223
+ "eval_test_mse": 0.7154301404953003,
1224
+ "eval_test_pearson": 0.8424761229464366,
1225
+ "eval_test_rmse": 0.845831036567688,
1226
+ "eval_test_runtime": 128.9049,
1227
+ "eval_test_samples_per_second": 2.653,
1228
+ "eval_test_spearman": 0.8418909616629952,
1229
+ "eval_test_steps_per_second": 2.653,
1230
+ "step": 1158
1231
+ },
1232
+ {
1233
+ "epoch": 6.996978851963746,
1234
+ "eval_myoglobin_loss": 0.7281521558761597,
1235
+ "eval_myoglobin_mae": 0.6114689707756042,
1236
+ "eval_myoglobin_mse": 0.7281521558761597,
1237
+ "eval_myoglobin_pearson": 0.6346894753809238,
1238
+ "eval_myoglobin_rmse": 0.8533183336257935,
1239
+ "eval_myoglobin_runtime": 50.3288,
1240
+ "eval_myoglobin_samples_per_second": 2.662,
1241
+ "eval_myoglobin_spearman": 0.6774278022377248,
1242
+ "eval_myoglobin_steps_per_second": 2.662,
1243
+ "step": 1158
1244
+ },
1245
+ {
1246
+ "epoch": 6.996978851963746,
1247
+ "eval_myoglobin_r_loss": 0.7321227192878723,
1248
+ "eval_myoglobin_r_mae": 0.6134014129638672,
1249
+ "eval_myoglobin_r_mse": 0.7321227192878723,
1250
+ "eval_myoglobin_r_pearson": 0.633906508155178,
1251
+ "eval_myoglobin_r_rmse": 0.8556417226791382,
1252
+ "eval_myoglobin_r_runtime": 50.3016,
1253
+ "eval_myoglobin_r_samples_per_second": 2.664,
1254
+ "eval_myoglobin_r_spearman": 0.6763329422189034,
1255
+ "eval_myoglobin_r_steps_per_second": 2.664,
1256
+ "step": 1158
1257
+ },
1258
+ {
1259
+ "epoch": 6.996978851963746,
1260
+ "eval_p53_loss": 4.01906156539917,
1261
+ "eval_p53_mae": 1.5535447597503662,
1262
+ "eval_p53_mse": 4.01906156539917,
1263
+ "eval_p53_pearson": 0.3308597233844921,
1264
+ "eval_p53_rmse": 2.0047597885131836,
1265
+ "eval_p53_runtime": 17.5174,
1266
+ "eval_p53_samples_per_second": 2.398,
1267
+ "eval_p53_spearman": 0.3005550831039583,
1268
+ "eval_p53_steps_per_second": 2.398,
1269
+ "step": 1158
1270
+ },
1271
+ {
1272
+ "epoch": 7.009063444108762,
1273
+ "grad_norm": 1.746163010597229,
1274
+ "learning_rate": 8.39897247257754e-05,
1275
+ "loss": 0.0779,
1276
+ "step": 1160
1277
+ },
1278
+ {
1279
+ "epoch": 7.069486404833837,
1280
+ "grad_norm": 3.740064859390259,
1281
+ "learning_rate": 8.359048766398031e-05,
1282
+ "loss": 0.0478,
1283
+ "step": 1170
1284
+ },
1285
+ {
1286
+ "epoch": 7.1299093655589125,
1287
+ "grad_norm": 3.6287150382995605,
1288
+ "learning_rate": 8.318730860952522e-05,
1289
+ "loss": 0.0989,
1290
+ "step": 1180
1291
+ },
1292
+ {
1293
+ "epoch": 7.190332326283988,
1294
+ "grad_norm": 5.2572784423828125,
1295
+ "learning_rate": 8.278023487725982e-05,
1296
+ "loss": 0.0487,
1297
+ "step": 1190
1298
+ },
1299
+ {
1300
+ "epoch": 7.2507552870090635,
1301
+ "grad_norm": 6.087055206298828,
1302
+ "learning_rate": 8.236931423909138e-05,
1303
+ "loss": 0.0552,
1304
+ "step": 1200
1305
+ },
1306
+ {
1307
+ "epoch": 7.311178247734139,
1308
+ "grad_norm": 4.305581092834473,
1309
+ "learning_rate": 8.19545949183788e-05,
1310
+ "loss": 0.0553,
1311
+ "step": 1210
1312
+ },
1313
+ {
1314
+ "epoch": 7.371601208459214,
1315
+ "grad_norm": 4.056112289428711,
1316
+ "learning_rate": 8.153612558427311e-05,
1317
+ "loss": 0.0463,
1318
+ "step": 1220
1319
+ },
1320
+ {
1321
+ "epoch": 7.43202416918429,
1322
+ "grad_norm": 2.302788734436035,
1323
+ "learning_rate": 8.111395534600603e-05,
1324
+ "loss": 0.0461,
1325
+ "step": 1230
1326
+ },
1327
+ {
1328
+ "epoch": 7.492447129909365,
1329
+ "grad_norm": 4.409842491149902,
1330
+ "learning_rate": 8.068813374712688e-05,
1331
+ "loss": 0.0692,
1332
+ "step": 1240
1333
+ },
1334
+ {
1335
+ "epoch": 7.552870090634441,
1336
+ "grad_norm": 5.059557914733887,
1337
+ "learning_rate": 8.025871075968828e-05,
1338
+ "loss": 0.0462,
1339
+ "step": 1250
1340
+ },
1341
+ {
1342
+ "epoch": 7.613293051359516,
1343
+ "grad_norm": 3.1334400177001953,
1344
+ "learning_rate": 7.982573677838172e-05,
1345
+ "loss": 0.0729,
1346
+ "step": 1260
1347
+ },
1348
+ {
1349
+ "epoch": 7.673716012084592,
1350
+ "grad_norm": 5.167521953582764,
1351
+ "learning_rate": 7.938926261462366e-05,
1352
+ "loss": 0.0728,
1353
+ "step": 1270
1354
+ },
1355
+ {
1356
+ "epoch": 7.734138972809667,
1357
+ "grad_norm": 7.676636219024658,
1358
+ "learning_rate": 7.894933949059245e-05,
1359
+ "loss": 0.0484,
1360
+ "step": 1280
1361
+ },
1362
+ {
1363
+ "epoch": 7.794561933534744,
1364
+ "grad_norm": 6.20352029800415,
1365
+ "learning_rate": 7.850601903321716e-05,
1366
+ "loss": 0.067,
1367
+ "step": 1290
1368
+ },
1369
+ {
1370
+ "epoch": 7.854984894259819,
1371
+ "grad_norm": 5.005100727081299,
1372
+ "learning_rate": 7.805935326811912e-05,
1373
+ "loss": 0.0519,
1374
+ "step": 1300
1375
+ },
1376
+ {
1377
+ "epoch": 7.9154078549848945,
1378
+ "grad_norm": 3.7345407009124756,
1379
+ "learning_rate": 7.760939461350623e-05,
1380
+ "loss": 0.0603,
1381
+ "step": 1310
1382
+ },
1383
+ {
1384
+ "epoch": 7.97583081570997,
1385
+ "grad_norm": 4.5358781814575195,
1386
+ "learning_rate": 7.715619587402164e-05,
1387
+ "loss": 0.0591,
1388
+ "step": 1320
1389
+ },
1390
+ {
1391
+ "epoch": 8.0,
1392
+ "eval_validation_loss": 0.7083848118782043,
1393
+ "eval_validation_mae": 0.5314592719078064,
1394
+ "eval_validation_mse": 0.7083848118782043,
1395
+ "eval_validation_pearson": 0.8431805287577845,
1396
+ "eval_validation_rmse": 0.841655969619751,
1397
+ "eval_validation_runtime": 129.0006,
1398
+ "eval_validation_samples_per_second": 2.651,
1399
+ "eval_validation_spearman": 0.8425645367331428,
1400
+ "eval_validation_steps_per_second": 2.651,
1401
+ "step": 1324
1402
+ },
1403
+ {
1404
+ "epoch": 8.0,
1405
+ "eval_test_loss": 0.7102228999137878,
1406
+ "eval_test_mae": 0.5320845246315002,
1407
+ "eval_test_mse": 0.7102228999137878,
1408
+ "eval_test_pearson": 0.8422782344343887,
1409
+ "eval_test_rmse": 0.8427472114562988,
1410
+ "eval_test_runtime": 129.0009,
1411
+ "eval_test_samples_per_second": 2.651,
1412
+ "eval_test_spearman": 0.841548097097326,
1413
+ "eval_test_steps_per_second": 2.651,
1414
+ "step": 1324
1415
+ },
1416
+ {
1417
+ "epoch": 8.0,
1418
+ "eval_myoglobin_loss": 0.7520159482955933,
1419
+ "eval_myoglobin_mae": 0.6083745956420898,
1420
+ "eval_myoglobin_mse": 0.7520159482955933,
1421
+ "eval_myoglobin_pearson": 0.6194230091693953,
1422
+ "eval_myoglobin_rmse": 0.8671885132789612,
1423
+ "eval_myoglobin_runtime": 50.3253,
1424
+ "eval_myoglobin_samples_per_second": 2.663,
1425
+ "eval_myoglobin_spearman": 0.6743252831866682,
1426
+ "eval_myoglobin_steps_per_second": 2.663,
1427
+ "step": 1324
1428
+ },
1429
+ {
1430
+ "epoch": 8.0,
1431
+ "eval_myoglobin_r_loss": 0.753574013710022,
1432
+ "eval_myoglobin_r_mae": 0.6102104783058167,
1433
+ "eval_myoglobin_r_mse": 0.753574013710022,
1434
+ "eval_myoglobin_r_pearson": 0.6190984211925752,
1435
+ "eval_myoglobin_r_rmse": 0.8680863976478577,
1436
+ "eval_myoglobin_r_runtime": 50.3474,
1437
+ "eval_myoglobin_r_samples_per_second": 2.662,
1438
+ "eval_myoglobin_r_spearman": 0.6751208648404359,
1439
+ "eval_myoglobin_r_steps_per_second": 2.662,
1440
+ "step": 1324
1441
+ },
1442
+ {
1443
+ "epoch": 8.0,
1444
+ "eval_p53_loss": 3.998952627182007,
1445
+ "eval_p53_mae": 1.5620994567871094,
1446
+ "eval_p53_mse": 3.998952627182007,
1447
+ "eval_p53_pearson": 0.32915902114473283,
1448
+ "eval_p53_rmse": 1.999738097190857,
1449
+ "eval_p53_runtime": 17.5293,
1450
+ "eval_p53_samples_per_second": 2.396,
1451
+ "eval_p53_spearman": 0.2918034117706535,
1452
+ "eval_p53_steps_per_second": 2.396,
1453
+ "step": 1324
1454
+ },
1455
+ {
1456
+ "epoch": 8.036253776435045,
1457
+ "grad_norm": 2.709134101867676,
1458
+ "learning_rate": 7.669981023454682e-05,
1459
+ "loss": 0.0448,
1460
+ "step": 1330
1461
+ },
1462
+ {
1463
+ "epoch": 8.09667673716012,
1464
+ "grad_norm": 3.791492223739624,
1465
+ "learning_rate": 7.624029125396004e-05,
1466
+ "loss": 0.0381,
1467
+ "step": 1340
1468
+ },
1469
+ {
1470
+ "epoch": 8.157099697885196,
1471
+ "grad_norm": 2.499840259552002,
1472
+ "learning_rate": 7.577769285885109e-05,
1473
+ "loss": 0.0399,
1474
+ "step": 1350
1475
+ },
1476
+ {
1477
+ "epoch": 8.217522658610273,
1478
+ "grad_norm": 4.001781463623047,
1479
+ "learning_rate": 7.53120693371927e-05,
1480
+ "loss": 0.0473,
1481
+ "step": 1360
1482
+ },
1483
+ {
1484
+ "epoch": 8.277945619335348,
1485
+ "grad_norm": 4.293058395385742,
1486
+ "learning_rate": 7.484347533196961e-05,
1487
+ "loss": 0.0423,
1488
+ "step": 1370
1489
+ },
1490
+ {
1491
+ "epoch": 8.338368580060424,
1492
+ "grad_norm": 6.312211990356445,
1493
+ "learning_rate": 7.437196583476596e-05,
1494
+ "loss": 0.0607,
1495
+ "step": 1380
1496
+ },
1497
+ {
1498
+ "epoch": 8.3987915407855,
1499
+ "grad_norm": 5.174694538116455,
1500
+ "learning_rate": 7.389759617931182e-05,
1501
+ "loss": 0.0513,
1502
+ "step": 1390
1503
+ },
1504
+ {
1505
+ "epoch": 8.459214501510575,
1506
+ "grad_norm": 3.2399227619171143,
1507
+ "learning_rate": 7.342042203498951e-05,
1508
+ "loss": 0.0264,
1509
+ "step": 1400
1510
+ },
1511
+ {
1512
+ "epoch": 8.51963746223565,
1513
+ "grad_norm": 3.2468111515045166,
1514
+ "learning_rate": 7.294049940030055e-05,
1515
+ "loss": 0.0397,
1516
+ "step": 1410
1517
+ },
1518
+ {
1519
+ "epoch": 8.580060422960726,
1520
+ "grad_norm": 4.853147983551025,
1521
+ "learning_rate": 7.245788459629396e-05,
1522
+ "loss": 0.0504,
1523
+ "step": 1420
1524
+ },
1525
+ {
1526
+ "epoch": 8.640483383685801,
1527
+ "grad_norm": 3.306142568588257,
1528
+ "learning_rate": 7.197263425995682e-05,
1529
+ "loss": 0.0405,
1530
+ "step": 1430
1531
+ },
1532
+ {
1533
+ "epoch": 8.700906344410877,
1534
+ "grad_norm": 3.277174711227417,
1535
+ "learning_rate": 7.14848053375676e-05,
1536
+ "loss": 0.0382,
1537
+ "step": 1440
1538
+ },
1539
+ {
1540
+ "epoch": 8.761329305135952,
1541
+ "grad_norm": 3.5067923069000244,
1542
+ "learning_rate": 7.099445507801323e-05,
1543
+ "loss": 0.0402,
1544
+ "step": 1450
1545
+ },
1546
+ {
1547
+ "epoch": 8.821752265861027,
1548
+ "grad_norm": 4.621323108673096,
1549
+ "learning_rate": 7.05016410260708e-05,
1550
+ "loss": 0.0601,
1551
+ "step": 1460
1552
+ },
1553
+ {
1554
+ "epoch": 8.882175226586103,
1555
+ "grad_norm": 2.868621587753296,
1556
+ "learning_rate": 7.000642101565434e-05,
1557
+ "loss": 0.0389,
1558
+ "step": 1470
1559
+ },
1560
+ {
1561
+ "epoch": 8.942598187311178,
1562
+ "grad_norm": 3.935192584991455,
1563
+ "learning_rate": 6.950885316302773e-05,
1564
+ "loss": 0.0809,
1565
+ "step": 1480
1566
+ },
1567
+ {
1568
+ "epoch": 8.996978851963746,
1569
+ "eval_validation_loss": 0.7205836176872253,
1570
+ "eval_validation_mae": 0.5384864807128906,
1571
+ "eval_validation_mse": 0.7205836176872253,
1572
+ "eval_validation_pearson": 0.8410535584549592,
1573
+ "eval_validation_rmse": 0.8488719463348389,
1574
+ "eval_validation_runtime": 129.1697,
1575
+ "eval_validation_samples_per_second": 2.648,
1576
+ "eval_validation_spearman": 0.8456297609552365,
1577
+ "eval_validation_steps_per_second": 2.648,
1578
+ "step": 1489
1579
+ },
1580
+ {
1581
+ "epoch": 8.996978851963746,
1582
+ "eval_test_loss": 0.7227678298950195,
1583
+ "eval_test_mae": 0.5399196743965149,
1584
+ "eval_test_mse": 0.7227678298950195,
1585
+ "eval_test_pearson": 0.8400645100430921,
1586
+ "eval_test_rmse": 0.8501575589179993,
1587
+ "eval_test_runtime": 129.1979,
1588
+ "eval_test_samples_per_second": 2.647,
1589
+ "eval_test_spearman": 0.8440572355074423,
1590
+ "eval_test_steps_per_second": 2.647,
1591
+ "step": 1489
1592
+ },
1593
+ {
1594
+ "epoch": 8.996978851963746,
1595
+ "eval_myoglobin_loss": 0.7287605404853821,
1596
+ "eval_myoglobin_mae": 0.611557126045227,
1597
+ "eval_myoglobin_mse": 0.7287605404853821,
1598
+ "eval_myoglobin_pearson": 0.6309798982049415,
1599
+ "eval_myoglobin_rmse": 0.8536747097969055,
1600
+ "eval_myoglobin_runtime": 50.4254,
1601
+ "eval_myoglobin_samples_per_second": 2.657,
1602
+ "eval_myoglobin_spearman": 0.6759837841263407,
1603
+ "eval_myoglobin_steps_per_second": 2.657,
1604
+ "step": 1489
1605
+ },
1606
+ {
1607
+ "epoch": 8.996978851963746,
1608
+ "eval_myoglobin_r_loss": 0.7296391725540161,
1609
+ "eval_myoglobin_r_mae": 0.6120913028717041,
1610
+ "eval_myoglobin_r_mse": 0.7296391725540161,
1611
+ "eval_myoglobin_r_pearson": 0.63082406938247,
1612
+ "eval_myoglobin_r_rmse": 0.8541892170906067,
1613
+ "eval_myoglobin_r_runtime": 50.4386,
1614
+ "eval_myoglobin_r_samples_per_second": 2.657,
1615
+ "eval_myoglobin_r_spearman": 0.6778293340441719,
1616
+ "eval_myoglobin_r_steps_per_second": 2.657,
1617
+ "step": 1489
1618
+ },
1619
+ {
1620
+ "epoch": 8.996978851963746,
1621
+ "eval_p53_loss": 3.9503331184387207,
1622
+ "eval_p53_mae": 1.5360697507858276,
1623
+ "eval_p53_mse": 3.9503331184387207,
1624
+ "eval_p53_pearson": 0.32911657672957517,
1625
+ "eval_p53_rmse": 1.9875445365905762,
1626
+ "eval_p53_runtime": 17.5407,
1627
+ "eval_p53_samples_per_second": 2.394,
1628
+ "eval_p53_spearman": 0.2984481992644589,
1629
+ "eval_p53_steps_per_second": 2.394,
1630
+ "step": 1489
1631
+ },
1632
+ {
1633
+ "epoch": 9.003021148036254,
1634
+ "grad_norm": 3.8243162631988525,
1635
+ "learning_rate": 6.90089958599846e-05,
1636
+ "loss": 0.0256,
1637
+ "step": 1490
1638
+ },
1639
+ {
1640
+ "epoch": 9.06344410876133,
1641
+ "grad_norm": 1.7380579710006714,
1642
+ "learning_rate": 6.850690776699573e-05,
1643
+ "loss": 0.0361,
1644
+ "step": 1500
1645
+ },
1646
+ {
1647
+ "epoch": 9.123867069486405,
1648
+ "grad_norm": 3.18019962310791,
1649
+ "learning_rate": 6.800264780632494e-05,
1650
+ "loss": 0.0318,
1651
+ "step": 1510
1652
+ },
1653
+ {
1654
+ "epoch": 9.18429003021148,
1655
+ "grad_norm": 3.6475789546966553,
1656
+ "learning_rate": 6.749627515511442e-05,
1657
+ "loss": 0.0308,
1658
+ "step": 1520
1659
+ },
1660
+ {
1661
+ "epoch": 9.244712990936556,
1662
+ "grad_norm": 2.547883987426758,
1663
+ "learning_rate": 6.698784923843992e-05,
1664
+ "loss": 0.0266,
1665
+ "step": 1530
1666
+ },
1667
+ {
1668
+ "epoch": 9.305135951661631,
1669
+ "grad_norm": 2.434000015258789,
1670
+ "learning_rate": 6.647742972233703e-05,
1671
+ "loss": 0.0302,
1672
+ "step": 1540
1673
+ },
1674
+ {
1675
+ "epoch": 9.365558912386707,
1676
+ "grad_norm": 5.521008014678955,
1677
+ "learning_rate": 6.5965076506799e-05,
1678
+ "loss": 0.0358,
1679
+ "step": 1550
1680
+ },
1681
+ {
1682
+ "epoch": 9.425981873111782,
1683
+ "grad_norm": 3.3292031288146973,
1684
+ "learning_rate": 6.545084971874738e-05,
1685
+ "loss": 0.037,
1686
+ "step": 1560
1687
+ },
1688
+ {
1689
+ "epoch": 9.486404833836858,
1690
+ "grad_norm": 3.638929605484009,
1691
+ "learning_rate": 6.493480970497569e-05,
1692
+ "loss": 0.0422,
1693
+ "step": 1570
1694
+ },
1695
+ {
1696
+ "epoch": 9.546827794561933,
1697
+ "grad_norm": 2.6074700355529785,
1698
+ "learning_rate": 6.441701702506754e-05,
1699
+ "loss": 0.0292,
1700
+ "step": 1580
1701
+ },
1702
+ {
1703
+ "epoch": 9.607250755287009,
1704
+ "grad_norm": 2.0434532165527344,
1705
+ "learning_rate": 6.389753244428972e-05,
1706
+ "loss": 0.021,
1707
+ "step": 1590
1708
+ },
1709
+ {
1710
+ "epoch": 9.667673716012084,
1711
+ "grad_norm": 4.6780524253845215,
1712
+ "learning_rate": 6.337641692646106e-05,
1713
+ "loss": 0.0366,
1714
+ "step": 1600
1715
+ },
1716
+ {
1717
+ "epoch": 9.72809667673716,
1718
+ "grad_norm": 3.076948642730713,
1719
+ "learning_rate": 6.285373162679803e-05,
1720
+ "loss": 0.0295,
1721
+ "step": 1610
1722
+ },
1723
+ {
1724
+ "epoch": 9.788519637462235,
1725
+ "grad_norm": 4.26541805267334,
1726
+ "learning_rate": 6.232953788473811e-05,
1727
+ "loss": 0.0274,
1728
+ "step": 1620
1729
+ },
1730
+ {
1731
+ "epoch": 9.84894259818731,
1732
+ "grad_norm": 2.9073691368103027,
1733
+ "learning_rate": 6.1803897216741e-05,
1734
+ "loss": 0.0324,
1735
+ "step": 1630
1736
+ },
1737
+ {
1738
+ "epoch": 9.909365558912386,
1739
+ "grad_norm": 4.8707990646362305,
1740
+ "learning_rate": 6.127687130906972e-05,
1741
+ "loss": 0.0401,
1742
+ "step": 1640
1743
+ },
1744
+ {
1745
+ "epoch": 9.969788519637461,
1746
+ "grad_norm": 4.892948150634766,
1747
+ "learning_rate": 6.0748522010551215e-05,
1748
+ "loss": 0.0461,
1749
+ "step": 1650
1750
+ },
1751
+ {
1752
+ "epoch": 10.0,
1753
+ "eval_validation_loss": 0.7096761465072632,
1754
+ "eval_validation_mae": 0.5233484506607056,
1755
+ "eval_validation_mse": 0.7096761465072632,
1756
+ "eval_validation_pearson": 0.8435451256537725,
1757
+ "eval_validation_rmse": 0.8424227833747864,
1758
+ "eval_validation_runtime": 128.988,
1759
+ "eval_validation_samples_per_second": 2.651,
1760
+ "eval_validation_spearman": 0.8466052369033387,
1761
+ "eval_validation_steps_per_second": 2.651,
1762
+ "step": 1655
1763
+ },
1764
+ {
1765
+ "epoch": 10.0,
1766
+ "eval_test_loss": 0.7089855074882507,
1767
+ "eval_test_mae": 0.5238877534866333,
1768
+ "eval_test_mse": 0.7089855074882507,
1769
+ "eval_test_pearson": 0.8428611030663262,
1770
+ "eval_test_rmse": 0.8420127630233765,
1771
+ "eval_test_runtime": 128.948,
1772
+ "eval_test_samples_per_second": 2.652,
1773
+ "eval_test_spearman": 0.8455169232513489,
1774
+ "eval_test_steps_per_second": 2.652,
1775
+ "step": 1655
1776
+ },
1777
+ {
1778
+ "epoch": 10.0,
1779
+ "eval_myoglobin_loss": 0.7252078056335449,
1780
+ "eval_myoglobin_mae": 0.6012104749679565,
1781
+ "eval_myoglobin_mse": 0.7252078056335449,
1782
+ "eval_myoglobin_pearson": 0.636367763746648,
1783
+ "eval_myoglobin_rmse": 0.8515913486480713,
1784
+ "eval_myoglobin_runtime": 50.3593,
1785
+ "eval_myoglobin_samples_per_second": 2.661,
1786
+ "eval_myoglobin_spearman": 0.6746395254699745,
1787
+ "eval_myoglobin_steps_per_second": 2.661,
1788
+ "step": 1655
1789
+ },
1790
+ {
1791
+ "epoch": 10.0,
1792
+ "eval_myoglobin_r_loss": 0.726817786693573,
1793
+ "eval_myoglobin_r_mae": 0.6012217402458191,
1794
+ "eval_myoglobin_r_mse": 0.726817786693573,
1795
+ "eval_myoglobin_r_pearson": 0.6361235811248207,
1796
+ "eval_myoglobin_r_rmse": 0.8525360822677612,
1797
+ "eval_myoglobin_r_runtime": 50.3696,
1798
+ "eval_myoglobin_r_samples_per_second": 2.66,
1799
+ "eval_myoglobin_r_spearman": 0.6742305117044012,
1800
+ "eval_myoglobin_r_steps_per_second": 2.66,
1801
+ "step": 1655
1802
+ },
1803
+ {
1804
+ "epoch": 10.0,
1805
+ "eval_p53_loss": 3.99676775932312,
1806
+ "eval_p53_mae": 1.5769025087356567,
1807
+ "eval_p53_mse": 3.99676775932312,
1808
+ "eval_p53_pearson": 0.3403924251906574,
1809
+ "eval_p53_rmse": 1.9991917610168457,
1810
+ "eval_p53_runtime": 17.5283,
1811
+ "eval_p53_samples_per_second": 2.396,
1812
+ "eval_p53_spearman": 0.2857258622336363,
1813
+ "eval_p53_steps_per_second": 2.396,
1814
+ "step": 1655
1815
+ },
1816
+ {
1817
+ "epoch": 10.030211480362539,
1818
+ "grad_norm": 2.2076847553253174,
1819
+ "learning_rate": 6.021891132531825e-05,
1820
+ "loss": 0.0307,
1821
+ "step": 1660
1822
+ },
1823
+ {
1824
+ "epoch": 10.090634441087614,
1825
+ "grad_norm": 2.394089937210083,
1826
+ "learning_rate": 5.9688101405532925e-05,
1827
+ "loss": 0.0187,
1828
+ "step": 1670
1829
+ },
1830
+ {
1831
+ "epoch": 10.15105740181269,
1832
+ "grad_norm": 2.8219094276428223,
1833
+ "learning_rate": 5.9156154544092815e-05,
1834
+ "loss": 0.0259,
1835
+ "step": 1680
1836
+ },
1837
+ {
1838
+ "epoch": 10.211480362537765,
1839
+ "grad_norm": 2.308112144470215,
1840
+ "learning_rate": 5.862313316732063e-05,
1841
+ "loss": 0.027,
1842
+ "step": 1690
1843
+ },
1844
+ {
1845
+ "epoch": 10.27190332326284,
1846
+ "grad_norm": 2.3153975009918213,
1847
+ "learning_rate": 5.808909982763825e-05,
1848
+ "loss": 0.0238,
1849
+ "step": 1700
1850
+ },
1851
+ {
1852
+ "epoch": 10.332326283987916,
1853
+ "grad_norm": 3.1248135566711426,
1854
+ "learning_rate": 5.7554117196225846e-05,
1855
+ "loss": 0.0228,
1856
+ "step": 1710
1857
+ },
1858
+ {
1859
+ "epoch": 10.392749244712991,
1860
+ "grad_norm": 3.748237371444702,
1861
+ "learning_rate": 5.701824805566722e-05,
1862
+ "loss": 0.024,
1863
+ "step": 1720
1864
+ },
1865
+ {
1866
+ "epoch": 10.453172205438067,
1867
+ "grad_norm": 3.601746082305908,
1868
+ "learning_rate": 5.6481555292581946e-05,
1869
+ "loss": 0.0227,
1870
+ "step": 1730
1871
+ },
1872
+ {
1873
+ "epoch": 10.513595166163142,
1874
+ "grad_norm": 2.709296703338623,
1875
+ "learning_rate": 5.5944101890245324e-05,
1876
+ "loss": 0.0174,
1877
+ "step": 1740
1878
+ },
1879
+ {
1880
+ "epoch": 10.574018126888218,
1881
+ "grad_norm": 2.704521656036377,
1882
+ "learning_rate": 5.540595092119709e-05,
1883
+ "loss": 0.0315,
1884
+ "step": 1750
1885
+ },
1886
+ {
1887
+ "epoch": 10.634441087613293,
1888
+ "grad_norm": 3.2573423385620117,
1889
+ "learning_rate": 5.486716553983951e-05,
1890
+ "loss": 0.0287,
1891
+ "step": 1760
1892
+ },
1893
+ {
1894
+ "epoch": 10.694864048338369,
1895
+ "grad_norm": 4.1254682540893555,
1896
+ "learning_rate": 5.432780897502589e-05,
1897
+ "loss": 0.0311,
1898
+ "step": 1770
1899
+ },
1900
+ {
1901
+ "epoch": 10.755287009063444,
1902
+ "grad_norm": 4.715117454528809,
1903
+ "learning_rate": 5.378794452264053e-05,
1904
+ "loss": 0.022,
1905
+ "step": 1780
1906
+ },
1907
+ {
1908
+ "epoch": 10.81570996978852,
1909
+ "grad_norm": 2.1660892963409424,
1910
+ "learning_rate": 5.324763553817054e-05,
1911
+ "loss": 0.0211,
1912
+ "step": 1790
1913
+ },
1914
+ {
1915
+ "epoch": 10.876132930513595,
1916
+ "grad_norm": 3.1040217876434326,
1917
+ "learning_rate": 5.270694542927088e-05,
1918
+ "loss": 0.0347,
1919
+ "step": 1800
1920
+ },
1921
+ {
1922
+ "epoch": 10.93655589123867,
1923
+ "grad_norm": 2.742941379547119,
1924
+ "learning_rate": 5.216593764832311e-05,
1925
+ "loss": 0.0226,
1926
+ "step": 1810
1927
+ },
1928
+ {
1929
+ "epoch": 10.996978851963746,
1930
+ "grad_norm": 3.172571897506714,
1931
+ "learning_rate": 5.162467568498903e-05,
1932
+ "loss": 0.025,
1933
+ "step": 1820
1934
+ },
1935
+ {
1936
+ "epoch": 10.996978851963746,
1937
+ "eval_validation_loss": 0.6895685195922852,
1938
+ "eval_validation_mae": 0.5120286345481873,
1939
+ "eval_validation_mse": 0.6895685195922852,
1940
+ "eval_validation_pearson": 0.8489549431017167,
1941
+ "eval_validation_rmse": 0.8304026126861572,
1942
+ "eval_validation_runtime": 128.9581,
1943
+ "eval_validation_samples_per_second": 2.652,
1944
+ "eval_validation_spearman": 0.8526459553649132,
1945
+ "eval_validation_steps_per_second": 2.652,
1946
+ "step": 1820
1947
+ },
1948
+ {
1949
+ "epoch": 10.996978851963746,
1950
+ "eval_test_loss": 0.6883087158203125,
1951
+ "eval_test_mae": 0.5119835734367371,
1952
+ "eval_test_mse": 0.6883087158203125,
1953
+ "eval_test_pearson": 0.848377152226476,
1954
+ "eval_test_rmse": 0.829643726348877,
1955
+ "eval_test_runtime": 128.9938,
1956
+ "eval_test_samples_per_second": 2.651,
1957
+ "eval_test_spearman": 0.8513768312964548,
1958
+ "eval_test_steps_per_second": 2.651,
1959
+ "step": 1820
1960
+ },
1961
+ {
1962
+ "epoch": 10.996978851963746,
1963
+ "eval_myoglobin_loss": 0.7179906368255615,
1964
+ "eval_myoglobin_mae": 0.5888025760650635,
1965
+ "eval_myoglobin_mse": 0.7179906368255615,
1966
+ "eval_myoglobin_pearson": 0.6391641538657495,
1967
+ "eval_myoglobin_rmse": 0.8473432660102844,
1968
+ "eval_myoglobin_runtime": 50.3591,
1969
+ "eval_myoglobin_samples_per_second": 2.661,
1970
+ "eval_myoglobin_spearman": 0.6862166102248016,
1971
+ "eval_myoglobin_steps_per_second": 2.661,
1972
+ "step": 1820
1973
+ },
1974
+ {
1975
+ "epoch": 10.996978851963746,
1976
+ "eval_myoglobin_r_loss": 0.7204333543777466,
1977
+ "eval_myoglobin_r_mae": 0.5895242691040039,
1978
+ "eval_myoglobin_r_mse": 0.7204333543777466,
1979
+ "eval_myoglobin_r_pearson": 0.6383291059933072,
1980
+ "eval_myoglobin_r_rmse": 0.8487834334373474,
1981
+ "eval_myoglobin_r_runtime": 50.3628,
1982
+ "eval_myoglobin_r_samples_per_second": 2.661,
1983
+ "eval_myoglobin_r_spearman": 0.683303634138279,
1984
+ "eval_myoglobin_r_steps_per_second": 2.661,
1985
+ "step": 1820
1986
+ },
1987
+ {
1988
+ "epoch": 10.996978851963746,
1989
+ "eval_p53_loss": 4.048507213592529,
1990
+ "eval_p53_mae": 1.5837998390197754,
1991
+ "eval_p53_mse": 4.048507213592529,
1992
+ "eval_p53_pearson": 0.3178191790357232,
1993
+ "eval_p53_rmse": 2.0120902061462402,
1994
+ "eval_p53_runtime": 17.5292,
1995
+ "eval_p53_samples_per_second": 2.396,
1996
+ "eval_p53_spearman": 0.2873465421101742,
1997
+ "eval_p53_steps_per_second": 2.396,
1998
+ "step": 1820
1999
+ }
2000
+ ],
2001
+ "logging_steps": 10,
2002
+ "max_steps": 3300,
2003
+ "num_input_tokens_seen": 0,
2004
+ "num_train_epochs": 20,
2005
+ "save_steps": 500,
2006
+ "stateful_callbacks": {
2007
+ "TrainerControl": {
2008
+ "args": {
2009
+ "should_epoch_stop": false,
2010
+ "should_evaluate": false,
2011
+ "should_log": false,
2012
+ "should_save": true,
2013
+ "should_training_stop": false
2014
+ },
2015
+ "attributes": {}
2016
+ }
2017
+ },
2018
+ "total_flos": 0.0,
2019
+ "train_batch_size": 1,
2020
+ "trial_name": null,
2021
+ "trial_params": null
2022
+ }