2023-10-17 08:54:26,109 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,111 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:54:26,111 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 08:54:26,112 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 Train:  6183 sentences
2023-10-17 08:54:26,112         (train_with_dev=False, train_with_test=False)
2023-10-17 08:54:26,112 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 Training Params:
2023-10-17 08:54:26,112  - learning_rate: "3e-05" 
2023-10-17 08:54:26,112  - mini_batch_size: "4"
2023-10-17 08:54:26,112  - max_epochs: "10"
2023-10-17 08:54:26,112  - shuffle: "True"
2023-10-17 08:54:26,112 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 Plugins:
2023-10-17 08:54:26,112  - TensorboardLogger
2023-10-17 08:54:26,112  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:54:26,113  - metric: "('micro avg', 'f1-score')"
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Computation:
2023-10-17 08:54:26,113  - compute on device: cuda:0
2023-10-17 08:54:26,113  - embedding storage: none
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
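The configuration summarized above corresponds to a standard Flair fine-tuning run. A minimal sketch that would set up an equivalent run is shown below; the model name, dataset loader arguments, and pooling/layer settings are assumptions inferred from the logged base path ("bs4", "poolingfirst", "layers-1", "crfFalse"), not a copy of the original script.

```python
# Hypothetical reconstruction of the run configured in the log above.
# Hyperparameters are taken verbatim from the "Training Params" section;
# everything else is inferred from the base path and may differ from the
# original script.

LEARNING_RATE = 3e-5
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10

if __name__ == "__main__":
    # Flair imports are deferred so the constants above can be inspected
    # without Flair installed.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # topres19th/en subset of HIPE-2022, as named in the corpus line of the log
    corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")

    embeddings = TransformerWordEmbeddings(
        "hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",               # last layer only, per "layers-1" in the path
        subtoken_pooling="first",  # per "poolingfirst" in the path
        fine_tune=True,
    )
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
        tag_type="ner",
        use_crf=False,             # per "crfFalse" in the path
        use_rnn=False,
        reproject_embeddings=False,
    )
    ModelTrainer(tagger, corpus).fine_tune(
        "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1",
        learning_rate=LEARNING_RATE,
        mini_batch_size=MINI_BATCH_SIZE,
        max_epochs=MAX_EPOCHS,
    )
```

`fine_tune` applies the linear warmup schedule seen in the log (the LinearScheduler plugin with warmup_fraction 0.1 explains the learning rate rising through epoch 1 and decaying to 0 by epoch 10).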
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:54:38,761 epoch 1 - iter 154/1546 - loss 2.03813618 - time (sec): 12.65 - samples/sec: 1016.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:54:50,424 epoch 1 - iter 308/1546 - loss 1.16576737 - time (sec): 24.31 - samples/sec: 1031.97 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:55:01,974 epoch 1 - iter 462/1546 - loss 0.82780908 - time (sec): 35.86 - samples/sec: 1043.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:55:13,456 epoch 1 - iter 616/1546 - loss 0.64844751 - time (sec): 47.34 - samples/sec: 1064.97 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:55:26,298 epoch 1 - iter 770/1546 - loss 0.54447510 - time (sec): 60.18 - samples/sec: 1040.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:55:38,367 epoch 1 - iter 924/1546 - loss 0.47295779 - time (sec): 72.25 - samples/sec: 1037.01 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:55:50,285 epoch 1 - iter 1078/1546 - loss 0.43087146 - time (sec): 84.17 - samples/sec: 1030.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:56:01,700 epoch 1 - iter 1232/1546 - loss 0.39626057 - time (sec): 95.59 - samples/sec: 1033.71 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:56:13,219 epoch 1 - iter 1386/1546 - loss 0.36195701 - time (sec): 107.10 - samples/sec: 1041.77 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:56:25,365 epoch 1 - iter 1540/1546 - loss 0.33627435 - time (sec): 119.25 - samples/sec: 1039.81 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:56:25,820 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:25,821 EPOCH 1 done: loss 0.3357 - lr: 0.000030
2023-10-17 08:56:28,051 DEV : loss 0.06119954213500023 - f1-score (micro avg)  0.7417
2023-10-17 08:56:28,079 saving best model
2023-10-17 08:56:28,614 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:40,142 epoch 2 - iter 154/1546 - loss 0.10453700 - time (sec): 11.53 - samples/sec: 1025.22 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:56:52,687 epoch 2 - iter 308/1546 - loss 0.08704820 - time (sec): 24.07 - samples/sec: 1003.63 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:57:05,545 epoch 2 - iter 462/1546 - loss 0.08423983 - time (sec): 36.93 - samples/sec: 1020.87 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:57:17,766 epoch 2 - iter 616/1546 - loss 0.08543159 - time (sec): 49.15 - samples/sec: 1019.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:57:30,051 epoch 2 - iter 770/1546 - loss 0.08700069 - time (sec): 61.43 - samples/sec: 1021.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:57:42,483 epoch 2 - iter 924/1546 - loss 0.08706791 - time (sec): 73.87 - samples/sec: 1014.33 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:57:55,337 epoch 2 - iter 1078/1546 - loss 0.08512417 - time (sec): 86.72 - samples/sec: 1008.52 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:58:07,949 epoch 2 - iter 1232/1546 - loss 0.08386350 - time (sec): 99.33 - samples/sec: 1012.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:58:20,787 epoch 2 - iter 1386/1546 - loss 0.08241631 - time (sec): 112.17 - samples/sec: 1000.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:58:32,838 epoch 2 - iter 1540/1546 - loss 0.08294873 - time (sec): 124.22 - samples/sec: 998.40 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:58:33,293 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:33,293 EPOCH 2 done: loss 0.0831 - lr: 0.000027
2023-10-17 08:58:36,749 DEV : loss 0.06554654985666275 - f1-score (micro avg)  0.7308
2023-10-17 08:58:36,779 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:48,693 epoch 3 - iter 154/1546 - loss 0.05336694 - time (sec): 11.91 - samples/sec: 981.36 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:01,019 epoch 3 - iter 308/1546 - loss 0.05246151 - time (sec): 24.24 - samples/sec: 1024.44 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:13,098 epoch 3 - iter 462/1546 - loss 0.04970324 - time (sec): 36.32 - samples/sec: 1051.46 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:25,184 epoch 3 - iter 616/1546 - loss 0.04604475 - time (sec): 48.40 - samples/sec: 1045.28 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:59:37,027 epoch 3 - iter 770/1546 - loss 0.04734557 - time (sec): 60.25 - samples/sec: 1036.50 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:59:48,854 epoch 3 - iter 924/1546 - loss 0.04896834 - time (sec): 72.07 - samples/sec: 1041.64 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:00:00,733 epoch 3 - iter 1078/1546 - loss 0.05098204 - time (sec): 83.95 - samples/sec: 1038.22 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:00:13,133 epoch 3 - iter 1232/1546 - loss 0.05011942 - time (sec): 96.35 - samples/sec: 1032.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:00:25,611 epoch 3 - iter 1386/1546 - loss 0.05242302 - time (sec): 108.83 - samples/sec: 1013.79 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:00:37,699 epoch 3 - iter 1540/1546 - loss 0.05287108 - time (sec): 120.92 - samples/sec: 1024.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:00:38,164 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:38,164 EPOCH 3 done: loss 0.0528 - lr: 0.000023
2023-10-17 09:00:41,078 DEV : loss 0.06000832840800285 - f1-score (micro avg)  0.8048
2023-10-17 09:00:41,106 saving best model
2023-10-17 09:00:42,500 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:54,331 epoch 4 - iter 154/1546 - loss 0.03629651 - time (sec): 11.83 - samples/sec: 1087.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:01:06,052 epoch 4 - iter 308/1546 - loss 0.03234710 - time (sec): 23.55 - samples/sec: 1041.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:01:18,027 epoch 4 - iter 462/1546 - loss 0.03335157 - time (sec): 35.52 - samples/sec: 1056.04 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:01:30,170 epoch 4 - iter 616/1546 - loss 0.03256761 - time (sec): 47.67 - samples/sec: 1049.02 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:01:42,048 epoch 4 - iter 770/1546 - loss 0.03247698 - time (sec): 59.54 - samples/sec: 1048.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:01:53,943 epoch 4 - iter 924/1546 - loss 0.03387399 - time (sec): 71.44 - samples/sec: 1054.17 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:02:05,990 epoch 4 - iter 1078/1546 - loss 0.03389742 - time (sec): 83.49 - samples/sec: 1054.23 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:02:17,845 epoch 4 - iter 1232/1546 - loss 0.03395574 - time (sec): 95.34 - samples/sec: 1045.74 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:02:29,692 epoch 4 - iter 1386/1546 - loss 0.03458275 - time (sec): 107.19 - samples/sec: 1040.29 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:02:41,652 epoch 4 - iter 1540/1546 - loss 0.03469566 - time (sec): 119.15 - samples/sec: 1040.31 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:02:42,104 ----------------------------------------------------------------------------------------------------
2023-10-17 09:02:42,104 EPOCH 4 done: loss 0.0350 - lr: 0.000020
2023-10-17 09:02:44,928 DEV : loss 0.08604831993579865 - f1-score (micro avg)  0.7776
2023-10-17 09:02:44,958 ----------------------------------------------------------------------------------------------------
2023-10-17 09:02:57,022 epoch 5 - iter 154/1546 - loss 0.02284007 - time (sec): 12.06 - samples/sec: 983.65 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:03:09,189 epoch 5 - iter 308/1546 - loss 0.01803104 - time (sec): 24.23 - samples/sec: 1002.50 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:03:21,262 epoch 5 - iter 462/1546 - loss 0.01912710 - time (sec): 36.30 - samples/sec: 996.31 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:03:33,353 epoch 5 - iter 616/1546 - loss 0.01989178 - time (sec): 48.39 - samples/sec: 1000.35 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:03:45,469 epoch 5 - iter 770/1546 - loss 0.02252057 - time (sec): 60.51 - samples/sec: 1014.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:03:57,436 epoch 5 - iter 924/1546 - loss 0.02304384 - time (sec): 72.48 - samples/sec: 1021.05 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:04:09,542 epoch 5 - iter 1078/1546 - loss 0.02187336 - time (sec): 84.58 - samples/sec: 1022.70 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:04:21,529 epoch 5 - iter 1232/1546 - loss 0.02255478 - time (sec): 96.57 - samples/sec: 1021.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:04:33,945 epoch 5 - iter 1386/1546 - loss 0.02290087 - time (sec): 108.98 - samples/sec: 1026.82 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:04:46,517 epoch 5 - iter 1540/1546 - loss 0.02398018 - time (sec): 121.56 - samples/sec: 1018.04 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:04:47,010 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:47,010 EPOCH 5 done: loss 0.0242 - lr: 0.000017
2023-10-17 09:04:50,104 DEV : loss 0.09960237890481949 - f1-score (micro avg)  0.7876
2023-10-17 09:04:50,137 ----------------------------------------------------------------------------------------------------
2023-10-17 09:05:02,656 epoch 6 - iter 154/1546 - loss 0.01677459 - time (sec): 12.52 - samples/sec: 1024.75 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:05:14,857 epoch 6 - iter 308/1546 - loss 0.01372105 - time (sec): 24.72 - samples/sec: 1041.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:05:27,217 epoch 6 - iter 462/1546 - loss 0.01421126 - time (sec): 37.08 - samples/sec: 1025.57 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:05:39,731 epoch 6 - iter 616/1546 - loss 0.01600038 - time (sec): 49.59 - samples/sec: 1019.56 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:05:52,141 epoch 6 - iter 770/1546 - loss 0.01705876 - time (sec): 62.00 - samples/sec: 1024.70 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:06:04,991 epoch 6 - iter 924/1546 - loss 0.01729932 - time (sec): 74.85 - samples/sec: 1003.82 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:06:17,860 epoch 6 - iter 1078/1546 - loss 0.01687148 - time (sec): 87.72 - samples/sec: 990.68 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:06:30,814 epoch 6 - iter 1232/1546 - loss 0.01640763 - time (sec): 100.67 - samples/sec: 980.55 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:06:44,004 epoch 6 - iter 1386/1546 - loss 0.01674242 - time (sec): 113.86 - samples/sec: 978.03 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:06:57,512 epoch 6 - iter 1540/1546 - loss 0.01702449 - time (sec): 127.37 - samples/sec: 972.60 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:06:58,036 ----------------------------------------------------------------------------------------------------
2023-10-17 09:06:58,037 EPOCH 6 done: loss 0.0170 - lr: 0.000013
2023-10-17 09:07:00,854 DEV : loss 0.09961654990911484 - f1-score (micro avg)  0.7976
2023-10-17 09:07:00,882 ----------------------------------------------------------------------------------------------------
2023-10-17 09:07:14,298 epoch 7 - iter 154/1546 - loss 0.00495503 - time (sec): 13.41 - samples/sec: 873.42 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:07:26,536 epoch 7 - iter 308/1546 - loss 0.01087828 - time (sec): 25.65 - samples/sec: 925.23 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:07:38,433 epoch 7 - iter 462/1546 - loss 0.01334889 - time (sec): 37.55 - samples/sec: 964.89 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:07:50,256 epoch 7 - iter 616/1546 - loss 0.01321695 - time (sec): 49.37 - samples/sec: 991.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:08:01,813 epoch 7 - iter 770/1546 - loss 0.01313843 - time (sec): 60.93 - samples/sec: 1010.71 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:08:13,352 epoch 7 - iter 924/1546 - loss 0.01154668 - time (sec): 72.47 - samples/sec: 1020.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:08:24,980 epoch 7 - iter 1078/1546 - loss 0.01103876 - time (sec): 84.10 - samples/sec: 1022.36 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:08:36,738 epoch 7 - iter 1232/1546 - loss 0.01096499 - time (sec): 95.85 - samples/sec: 1033.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:08:48,467 epoch 7 - iter 1386/1546 - loss 0.01135222 - time (sec): 107.58 - samples/sec: 1039.84 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:09:01,265 epoch 7 - iter 1540/1546 - loss 0.01229847 - time (sec): 120.38 - samples/sec: 1027.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:09:01,773 ----------------------------------------------------------------------------------------------------
2023-10-17 09:09:01,774 EPOCH 7 done: loss 0.0122 - lr: 0.000010
2023-10-17 09:09:04,924 DEV : loss 0.10686086863279343 - f1-score (micro avg)  0.8068
2023-10-17 09:09:04,958 saving best model
2023-10-17 09:09:06,388 ----------------------------------------------------------------------------------------------------
2023-10-17 09:09:18,894 epoch 8 - iter 154/1546 - loss 0.00730795 - time (sec): 12.50 - samples/sec: 990.35 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:09:31,198 epoch 8 - iter 308/1546 - loss 0.00668288 - time (sec): 24.80 - samples/sec: 1018.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:09:43,852 epoch 8 - iter 462/1546 - loss 0.00776588 - time (sec): 37.46 - samples/sec: 997.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:09:56,384 epoch 8 - iter 616/1546 - loss 0.00746374 - time (sec): 49.99 - samples/sec: 990.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:10:08,876 epoch 8 - iter 770/1546 - loss 0.00675501 - time (sec): 62.48 - samples/sec: 984.51 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:10:21,661 epoch 8 - iter 924/1546 - loss 0.00715765 - time (sec): 75.27 - samples/sec: 991.92 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:10:34,171 epoch 8 - iter 1078/1546 - loss 0.00694728 - time (sec): 87.78 - samples/sec: 998.22 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:10:46,699 epoch 8 - iter 1232/1546 - loss 0.00699978 - time (sec): 100.30 - samples/sec: 991.84 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:10:58,637 epoch 8 - iter 1386/1546 - loss 0.00720570 - time (sec): 112.24 - samples/sec: 988.67 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:11:10,512 epoch 8 - iter 1540/1546 - loss 0.00770745 - time (sec): 124.12 - samples/sec: 998.57 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:11:10,965 ----------------------------------------------------------------------------------------------------
2023-10-17 09:11:10,965 EPOCH 8 done: loss 0.0077 - lr: 0.000007
2023-10-17 09:11:13,892 DEV : loss 0.10704014450311661 - f1-score (micro avg)  0.8
2023-10-17 09:11:13,922 ----------------------------------------------------------------------------------------------------
2023-10-17 09:11:25,981 epoch 9 - iter 154/1546 - loss 0.00378941 - time (sec): 12.05 - samples/sec: 1045.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:11:37,802 epoch 9 - iter 308/1546 - loss 0.00283362 - time (sec): 23.88 - samples/sec: 1027.32 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:11:50,093 epoch 9 - iter 462/1546 - loss 0.00396993 - time (sec): 36.17 - samples/sec: 1034.15 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:12:03,236 epoch 9 - iter 616/1546 - loss 0.00400059 - time (sec): 49.31 - samples/sec: 996.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:12:15,466 epoch 9 - iter 770/1546 - loss 0.00385112 - time (sec): 61.54 - samples/sec: 1005.67 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:12:27,700 epoch 9 - iter 924/1546 - loss 0.00459493 - time (sec): 73.77 - samples/sec: 1003.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:12:39,832 epoch 9 - iter 1078/1546 - loss 0.00412718 - time (sec): 85.91 - samples/sec: 1011.44 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:12:51,879 epoch 9 - iter 1232/1546 - loss 0.00405342 - time (sec): 97.95 - samples/sec: 1010.84 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:13:03,846 epoch 9 - iter 1386/1546 - loss 0.00404944 - time (sec): 109.92 - samples/sec: 1021.84 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:13:15,705 epoch 9 - iter 1540/1546 - loss 0.00452713 - time (sec): 121.78 - samples/sec: 1016.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:13:16,160 ----------------------------------------------------------------------------------------------------
2023-10-17 09:13:16,160 EPOCH 9 done: loss 0.0045 - lr: 0.000003
2023-10-17 09:13:18,878 DEV : loss 0.12154770642518997 - f1-score (micro avg)  0.7983
2023-10-17 09:13:18,904 ----------------------------------------------------------------------------------------------------
2023-10-17 09:13:30,793 epoch 10 - iter 154/1546 - loss 0.00274083 - time (sec): 11.89 - samples/sec: 1053.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:13:42,788 epoch 10 - iter 308/1546 - loss 0.00390665 - time (sec): 23.88 - samples/sec: 1038.05 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:13:54,809 epoch 10 - iter 462/1546 - loss 0.00330672 - time (sec): 35.90 - samples/sec: 1054.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:14:07,461 epoch 10 - iter 616/1546 - loss 0.00342831 - time (sec): 48.56 - samples/sec: 1035.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:14:20,606 epoch 10 - iter 770/1546 - loss 0.00329853 - time (sec): 61.70 - samples/sec: 1013.00 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:14:33,362 epoch 10 - iter 924/1546 - loss 0.00299964 - time (sec): 74.46 - samples/sec: 996.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:14:46,754 epoch 10 - iter 1078/1546 - loss 0.00313900 - time (sec): 87.85 - samples/sec: 990.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:14:59,125 epoch 10 - iter 1232/1546 - loss 0.00324827 - time (sec): 100.22 - samples/sec: 986.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:15:12,274 epoch 10 - iter 1386/1546 - loss 0.00311186 - time (sec): 113.37 - samples/sec: 983.00 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:15:25,848 epoch 10 - iter 1540/1546 - loss 0.00341137 - time (sec): 126.94 - samples/sec: 975.57 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:15:26,351 ----------------------------------------------------------------------------------------------------
2023-10-17 09:15:26,351 EPOCH 10 done: loss 0.0034 - lr: 0.000000
2023-10-17 09:15:29,674 DEV : loss 0.11997128278017044 - f1-score (micro avg)  0.7886
2023-10-17 09:15:30,263 ----------------------------------------------------------------------------------------------------
2023-10-17 09:15:30,265 Loading model from best epoch ...
2023-10-17 09:15:32,732 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 09:15:40,874 
Results:
- F-score (micro) 0.8096
- F-score (macro) 0.7186
- Accuracy 0.6984

By class:
              precision    recall  f1-score   support

         LOC     0.8731    0.8362    0.8542       946
    BUILDING     0.6806    0.5297    0.5957       185
      STREET     0.6667    0.7500    0.7059        56

   micro avg     0.8365    0.7843    0.8096      1187
   macro avg     0.7401    0.7053    0.7186      1187
weighted avg     0.8333    0.7843    0.8069      1187

2023-10-17 09:15:40,875 ----------------------------------------------------------------------------------------------------
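As a sanity check, the micro-averaged row of the final report can be recomputed from the per-class precision, recall, and support values. A small sketch (numbers copied from the "By class" table above):

```python
# Recompute the micro average from the per-class rows of the final report.
# True positives are recovered as recall * support, and the predicted span
# count for each class as TP / precision.

classes = {
    # class: (precision, recall, support)
    "LOC":      (0.8731, 0.8362, 946),
    "BUILDING": (0.6806, 0.5297, 185),
    "STREET":   (0.6667, 0.7500, 56),
}

tp = pred = gold = 0
for precision, recall, support in classes.values():
    class_tp = round(recall * support)   # true positives for this class
    tp += class_tp
    pred += round(class_tp / precision)  # predicted spans for this class
    gold += support

micro_p = tp / pred
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
# → 0.8365 0.7843 0.8096, matching the "micro avg" row
```

The recomputed values agree with the logged micro avg (0.8365 / 0.7843 / 0.8096), confirming the report is internally consistent.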