2023-10-17 15:20:15,009 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,010 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:20:15,010 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,010 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 15:20:15,010 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,010 Train: 5777 sentences
2023-10-17 15:20:15,011 (train_with_dev=False, train_with_test=False)
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Training Params:
2023-10-17 15:20:15,011 - learning_rate: "3e-05"
2023-10-17 15:20:15,011 - mini_batch_size: "4"
2023-10-17 15:20:15,011 - max_epochs: "10"
2023-10-17 15:20:15,011 - shuffle: "True"
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Plugins:
2023-10-17 15:20:15,011 - TensorboardLogger
2023-10-17 15:20:15,011 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:20:15,011 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Computation:
2023-10-17 15:20:15,011 - compute on device: cuda:0
2023-10-17 15:20:15,011 - embedding storage: none
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:20:23,073 epoch 1 - iter 144/1445 - loss 2.33324949 - time (sec): 8.06 - samples/sec: 2302.06 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:30,031 epoch 1 - iter 288/1445 - loss 1.40669834 - time (sec): 15.02 - samples/sec: 2293.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:20:37,277 epoch 1 - iter 432/1445 - loss 0.99311609 - time (sec): 22.26 - samples/sec: 2340.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:20:44,228 epoch 1 - iter 576/1445 - loss 0.79532458 - time (sec): 29.22 - samples/sec: 2337.21 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:20:51,036 epoch 1 - iter 720/1445 - loss 0.66249281 - time (sec): 36.02 - samples/sec: 2401.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:20:58,193 epoch 1 - iter 864/1445 - loss 0.56791320 - time (sec): 43.18 - samples/sec: 2443.46 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:21:05,065 epoch 1 - iter 1008/1445 - loss 0.50430390 - time (sec): 50.05 - samples/sec: 2460.17 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:21:12,157 epoch 1 - iter 1152/1445 - loss 0.45583800 - time (sec): 57.14 - samples/sec: 2469.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:21:19,146 epoch 1 - iter 1296/1445 - loss 0.42025604 - time (sec): 64.13 - samples/sec: 2471.73 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:21:26,123 epoch 1 - iter 1440/1445 - loss 0.39100512 - time (sec): 71.11 - samples/sec: 2470.36 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:21:26,351 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:26,351 EPOCH 1 done: loss 0.3901 - lr: 0.000030
2023-10-17 15:21:29,006 DEV : loss 0.1134478896856308 - f1-score (micro avg) 0.6988
2023-10-17 15:21:29,021 saving best model
2023-10-17 15:21:29,419 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:36,099 epoch 2 - iter 144/1445 - loss 0.11423400 - time (sec): 6.68 - samples/sec: 2491.07 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:21:42,808 epoch 2 - iter 288/1445 - loss 0.11006024 - time (sec): 13.39 - samples/sec: 2534.73 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:21:49,527 epoch 2 - iter 432/1445 - loss 0.10104983 - time (sec): 20.11 - samples/sec: 2552.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:21:56,423 epoch 2 - iter 576/1445 - loss 0.09617376 - time (sec): 27.00 - samples/sec: 2535.33 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:22:03,695 epoch 2 - iter 720/1445 - loss 0.09369082 - time (sec): 34.27 - samples/sec: 2531.86 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:22:11,248 epoch 2 - iter 864/1445 - loss 0.09042169 - time (sec): 41.83 - samples/sec: 2531.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:22:18,355 epoch 2 - iter 1008/1445 - loss 0.09025166 - time (sec): 48.93 - samples/sec: 2507.99 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:22:25,543 epoch 2 - iter 1152/1445 - loss 0.09048999 - time (sec): 56.12 - samples/sec: 2502.98 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:22:32,470 epoch 2 - iter 1296/1445 - loss 0.09136875 - time (sec): 63.05 - samples/sec: 2497.21 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:22:39,747 epoch 2 - iter 1440/1445 - loss 0.09254084 - time (sec): 70.33 - samples/sec: 2499.02 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:22:39,976 ----------------------------------------------------------------------------------------------------
2023-10-17 15:22:39,977 EPOCH 2 done: loss 0.0928 - lr: 0.000027
2023-10-17 15:22:43,507 DEV : loss 0.09797008335590363 - f1-score (micro avg) 0.7636
2023-10-17 15:22:43,523 saving best model
2023-10-17 15:22:44,069 ----------------------------------------------------------------------------------------------------
2023-10-17 15:22:51,178 epoch 3 - iter 144/1445 - loss 0.07299256 - time (sec): 7.11 - samples/sec: 2444.22 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:22:57,977 epoch 3 - iter 288/1445 - loss 0.06634279 - time (sec): 13.91 - samples/sec: 2490.30 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:23:05,074 epoch 3 - iter 432/1445 - loss 0.06581683 - time (sec): 21.00 - samples/sec: 2551.84 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:23:11,936 epoch 3 - iter 576/1445 - loss 0.07201697 - time (sec): 27.87 - samples/sec: 2538.19 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:23:18,903 epoch 3 - iter 720/1445 - loss 0.07104181 - time (sec): 34.83 - samples/sec: 2511.65 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:23:25,927 epoch 3 - iter 864/1445 - loss 0.07023237 - time (sec): 41.86 - samples/sec: 2516.01 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:23:33,061 epoch 3 - iter 1008/1445 - loss 0.06913832 - time (sec): 48.99 - samples/sec: 2490.96 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:23:40,153 epoch 3 - iter 1152/1445 - loss 0.06852697 - time (sec): 56.08 - samples/sec: 2486.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:23:47,600 epoch 3 - iter 1296/1445 - loss 0.06920774 - time (sec): 63.53 - samples/sec: 2481.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:23:54,842 epoch 3 - iter 1440/1445 - loss 0.06776713 - time (sec): 70.77 - samples/sec: 2483.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:23:55,074 ----------------------------------------------------------------------------------------------------
2023-10-17 15:23:55,074 EPOCH 3 done: loss 0.0679 - lr: 0.000023
2023-10-17 15:23:58,305 DEV : loss 0.07238871604204178 - f1-score (micro avg) 0.8599
2023-10-17 15:23:58,320 saving best model
2023-10-17 15:23:58,869 ----------------------------------------------------------------------------------------------------
2023-10-17 15:24:05,938 epoch 4 - iter 144/1445 - loss 0.03825652 - time (sec): 7.07 - samples/sec: 2582.66 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:24:13,011 epoch 4 - iter 288/1445 - loss 0.05121297 - time (sec): 14.14 - samples/sec: 2515.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:24:20,248 epoch 4 - iter 432/1445 - loss 0.04707007 - time (sec): 21.38 - samples/sec: 2475.52 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:24:27,099 epoch 4 - iter 576/1445 - loss 0.04864143 - time (sec): 28.23 - samples/sec: 2490.67 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:24:34,071 epoch 4 - iter 720/1445 - loss 0.05041580 - time (sec): 35.20 - samples/sec: 2469.04 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:24:41,242 epoch 4 - iter 864/1445 - loss 0.05057361 - time (sec): 42.37 - samples/sec: 2466.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:24:48,213 epoch 4 - iter 1008/1445 - loss 0.04981836 - time (sec): 49.34 - samples/sec: 2478.95 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:24:54,867 epoch 4 - iter 1152/1445 - loss 0.05022101 - time (sec): 56.00 - samples/sec: 2498.44 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:25:01,564 epoch 4 - iter 1296/1445 - loss 0.05005487 - time (sec): 62.69 - samples/sec: 2514.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:25:08,655 epoch 4 - iter 1440/1445 - loss 0.05140703 - time (sec): 69.78 - samples/sec: 2519.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:25:08,886 ----------------------------------------------------------------------------------------------------
2023-10-17 15:25:08,886 EPOCH 4 done: loss 0.0513 - lr: 0.000020
2023-10-17 15:25:12,521 DEV : loss 0.08748035877943039 - f1-score (micro avg) 0.8513
2023-10-17 15:25:12,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:25:20,019 epoch 5 - iter 144/1445 - loss 0.02484292 - time (sec): 7.48 - samples/sec: 2365.34 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:25:27,347 epoch 5 - iter 288/1445 - loss 0.02639251 - time (sec): 14.81 - samples/sec: 2426.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:25:34,576 epoch 5 - iter 432/1445 - loss 0.03062150 - time (sec): 22.04 - samples/sec: 2438.65 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:25:41,500 epoch 5 - iter 576/1445 - loss 0.03478141 - time (sec): 28.96 - samples/sec: 2433.14 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:25:48,647 epoch 5 - iter 720/1445 - loss 0.03435012 - time (sec): 36.11 - samples/sec: 2431.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:25:55,712 epoch 5 - iter 864/1445 - loss 0.03915073 - time (sec): 43.17 - samples/sec: 2429.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:26:02,874 epoch 5 - iter 1008/1445 - loss 0.04019423 - time (sec): 50.34 - samples/sec: 2424.89 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:26:09,941 epoch 5 - iter 1152/1445 - loss 0.04137874 - time (sec): 57.40 - samples/sec: 2449.49 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:26:16,823 epoch 5 - iter 1296/1445 - loss 0.04231658 - time (sec): 64.29 - samples/sec: 2455.18 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:26:24,165 epoch 5 - iter 1440/1445 - loss 0.04229848 - time (sec): 71.63 - samples/sec: 2452.34 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:26:24,411 ----------------------------------------------------------------------------------------------------
2023-10-17 15:26:24,411 EPOCH 5 done: loss 0.0423 - lr: 0.000017
2023-10-17 15:26:27,792 DEV : loss 0.11643949151039124 - f1-score (micro avg) 0.7813
2023-10-17 15:26:27,810 ----------------------------------------------------------------------------------------------------
2023-10-17 15:26:35,160 epoch 6 - iter 144/1445 - loss 0.04407312 - time (sec): 7.35 - samples/sec: 2381.66 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:26:42,213 epoch 6 - iter 288/1445 - loss 0.07041699 - time (sec): 14.40 - samples/sec: 2363.33 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:26:49,199 epoch 6 - iter 432/1445 - loss 0.06543099 - time (sec): 21.39 - samples/sec: 2412.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:26:56,329 epoch 6 - iter 576/1445 - loss 0.05494049 - time (sec): 28.52 - samples/sec: 2447.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:27:03,461 epoch 6 - iter 720/1445 - loss 0.05313692 - time (sec): 35.65 - samples/sec: 2468.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:27:10,213 epoch 6 - iter 864/1445 - loss 0.04946118 - time (sec): 42.40 - samples/sec: 2462.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:27:16,992 epoch 6 - iter 1008/1445 - loss 0.04715937 - time (sec): 49.18 - samples/sec: 2492.01 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:27:23,900 epoch 6 - iter 1152/1445 - loss 0.04549664 - time (sec): 56.09 - samples/sec: 2479.47 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:27:30,845 epoch 6 - iter 1296/1445 - loss 0.04407099 - time (sec): 63.03 - samples/sec: 2486.69 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:27:37,799 epoch 6 - iter 1440/1445 - loss 0.04231390 - time (sec): 69.99 - samples/sec: 2507.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:27:38,026 ----------------------------------------------------------------------------------------------------
2023-10-17 15:27:38,026 EPOCH 6 done: loss 0.0422 - lr: 0.000013
2023-10-17 15:27:41,262 DEV : loss 0.12753015756607056 - f1-score (micro avg) 0.8179
2023-10-17 15:27:41,277 ----------------------------------------------------------------------------------------------------
2023-10-17 15:27:48,074 epoch 7 - iter 144/1445 - loss 0.02864804 - time (sec): 6.80 - samples/sec: 2541.40 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:27:54,732 epoch 7 - iter 288/1445 - loss 0.02850202 - time (sec): 13.45 - samples/sec: 2529.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:28:01,745 epoch 7 - iter 432/1445 - loss 0.02700154 - time (sec): 20.47 - samples/sec: 2538.36 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:28:09,285 epoch 7 - iter 576/1445 - loss 0.02685090 - time (sec): 28.01 - samples/sec: 2500.37 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:28:16,228 epoch 7 - iter 720/1445 - loss 0.02631576 - time (sec): 34.95 - samples/sec: 2498.86 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:28:23,229 epoch 7 - iter 864/1445 - loss 0.02503229 - time (sec): 41.95 - samples/sec: 2527.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:28:30,267 epoch 7 - iter 1008/1445 - loss 0.02322476 - time (sec): 48.99 - samples/sec: 2513.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:28:37,208 epoch 7 - iter 1152/1445 - loss 0.02262618 - time (sec): 55.93 - samples/sec: 2507.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:28:44,333 epoch 7 - iter 1296/1445 - loss 0.02359839 - time (sec): 63.05 - samples/sec: 2507.22 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:28:51,234 epoch 7 - iter 1440/1445 - loss 0.02410498 - time (sec): 69.96 - samples/sec: 2512.80 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:28:51,455 ----------------------------------------------------------------------------------------------------
2023-10-17 15:28:51,456 EPOCH 7 done: loss 0.0241 - lr: 0.000010
2023-10-17 15:28:54,762 DEV : loss 0.13132773339748383 - f1-score (micro avg) 0.8242
2023-10-17 15:28:54,780 ----------------------------------------------------------------------------------------------------
2023-10-17 15:29:01,739 epoch 8 - iter 144/1445 - loss 0.02599645 - time (sec): 6.96 - samples/sec: 2320.98 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:29:09,413 epoch 8 - iter 288/1445 - loss 0.03908539 - time (sec): 14.63 - samples/sec: 2361.44 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:29:16,244 epoch 8 - iter 432/1445 - loss 0.04312177 - time (sec): 21.46 - samples/sec: 2436.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:29:23,284 epoch 8 - iter 576/1445 - loss 0.04654217 - time (sec): 28.50 - samples/sec: 2422.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:29:30,118 epoch 8 - iter 720/1445 - loss 0.04836767 - time (sec): 35.34 - samples/sec: 2436.55 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:29:37,165 epoch 8 - iter 864/1445 - loss 0.04435024 - time (sec): 42.38 - samples/sec: 2467.42 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:29:44,054 epoch 8 - iter 1008/1445 - loss 0.04368948 - time (sec): 49.27 - samples/sec: 2483.93 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:29:51,080 epoch 8 - iter 1152/1445 - loss 0.04126688 - time (sec): 56.30 - samples/sec: 2473.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:29:58,174 epoch 8 - iter 1296/1445 - loss 0.03803387 - time (sec): 63.39 - samples/sec: 2489.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:30:05,225 epoch 8 - iter 1440/1445 - loss 0.03611431 - time (sec): 70.44 - samples/sec: 2491.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:30:05,479 ----------------------------------------------------------------------------------------------------
2023-10-17 15:30:05,479 EPOCH 8 done: loss 0.0360 - lr: 0.000007
2023-10-17 15:30:08,822 DEV : loss 0.14063192903995514 - f1-score (micro avg) 0.8385
2023-10-17 15:30:08,840 ----------------------------------------------------------------------------------------------------
2023-10-17 15:30:16,101 epoch 9 - iter 144/1445 - loss 0.00928856 - time (sec): 7.26 - samples/sec: 2645.70 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:30:22,830 epoch 9 - iter 288/1445 - loss 0.01137305 - time (sec): 13.99 - samples/sec: 2509.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:30:30,015 epoch 9 - iter 432/1445 - loss 0.01191848 - time (sec): 21.17 - samples/sec: 2557.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:30:37,152 epoch 9 - iter 576/1445 - loss 0.01374329 - time (sec): 28.31 - samples/sec: 2558.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:30:44,261 epoch 9 - iter 720/1445 - loss 0.01533254 - time (sec): 35.42 - samples/sec: 2528.74 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:30:51,281 epoch 9 - iter 864/1445 - loss 0.01618231 - time (sec): 42.44 - samples/sec: 2487.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:30:58,224 epoch 9 - iter 1008/1445 - loss 0.01852939 - time (sec): 49.38 - samples/sec: 2494.62 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:31:05,713 epoch 9 - iter 1152/1445 - loss 0.02014171 - time (sec): 56.87 - samples/sec: 2484.98 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:31:12,765 epoch 9 - iter 1296/1445 - loss 0.02108402 - time (sec): 63.92 - samples/sec: 2486.65 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:31:19,478 epoch 9 - iter 1440/1445 - loss 0.02161319 - time (sec): 70.64 - samples/sec: 2483.77 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:31:19,741 ----------------------------------------------------------------------------------------------------
2023-10-17 15:31:19,741 EPOCH 9 done: loss 0.0215 - lr: 0.000003
2023-10-17 15:31:22,962 DEV : loss 0.15872889757156372 - f1-score (micro avg) 0.7959
2023-10-17 15:31:22,980 ----------------------------------------------------------------------------------------------------
2023-10-17 15:31:29,837 epoch 10 - iter 144/1445 - loss 0.02704362 - time (sec): 6.86 - samples/sec: 2545.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:31:36,571 epoch 10 - iter 288/1445 - loss 0.03061519 - time (sec): 13.59 - samples/sec: 2579.66 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:31:43,536 epoch 10 - iter 432/1445 - loss 0.02831624 - time (sec): 20.55 - samples/sec: 2514.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:31:50,270 epoch 10 - iter 576/1445 - loss 0.02567299 - time (sec): 27.29 - samples/sec: 2489.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:31:57,397 epoch 10 - iter 720/1445 - loss 0.02269889 - time (sec): 34.42 - samples/sec: 2518.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:32:04,472 epoch 10 - iter 864/1445 - loss 0.02217955 - time (sec): 41.49 - samples/sec: 2540.09 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:32:11,337 epoch 10 - iter 1008/1445 - loss 0.02100455 - time (sec): 48.36 - samples/sec: 2523.83 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:32:18,547 epoch 10 - iter 1152/1445 - loss 0.02030918 - time (sec): 55.57 - samples/sec: 2521.36 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:32:25,459 epoch 10 - iter 1296/1445 - loss 0.01984945 - time (sec): 62.48 - samples/sec: 2530.72 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:32:32,454 epoch 10 - iter 1440/1445 - loss 0.01908284 - time (sec): 69.47 - samples/sec: 2529.70 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:32:32,675 ----------------------------------------------------------------------------------------------------
2023-10-17 15:32:32,675 EPOCH 10 done: loss 0.0191 - lr: 0.000000
2023-10-17 15:32:35,914 DEV : loss 0.14746923744678497 - f1-score (micro avg) 0.8243
2023-10-17 15:32:36,350 ----------------------------------------------------------------------------------------------------
2023-10-17 15:32:36,352 Loading model from best epoch ...
2023-10-17 15:32:38,127 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 15:32:40,946
Results:
- F-score (micro) 0.8538
- F-score (macro) 0.7349
- Accuracy 0.7522
By class:
              precision    recall  f1-score   support

         PER     0.8758    0.8340    0.8544       482
         LOC     0.9385    0.8996    0.9186       458
         ORG     0.4286    0.4348    0.4317        69

   micro avg     0.8719    0.8365    0.8538      1009
   macro avg     0.7476    0.7228    0.7349      1009
weighted avg     0.8737    0.8365    0.8546      1009
2023-10-17 15:32:40,946 ----------------------------------------------------------------------------------------------------