2023-10-17 16:51:50,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Train: 5777 sentences
2023-10-17 16:51:50,596 (train_with_dev=False, train_with_test=False)
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Training Params:
2023-10-17 16:51:50,596  - learning_rate: "3e-05"
2023-10-17 16:51:50,596  - mini_batch_size: "4"
2023-10-17 16:51:50,596  - max_epochs: "10"
2023-10-17 16:51:50,596  - shuffle: "True"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Plugins:
2023-10-17 16:51:50,596  - TensorboardLogger
2023-10-17 16:51:50,596  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:51:50,596  - metric: "('micro avg', 'f1-score')"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Computation:
2023-10-17 16:51:50,596  - compute on device: cuda:0
2023-10-17 16:51:50,596  - embedding storage: none
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,597 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,597 Logging anything other than scalars to TensorBoard is currently not supported.
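[Editor's note] For context, a minimal Flair sketch of how a training setup like the one logged above could be reproduced. The hyperparameters are taken from the Training Params block; the checkpoint name is inferred from the base path, and the dataset/argument names are assumptions that should be checked against the installed Flair version. The Linear(768, 13) head in the model dump corresponds to the 12 BIOES entity tags plus O listed near the end of this log.

from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus as listed above: 5777 train / 722 dev / 723 test Dutch sentences.
corpus = NER_ICDAR_EUROPEANA(language="nl")  # argument name assumed
label_dict = corpus.make_label_dictionary(label_type="ner")

# ELECTRA discriminator embeddings, last layer only, first-subtoken pooling
# (matching "poolingfirst-layers-1" in the base path).
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed checkpoint id
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear head without CRF or RNN, as in the printed SequenceTagger.
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)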
"hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3" 2023-10-17 16:51:50,596 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:51:50,597 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:51:50,597 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 16:51:57,734 epoch 1 - iter 144/1445 - loss 2.51964113 - time (sec): 7.14 - samples/sec: 2407.06 - lr: 0.000003 - momentum: 0.000000 2023-10-17 16:52:04,801 epoch 1 - iter 288/1445 - loss 1.46886525 - time (sec): 14.20 - samples/sec: 2396.44 - lr: 0.000006 - momentum: 0.000000 2023-10-17 16:52:11,824 epoch 1 - iter 432/1445 - loss 1.03580590 - time (sec): 21.23 - samples/sec: 2436.57 - lr: 0.000009 - momentum: 0.000000 2023-10-17 16:52:18,859 epoch 1 - iter 576/1445 - loss 0.82263272 - time (sec): 28.26 - samples/sec: 2459.21 - lr: 0.000012 - momentum: 0.000000 2023-10-17 16:52:26,087 epoch 1 - iter 720/1445 - loss 0.67840038 - time (sec): 35.49 - samples/sec: 2485.64 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:52:33,197 epoch 1 - iter 864/1445 - loss 0.58315275 - time (sec): 42.60 - samples/sec: 2500.11 - lr: 0.000018 - momentum: 0.000000 2023-10-17 16:52:40,137 epoch 1 - iter 1008/1445 - loss 0.51794596 - time (sec): 49.54 - samples/sec: 2498.80 - lr: 0.000021 - momentum: 0.000000 2023-10-17 16:52:47,253 epoch 1 - iter 1152/1445 - loss 0.46799546 - time (sec): 56.66 - samples/sec: 2495.94 - lr: 0.000024 - momentum: 0.000000 2023-10-17 16:52:54,086 epoch 1 - iter 1296/1445 - loss 0.43471267 - time (sec): 63.49 - samples/sec: 2473.24 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:53:01,099 epoch 1 - iter 1440/1445 - loss 0.40016424 - time (sec): 70.50 - samples/sec: 2488.40 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:53:01,370 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:53:01,371 EPOCH 1 done: loss 0.3987 - lr: 0.000030 2023-10-17 16:53:03,993 DEV : loss 0.12171746790409088 - f1-score (micro avg) 0.7585 2023-10-17 16:53:04,008 saving best model 2023-10-17 16:53:04,339 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:53:11,206 epoch 2 - iter 144/1445 - loss 0.09547817 - time (sec): 6.87 - samples/sec: 2527.57 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:53:18,194 epoch 2 - iter 288/1445 - loss 0.10207718 - time (sec): 13.85 - samples/sec: 2511.05 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:53:25,162 epoch 2 - iter 432/1445 - loss 0.10574612 - time (sec): 20.82 - samples/sec: 2486.76 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:53:32,148 epoch 2 - iter 576/1445 - loss 0.10231456 - time (sec): 27.81 - samples/sec: 2484.41 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:53:39,311 epoch 2 - iter 720/1445 - loss 0.09986497 - time (sec): 34.97 - samples/sec: 2513.27 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:53:46,749 epoch 2 - iter 864/1445 - loss 0.09794746 - time (sec): 42.41 - samples/sec: 2537.18 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:53:53,859 epoch 2 - iter 1008/1445 - loss 0.09611171 - time (sec): 49.52 - samples/sec: 2529.21 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:54:01,602 epoch 2 - iter 1152/1445 - loss 0.09333721 - time (sec): 57.26 - samples/sec: 2481.95 - lr: 0.000027 - momentum: 0.000000 2023-10-17 
2023-10-17 16:53:11,206 epoch 2 - iter 144/1445 - loss 0.09547817 - time (sec): 6.87 - samples/sec: 2527.57 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:53:18,194 epoch 2 - iter 288/1445 - loss 0.10207718 - time (sec): 13.85 - samples/sec: 2511.05 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:53:25,162 epoch 2 - iter 432/1445 - loss 0.10574612 - time (sec): 20.82 - samples/sec: 2486.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:53:32,148 epoch 2 - iter 576/1445 - loss 0.10231456 - time (sec): 27.81 - samples/sec: 2484.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:53:39,311 epoch 2 - iter 720/1445 - loss 0.09986497 - time (sec): 34.97 - samples/sec: 2513.27 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:53:46,749 epoch 2 - iter 864/1445 - loss 0.09794746 - time (sec): 42.41 - samples/sec: 2537.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:53:53,859 epoch 2 - iter 1008/1445 - loss 0.09611171 - time (sec): 49.52 - samples/sec: 2529.21 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:54:01,602 epoch 2 - iter 1152/1445 - loss 0.09333721 - time (sec): 57.26 - samples/sec: 2481.95 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:54:09,733 epoch 2 - iter 1296/1445 - loss 0.09231752 - time (sec): 65.39 - samples/sec: 2426.32 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:54:17,227 epoch 2 - iter 1440/1445 - loss 0.09139074 - time (sec): 72.89 - samples/sec: 2408.27 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:54:17,477 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:17,478 EPOCH 2 done: loss 0.0913 - lr: 0.000027
2023-10-17 16:54:21,098 DEV : loss 0.08095023036003113 - f1-score (micro avg) 0.8018
2023-10-17 16:54:21,115 saving best model
2023-10-17 16:54:21,557 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:28,597 epoch 3 - iter 144/1445 - loss 0.07907593 - time (sec): 7.03 - samples/sec: 2471.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:54:35,963 epoch 3 - iter 288/1445 - loss 0.06930842 - time (sec): 14.40 - samples/sec: 2491.47 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:54:42,970 epoch 3 - iter 432/1445 - loss 0.06547744 - time (sec): 21.41 - samples/sec: 2533.28 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:54:49,929 epoch 3 - iter 576/1445 - loss 0.06189547 - time (sec): 28.37 - samples/sec: 2537.73 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:54:56,792 epoch 3 - iter 720/1445 - loss 0.06117221 - time (sec): 35.23 - samples/sec: 2510.04 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:55:04,122 epoch 3 - iter 864/1445 - loss 0.06199025 - time (sec): 42.56 - samples/sec: 2503.89 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:55:11,617 epoch 3 - iter 1008/1445 - loss 0.06500361 - time (sec): 50.05 - samples/sec: 2487.30 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:55:18,545 epoch 3 - iter 1152/1445 - loss 0.06477948 - time (sec): 56.98 - samples/sec: 2477.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:55:25,742 epoch 3 - iter 1296/1445 - loss 0.06476144 - time (sec): 64.18 - samples/sec: 2466.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:55:32,997 epoch 3 - iter 1440/1445 - loss 0.06672483 - time (sec): 71.43 - samples/sec: 2462.82 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:55:33,220 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:33,221 EPOCH 3 done: loss 0.0667 - lr: 0.000023
2023-10-17 16:55:36,446 DEV : loss 0.07555373758077621 - f1-score (micro avg) 0.8639
2023-10-17 16:55:36,464 saving best model
2023-10-17 16:55:36,902 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:43,816 epoch 4 - iter 144/1445 - loss 0.04753253 - time (sec): 6.91 - samples/sec: 2416.67 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:55:51,351 epoch 4 - iter 288/1445 - loss 0.04514475 - time (sec): 14.44 - samples/sec: 2398.27 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:55:58,535 epoch 4 - iter 432/1445 - loss 0.04852100 - time (sec): 21.63 - samples/sec: 2392.92 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:56:05,540 epoch 4 - iter 576/1445 - loss 0.05156722 - time (sec): 28.63 - samples/sec: 2418.03 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:56:12,372 epoch 4 - iter 720/1445 - loss 0.05036006 - time (sec): 35.47 - samples/sec: 2433.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:56:19,508 epoch 4 - iter 864/1445 - loss 0.05094762 - time (sec): 42.60 - samples/sec: 2438.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:26,524 epoch 4 - iter 1008/1445 - loss 0.05000499 - time (sec): 49.62 - samples/sec: 2453.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:33,720 epoch 4 - iter 1152/1445 - loss 0.05036806 - time (sec): 56.81 - samples/sec: 2481.30 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:40,708 epoch 4 - iter 1296/1445 - loss 0.04919696 - time (sec): 63.80 - samples/sec: 2475.63 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:47,668 epoch 4 - iter 1440/1445 - loss 0.04959858 - time (sec): 70.76 - samples/sec: 2483.70 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:47,903 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:47,903 EPOCH 4 done: loss 0.0496 - lr: 0.000020
2023-10-17 16:56:51,737 DEV : loss 0.08526481688022614 - f1-score (micro avg) 0.8661
2023-10-17 16:56:51,757 saving best model
2023-10-17 16:56:52,198 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:59,878 epoch 5 - iter 144/1445 - loss 0.02983267 - time (sec): 7.68 - samples/sec: 2338.64 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:57:06,651 epoch 5 - iter 288/1445 - loss 0.02587426 - time (sec): 14.45 - samples/sec: 2423.26 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:13,811 epoch 5 - iter 432/1445 - loss 0.03178345 - time (sec): 21.61 - samples/sec: 2453.21 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:20,904 epoch 5 - iter 576/1445 - loss 0.03049647 - time (sec): 28.70 - samples/sec: 2478.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:27,751 epoch 5 - iter 720/1445 - loss 0.02973021 - time (sec): 35.55 - samples/sec: 2474.24 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:34,537 epoch 5 - iter 864/1445 - loss 0.03342990 - time (sec): 42.34 - samples/sec: 2504.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:41,369 epoch 5 - iter 1008/1445 - loss 0.03551324 - time (sec): 49.17 - samples/sec: 2522.36 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:48,235 epoch 5 - iter 1152/1445 - loss 0.03569568 - time (sec): 56.03 - samples/sec: 2512.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:57:55,333 epoch 5 - iter 1296/1445 - loss 0.03592580 - time (sec): 63.13 - samples/sec: 2500.90 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:03,284 epoch 5 - iter 1440/1445 - loss 0.03540825 - time (sec): 71.08 - samples/sec: 2471.52 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:03,556 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:03,556 EPOCH 5 done: loss 0.0353 - lr: 0.000017
2023-10-17 16:58:06,848 DEV : loss 0.12203659117221832 - f1-score (micro avg) 0.831
2023-10-17 16:58:06,868 ----------------------------------------------------------------------------------------------------
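[Editor's note] Epoch 5 prints no "saving best model" line: its dev micro-F1 (0.831) is below the best score so far (0.8661 from epoch 4), and best-model.pt is only overwritten when the selection metric ('micro avg', 'f1-score') improves. A hedged sketch of that checkpointing rule (illustrative only, not Flair's source):

class BestModelKeeper:
    """Keep best-model.pt only for epochs that improve dev micro-F1 (illustrative)."""
    def __init__(self):
        self.best_dev_f1 = float("-inf")

    def maybe_save(self, tagger, dev_f1, path="best-model.pt"):
        if dev_f1 > self.best_dev_f1:
            self.best_dev_f1 = dev_f1
            if tagger is not None:
                tagger.save(path)  # Flair models provide .save()
            print("saving best model")

# Replaying the dev scores of epochs 1-5 from this log: epochs 1-4 save, epoch 5 does not.
keeper = BestModelKeeper()
for dev_f1 in (0.7585, 0.8018, 0.8639, 0.8661, 0.8310):
    keeper.maybe_save(tagger=None, dev_f1=dev_f1)  # tagger=None: placeholder in this sketch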
2023-10-17 16:58:14,955 epoch 6 - iter 144/1445 - loss 0.02842518 - time (sec): 8.09 - samples/sec: 2318.05 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:58:22,926 epoch 6 - iter 288/1445 - loss 0.02055464 - time (sec): 16.06 - samples/sec: 2226.37 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:58:31,227 epoch 6 - iter 432/1445 - loss 0.02489207 - time (sec): 24.36 - samples/sec: 2229.50 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:58:38,990 epoch 6 - iter 576/1445 - loss 0.02645244 - time (sec): 32.12 - samples/sec: 2259.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:58:45,907 epoch 6 - iter 720/1445 - loss 0.02618404 - time (sec): 39.04 - samples/sec: 2296.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:58:52,910 epoch 6 - iter 864/1445 - loss 0.02602900 - time (sec): 46.04 - samples/sec: 2339.43 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:58:59,842 epoch 6 - iter 1008/1445 - loss 0.02750063 - time (sec): 52.97 - samples/sec: 2349.00 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:59:06,963 epoch 6 - iter 1152/1445 - loss 0.02837136 - time (sec): 60.09 - samples/sec: 2363.19 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:59:13,986 epoch 6 - iter 1296/1445 - loss 0.02684921 - time (sec): 67.12 - samples/sec: 2363.48 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:59:20,726 epoch 6 - iter 1440/1445 - loss 0.02655197 - time (sec): 73.86 - samples/sec: 2379.68 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:59:20,957 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:20,957 EPOCH 6 done: loss 0.0265 - lr: 0.000013
2023-10-17 16:59:24,232 DEV : loss 0.1185513511300087 - f1-score (micro avg) 0.8581
2023-10-17 16:59:24,251 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:31,348 epoch 7 - iter 144/1445 - loss 0.01592139 - time (sec): 7.10 - samples/sec: 2420.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:59:38,719 epoch 7 - iter 288/1445 - loss 0.01953479 - time (sec): 14.47 - samples/sec: 2385.00 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:59:45,900 epoch 7 - iter 432/1445 - loss 0.01960054 - time (sec): 21.65 - samples/sec: 2414.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:59:53,535 epoch 7 - iter 576/1445 - loss 0.02108098 - time (sec): 29.28 - samples/sec: 2404.40 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:00:00,596 epoch 7 - iter 720/1445 - loss 0.01990454 - time (sec): 36.34 - samples/sec: 2412.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:00:07,764 epoch 7 - iter 864/1445 - loss 0.02048029 - time (sec): 43.51 - samples/sec: 2438.92 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:00:14,808 epoch 7 - iter 1008/1445 - loss 0.01953974 - time (sec): 50.56 - samples/sec: 2460.86 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:00:21,791 epoch 7 - iter 1152/1445 - loss 0.01979941 - time (sec): 57.54 - samples/sec: 2460.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:00:28,760 epoch 7 - iter 1296/1445 - loss 0.01839308 - time (sec): 64.51 - samples/sec: 2453.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:00:35,846 epoch 7 - iter 1440/1445 - loss 0.01795360 - time (sec): 71.59 - samples/sec: 2449.53 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:00:36,194 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:36,194 EPOCH 7 done: loss 0.0180 - lr: 0.000010
2023-10-17 17:00:39,469 DEV : loss 0.12616300582885742 - f1-score (micro avg) 0.8691
2023-10-17 17:00:39,485 saving best model
2023-10-17 17:00:39,951 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:46,624 epoch 8 - iter 144/1445 - loss 0.01689844 - time (sec): 6.67 - samples/sec: 2504.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:00:53,857 epoch 8 - iter 288/1445 - loss 0.01100211 - time (sec): 13.90 - samples/sec: 2442.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:01:00,829 epoch 8 - iter 432/1445 - loss 0.01437202 - time (sec): 20.88 - samples/sec: 2452.85 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:01:07,822 epoch 8 - iter 576/1445 - loss 0.01241985 - time (sec): 27.87 - samples/sec: 2473.87 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:01:15,067 epoch 8 - iter 720/1445 - loss 0.01363731 - time (sec): 35.11 - samples/sec: 2465.52 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:01:22,117 epoch 8 - iter 864/1445 - loss 0.01368516 - time (sec): 42.16 - samples/sec: 2464.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:01:29,232 epoch 8 - iter 1008/1445 - loss 0.01357477 - time (sec): 49.28 - samples/sec: 2472.94 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:01:36,222 epoch 8 - iter 1152/1445 - loss 0.01410465 - time (sec): 56.27 - samples/sec: 2479.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:01:43,482 epoch 8 - iter 1296/1445 - loss 0.01364676 - time (sec): 63.53 - samples/sec: 2503.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:01:50,427 epoch 8 - iter 1440/1445 - loss 0.01339047 - time (sec): 70.47 - samples/sec: 2491.94 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:01:50,657 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:50,657 EPOCH 8 done: loss 0.0134 - lr: 0.000007
2023-10-17 17:01:53,972 DEV : loss 0.12705564498901367 - f1-score (micro avg) 0.872
2023-10-17 17:01:53,990 saving best model
2023-10-17 17:01:54,439 ----------------------------------------------------------------------------------------------------
2023-10-17 17:02:01,517 epoch 9 - iter 144/1445 - loss 0.00381457 - time (sec): 7.08 - samples/sec: 2421.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:02:08,664 epoch 9 - iter 288/1445 - loss 0.00734927 - time (sec): 14.22 - samples/sec: 2462.26 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:02:15,773 epoch 9 - iter 432/1445 - loss 0.00735271 - time (sec): 21.33 - samples/sec: 2461.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:02:23,175 epoch 9 - iter 576/1445 - loss 0.00769310 - time (sec): 28.73 - samples/sec: 2467.86 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:02:30,572 epoch 9 - iter 720/1445 - loss 0.00864776 - time (sec): 36.13 - samples/sec: 2470.49 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:02:38,094 epoch 9 - iter 864/1445 - loss 0.00916627 - time (sec): 43.65 - samples/sec: 2447.52 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:02:45,162 epoch 9 - iter 1008/1445 - loss 0.00973331 - time (sec): 50.72 - samples/sec: 2433.57 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:02:51,976 epoch 9 - iter 1152/1445 - loss 0.00925598 - time (sec): 57.53 - samples/sec: 2426.27 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:02:59,048 epoch 9 - iter 1296/1445 - loss 0.00914846 - time (sec): 64.61 - samples/sec: 2445.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:03:06,122 epoch 9 - iter 1440/1445 - loss 0.00948694 - time (sec): 71.68 - samples/sec: 2450.76 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:03:06,346 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:06,347 EPOCH 9 done: loss 0.0095 - lr: 0.000003
2023-10-17 17:03:09,701 DEV : loss 0.14085045456886292 - f1-score (micro avg) 0.8698
2023-10-17 17:03:09,724 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:16,841 epoch 10 - iter 144/1445 - loss 0.00996398 - time (sec): 7.12 - samples/sec: 2532.12 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:03:23,906 epoch 10 - iter 288/1445 - loss 0.00789760 - time (sec): 14.18 - samples/sec: 2449.41 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:03:31,192 epoch 10 - iter 432/1445 - loss 0.00862420 - time (sec): 21.47 - samples/sec: 2474.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:03:38,308 epoch 10 - iter 576/1445 - loss 0.00751122 - time (sec): 28.58 - samples/sec: 2490.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:03:45,506 epoch 10 - iter 720/1445 - loss 0.00674528 - time (sec): 35.78 - samples/sec: 2480.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:03:52,464 epoch 10 - iter 864/1445 - loss 0.00666326 - time (sec): 42.74 - samples/sec: 2488.41 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:03:59,509 epoch 10 - iter 1008/1445 - loss 0.00638795 - time (sec): 49.78 - samples/sec: 2479.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:04:06,580 epoch 10 - iter 1152/1445 - loss 0.00631451 - time (sec): 56.85 - samples/sec: 2480.07 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:04:13,723 epoch 10 - iter 1296/1445 - loss 0.00652823 - time (sec): 64.00 - samples/sec: 2482.67 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:04:20,891 epoch 10 - iter 1440/1445 - loss 0.00649172 - time (sec): 71.17 - samples/sec: 2470.94 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:04:21,133 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:21,134 EPOCH 10 done: loss 0.0065 - lr: 0.000000
2023-10-17 17:04:24,372 DEV : loss 0.1461455523967743 - f1-score (micro avg) 0.8679
2023-10-17 17:04:24,745 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:24,746 Loading model from best epoch ...
2023-10-17 17:04:26,084 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 17:04:28,840 Results:
- F-score (micro) 0.8773
- F-score (macro) 0.7854
- Accuracy 0.7879

By class:
              precision    recall  f1-score   support

         PER     0.8737    0.8755    0.8746       482
         LOC     0.9603    0.8974    0.9278       458
         ORG     0.5902    0.5217    0.5538        69

   micro avg     0.8940    0.8612    0.8773      1009
   macro avg     0.8081    0.7649    0.7854      1009
weighted avg     0.8936    0.8612    0.8768      1009

2023-10-17 17:04:28,841 ----------------------------------------------------------------------------------------------------
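[Editor's note] As a sanity check on the final table, the aggregate rows can be re-derived from the per-class rows: micro F1 is the harmonic mean of the micro precision and recall, macro F1 is the unweighted mean of the class F1 scores, and the weighted average uses the support column. A small verification sketch:

# Re-deriving the aggregate rows of the final table from the per-class rows above.
per_class = {  # precision, recall, f1, support
    "PER": (0.8737, 0.8755, 0.8746, 482),
    "LOC": (0.9603, 0.8974, 0.9278, 458),
    "ORG": (0.5902, 0.5217, 0.5538, 69),
}

micro_p, micro_r = 0.8940, 0.8612
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)                    # ~0.8773

macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)  # ~0.7854

total_support = sum(s for *_, s in per_class.values())                     # 1009
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support  # ~0.8768

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))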