2023-10-17 20:57:44,502 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 Train: 5901 sentences
2023-10-17 20:57:44,503 (train_with_dev=False, train_with_test=False)
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 Training Params:
2023-10-17 20:57:44,503 - learning_rate: "3e-05"
2023-10-17 20:57:44,503 - mini_batch_size: "4"
2023-10-17 20:57:44,503 - max_epochs: "10"
2023-10-17 20:57:44,503 - shuffle: "True"
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 Plugins:
2023-10-17 20:57:44,503 - TensorboardLogger
2023-10-17 20:57:44,504 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,504 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:57:44,504 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,504 Computation:
2023-10-17 20:57:44,504 - compute on device: cuda:0
2023-10-17 20:57:44,504 - embedding storage: none
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,504 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
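
Note: the header above fully specifies the fine-tuning setup: a hmteams historic multilingual ELECTRA encoder over the HIPE-2020 French corpus, batch size 4, 10 epochs, peak learning rate 3e-05, a linear schedule with 10% warmup, first-subtoken pooling of the last layer, and no CRF. A rough sketch of how such a run can be launched with Flair follows; the actual training script is not part of this log, and the corpus loader arguments as well as the encoder name (inferred from the base path) are assumptions.

    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # HIPE-2022 loader for the hipe2020/fr split logged above (assumed arguments).
    corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # Transformer encoder matching the logged ElectraModel: last layer only,
    # first-subtoken pooling, fine-tuned end to end.
    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # Linear projection head without CRF or RNN, as in the logged SequenceTagger.
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    # fine_tune() applies a linear learning-rate schedule with warmup,
    # which is what the LinearScheduler plugin above records.
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
        learning_rate=3e-05,
        mini_batch_size=4,
        max_epochs=10,
    )

Because use_crf and use_rnn are disabled, the tagger is exactly the logged architecture: transformer embeddings, locked dropout, and a single linear layer over the 21 BIOES tags.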
"hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" 2023-10-17 20:57:44,504 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:57:44,504 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:57:44,504 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 20:57:51,411 epoch 1 - iter 147/1476 - loss 3.24267368 - time (sec): 6.91 - samples/sec: 2332.85 - lr: 0.000003 - momentum: 0.000000 2023-10-17 20:57:58,675 epoch 1 - iter 294/1476 - loss 1.82965327 - time (sec): 14.17 - samples/sec: 2490.20 - lr: 0.000006 - momentum: 0.000000 2023-10-17 20:58:05,460 epoch 1 - iter 441/1476 - loss 1.42129099 - time (sec): 20.96 - samples/sec: 2394.03 - lr: 0.000009 - momentum: 0.000000 2023-10-17 20:58:12,685 epoch 1 - iter 588/1476 - loss 1.16200279 - time (sec): 28.18 - samples/sec: 2378.23 - lr: 0.000012 - momentum: 0.000000 2023-10-17 20:58:19,657 epoch 1 - iter 735/1476 - loss 0.99879079 - time (sec): 35.15 - samples/sec: 2365.20 - lr: 0.000015 - momentum: 0.000000 2023-10-17 20:58:26,769 epoch 1 - iter 882/1476 - loss 0.87692872 - time (sec): 42.26 - samples/sec: 2358.50 - lr: 0.000018 - momentum: 0.000000 2023-10-17 20:58:33,988 epoch 1 - iter 1029/1476 - loss 0.78427069 - time (sec): 49.48 - samples/sec: 2354.88 - lr: 0.000021 - momentum: 0.000000 2023-10-17 20:58:40,899 epoch 1 - iter 1176/1476 - loss 0.71213649 - time (sec): 56.39 - samples/sec: 2343.92 - lr: 0.000024 - momentum: 0.000000 2023-10-17 20:58:47,798 epoch 1 - iter 1323/1476 - loss 0.65488116 - time (sec): 63.29 - samples/sec: 2353.75 - lr: 0.000027 - momentum: 0.000000 2023-10-17 20:58:55,056 epoch 1 - iter 1470/1476 - loss 0.60740596 - time (sec): 70.55 - samples/sec: 2348.91 - lr: 0.000030 - momentum: 0.000000 2023-10-17 20:58:55,367 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:58:55,367 EPOCH 1 done: loss 0.6056 - lr: 0.000030 2023-10-17 20:59:02,301 DEV : loss 0.14250747859477997 - f1-score (micro avg) 0.7374 2023-10-17 20:59:02,336 saving best model 2023-10-17 20:59:02,722 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:59:09,759 epoch 2 - iter 147/1476 - loss 0.12242529 - time (sec): 7.04 - samples/sec: 2131.27 - lr: 0.000030 - momentum: 0.000000 2023-10-17 20:59:17,375 epoch 2 - iter 294/1476 - loss 0.13734809 - time (sec): 14.65 - samples/sec: 2278.80 - lr: 0.000029 - momentum: 0.000000 2023-10-17 20:59:24,597 epoch 2 - iter 441/1476 - loss 0.14198341 - time (sec): 21.87 - samples/sec: 2309.55 - lr: 0.000029 - momentum: 0.000000 2023-10-17 20:59:32,115 epoch 2 - iter 588/1476 - loss 0.14279736 - time (sec): 29.39 - samples/sec: 2333.99 - lr: 0.000029 - momentum: 0.000000 2023-10-17 20:59:39,293 epoch 2 - iter 735/1476 - loss 0.14080823 - time (sec): 36.57 - samples/sec: 2363.39 - lr: 0.000028 - momentum: 0.000000 2023-10-17 20:59:46,457 epoch 2 - iter 882/1476 - loss 0.13838234 - time (sec): 43.73 - samples/sec: 2356.46 - lr: 0.000028 - momentum: 0.000000 2023-10-17 20:59:53,143 epoch 2 - iter 1029/1476 - loss 0.13757509 - time (sec): 50.42 - samples/sec: 2345.39 - lr: 0.000028 - momentum: 0.000000 2023-10-17 21:00:00,041 epoch 2 - iter 1176/1476 - loss 0.13702648 - time (sec): 57.32 - samples/sec: 2328.66 - lr: 0.000027 - momentum: 0.000000 2023-10-17 
2023-10-17 21:00:07,044 epoch 2 - iter 1323/1476 - loss 0.13591803 - time (sec): 64.32 - samples/sec: 2326.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:00:13,996 epoch 2 - iter 1470/1476 - loss 0.13420153 - time (sec): 71.27 - samples/sec: 2326.32 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:00:14,273 ----------------------------------------------------------------------------------------------------
2023-10-17 21:00:14,273 EPOCH 2 done: loss 0.1341 - lr: 0.000027
2023-10-17 21:00:25,898 DEV : loss 0.12737122178077698 - f1-score (micro avg) 0.8259
2023-10-17 21:00:25,943 saving best model
2023-10-17 21:00:26,419 ----------------------------------------------------------------------------------------------------
2023-10-17 21:00:33,428 epoch 3 - iter 147/1476 - loss 0.07934414 - time (sec): 7.01 - samples/sec: 2159.00 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:00:40,523 epoch 3 - iter 294/1476 - loss 0.09609900 - time (sec): 14.10 - samples/sec: 2172.43 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:00:47,796 epoch 3 - iter 441/1476 - loss 0.09164912 - time (sec): 21.37 - samples/sec: 2234.61 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:00:55,154 epoch 3 - iter 588/1476 - loss 0.09089120 - time (sec): 28.73 - samples/sec: 2268.48 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:01:02,256 epoch 3 - iter 735/1476 - loss 0.08339538 - time (sec): 35.84 - samples/sec: 2250.66 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:01:09,257 epoch 3 - iter 882/1476 - loss 0.08902747 - time (sec): 42.84 - samples/sec: 2262.89 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:01:16,813 epoch 3 - iter 1029/1476 - loss 0.08671360 - time (sec): 50.39 - samples/sec: 2294.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:01:23,764 epoch 3 - iter 1176/1476 - loss 0.08659255 - time (sec): 57.34 - samples/sec: 2313.83 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:01:30,803 epoch 3 - iter 1323/1476 - loss 0.08351285 - time (sec): 64.38 - samples/sec: 2321.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:01:37,894 epoch 3 - iter 1470/1476 - loss 0.08434125 - time (sec): 71.47 - samples/sec: 2319.81 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:01:38,165 ----------------------------------------------------------------------------------------------------
2023-10-17 21:01:38,166 EPOCH 3 done: loss 0.0846 - lr: 0.000023
2023-10-17 21:01:49,589 DEV : loss 0.15964345633983612 - f1-score (micro avg) 0.8312
2023-10-17 21:01:49,622 saving best model
2023-10-17 21:01:50,120 ----------------------------------------------------------------------------------------------------
2023-10-17 21:01:57,454 epoch 4 - iter 147/1476 - loss 0.04444733 - time (sec): 7.33 - samples/sec: 2324.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:02:04,946 epoch 4 - iter 294/1476 - loss 0.05274934 - time (sec): 14.82 - samples/sec: 2382.82 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:02:12,021 epoch 4 - iter 441/1476 - loss 0.05372551 - time (sec): 21.90 - samples/sec: 2369.97 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:02:19,177 epoch 4 - iter 588/1476 - loss 0.05933968 - time (sec): 29.06 - samples/sec: 2359.93 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:02:26,456 epoch 4 - iter 735/1476 - loss 0.06212474 - time (sec): 36.33 - samples/sec: 2312.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:02:33,665 epoch 4 - iter 882/1476 - loss 0.06178211 - time (sec): 43.54 - samples/sec: 2307.36 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:02:40,494 epoch 4 - iter 1029/1476 - loss 0.06100964 - time (sec): 50.37 - samples/sec: 2303.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:02:47,522 epoch 4 - iter 1176/1476 - loss 0.05940440 - time (sec): 57.40 - samples/sec: 2313.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:02:54,841 epoch 4 - iter 1323/1476 - loss 0.05790407 - time (sec): 64.72 - samples/sec: 2331.95 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:03:01,613 epoch 4 - iter 1470/1476 - loss 0.05665612 - time (sec): 71.49 - samples/sec: 2319.81 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:03:01,876 ----------------------------------------------------------------------------------------------------
2023-10-17 21:03:01,876 EPOCH 4 done: loss 0.0567 - lr: 0.000020
2023-10-17 21:03:13,306 DEV : loss 0.17311468720436096 - f1-score (micro avg) 0.843
2023-10-17 21:03:13,340 saving best model
2023-10-17 21:03:13,833 ----------------------------------------------------------------------------------------------------
2023-10-17 21:03:21,330 epoch 5 - iter 147/1476 - loss 0.04282800 - time (sec): 7.49 - samples/sec: 2344.93 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:03:28,482 epoch 5 - iter 294/1476 - loss 0.04004777 - time (sec): 14.65 - samples/sec: 2342.70 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:03:36,115 epoch 5 - iter 441/1476 - loss 0.03965920 - time (sec): 22.28 - samples/sec: 2354.71 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:03:43,279 epoch 5 - iter 588/1476 - loss 0.04080899 - time (sec): 29.44 - samples/sec: 2343.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:03:50,569 epoch 5 - iter 735/1476 - loss 0.03827681 - time (sec): 36.73 - samples/sec: 2337.14 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:03:57,752 epoch 5 - iter 882/1476 - loss 0.03780090 - time (sec): 43.92 - samples/sec: 2346.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:04:04,597 epoch 5 - iter 1029/1476 - loss 0.03994119 - time (sec): 50.76 - samples/sec: 2320.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:04:11,642 epoch 5 - iter 1176/1476 - loss 0.04187246 - time (sec): 57.81 - samples/sec: 2298.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:04:18,894 epoch 5 - iter 1323/1476 - loss 0.04077018 - time (sec): 65.06 - samples/sec: 2295.16 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:04:25,992 epoch 5 - iter 1470/1476 - loss 0.04145277 - time (sec): 72.16 - samples/sec: 2299.03 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:04:26,259 ----------------------------------------------------------------------------------------------------
2023-10-17 21:04:26,259 EPOCH 5 done: loss 0.0413 - lr: 0.000017
2023-10-17 21:04:37,935 DEV : loss 0.17236292362213135 - f1-score (micro avg) 0.8485
2023-10-17 21:04:37,966 saving best model
2023-10-17 21:04:38,435 ----------------------------------------------------------------------------------------------------
2023-10-17 21:04:45,721 epoch 6 - iter 147/1476 - loss 0.02321970 - time (sec): 7.28 - samples/sec: 2247.92 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:04:52,756 epoch 6 - iter 294/1476 - loss 0.02613525 - time (sec): 14.32 - samples/sec: 2259.85 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:04:59,642 epoch 6 - iter 441/1476 - loss 0.02587034 - time (sec): 21.20 - samples/sec: 2235.49 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:05:06,890 epoch 6 - iter 588/1476 - loss 0.02499643 - time (sec): 28.45 - samples/sec: 2252.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:05:14,135 epoch 6 - iter 735/1476 - loss 0.02538178 - time (sec): 35.70 - samples/sec: 2259.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:05:21,161 epoch 6 - iter 882/1476 - loss 0.02381428 - time (sec): 42.72 - samples/sec: 2261.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:05:28,064 epoch 6 - iter 1029/1476 - loss 0.02435506 - time (sec): 49.62 - samples/sec: 2263.86 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:05:35,139 epoch 6 - iter 1176/1476 - loss 0.02457862 - time (sec): 56.70 - samples/sec: 2255.84 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:05:43,291 epoch 6 - iter 1323/1476 - loss 0.02668096 - time (sec): 64.85 - samples/sec: 2291.48 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:05:51,405 epoch 6 - iter 1470/1476 - loss 0.02593758 - time (sec): 72.96 - samples/sec: 2262.54 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:05:51,855 ----------------------------------------------------------------------------------------------------
2023-10-17 21:05:51,855 EPOCH 6 done: loss 0.0264 - lr: 0.000013
2023-10-17 21:06:03,471 DEV : loss 0.19792184233665466 - f1-score (micro avg) 0.8556
2023-10-17 21:06:03,503 saving best model
2023-10-17 21:06:03,987 ----------------------------------------------------------------------------------------------------
2023-10-17 21:06:11,091 epoch 7 - iter 147/1476 - loss 0.01811113 - time (sec): 7.10 - samples/sec: 2159.54 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:06:17,921 epoch 7 - iter 294/1476 - loss 0.01490523 - time (sec): 13.93 - samples/sec: 2252.91 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:06:25,612 epoch 7 - iter 441/1476 - loss 0.01387370 - time (sec): 21.62 - samples/sec: 2127.64 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:06:32,793 epoch 7 - iter 588/1476 - loss 0.01461327 - time (sec): 28.80 - samples/sec: 2167.02 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:06:40,082 epoch 7 - iter 735/1476 - loss 0.01603147 - time (sec): 36.09 - samples/sec: 2180.74 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:06:47,260 epoch 7 - iter 882/1476 - loss 0.01529605 - time (sec): 43.27 - samples/sec: 2231.67 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:06:54,953 epoch 7 - iter 1029/1476 - loss 0.01843807 - time (sec): 50.96 - samples/sec: 2301.89 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:07:01,899 epoch 7 - iter 1176/1476 - loss 0.01924003 - time (sec): 57.91 - samples/sec: 2299.36 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:07:08,912 epoch 7 - iter 1323/1476 - loss 0.01951725 - time (sec): 64.92 - samples/sec: 2305.74 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:07:16,073 epoch 7 - iter 1470/1476 - loss 0.01948374 - time (sec): 72.08 - samples/sec: 2303.28 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:07:16,337 ----------------------------------------------------------------------------------------------------
2023-10-17 21:07:16,337 EPOCH 7 done: loss 0.0195 - lr: 0.000010
2023-10-17 21:07:27,797 DEV : loss 0.19168895483016968 - f1-score (micro avg) 0.8477
2023-10-17 21:07:27,827 ----------------------------------------------------------------------------------------------------
2023-10-17 21:07:34,880 epoch 8 - iter 147/1476 - loss 0.00934227 - time (sec): 7.05 - samples/sec: 2231.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:07:41,878 epoch 8 - iter 294/1476 - loss 0.01061936 - time (sec): 14.05 - samples/sec: 2252.58 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:07:48,874 epoch 8 - iter 441/1476 - loss 0.00958806 - time (sec): 21.05 - samples/sec: 2265.87 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:07:55,968 epoch 8 - iter 588/1476 - loss 0.00935601 - time (sec): 28.14 - samples/sec: 2248.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:08:03,591 epoch 8 - iter 735/1476 - loss 0.01281493 - time (sec): 35.76 - samples/sec: 2346.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:08:11,023 epoch 8 - iter 882/1476 - loss 0.01188403 - time (sec): 43.19 - samples/sec: 2368.03 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:08:17,939 epoch 8 - iter 1029/1476 - loss 0.01101362 - time (sec): 50.11 - samples/sec: 2359.18 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:08:25,003 epoch 8 - iter 1176/1476 - loss 0.01167092 - time (sec): 57.17 - samples/sec: 2360.59 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:08:31,942 epoch 8 - iter 1323/1476 - loss 0.01217457 - time (sec): 64.11 - samples/sec: 2345.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:08:38,561 epoch 8 - iter 1470/1476 - loss 0.01223428 - time (sec): 70.73 - samples/sec: 2341.08 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:08:38,858 ----------------------------------------------------------------------------------------------------
2023-10-17 21:08:38,859 EPOCH 8 done: loss 0.0122 - lr: 0.000007
2023-10-17 21:08:50,366 DEV : loss 0.21457839012145996 - f1-score (micro avg) 0.8464
2023-10-17 21:08:50,399 ----------------------------------------------------------------------------------------------------
2023-10-17 21:08:57,588 epoch 9 - iter 147/1476 - loss 0.01782838 - time (sec): 7.19 - samples/sec: 2506.00 - lr: 0.000006 - momentum: 0.000000
2023-10-17 21:09:04,554 epoch 9 - iter 294/1476 - loss 0.01151611 - time (sec): 14.15 - samples/sec: 2447.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 21:09:11,392 epoch 9 - iter 441/1476 - loss 0.00978648 - time (sec): 20.99 - samples/sec: 2392.69 - lr: 0.000006 - momentum: 0.000000
2023-10-17 21:09:18,834 epoch 9 - iter 588/1476 - loss 0.00991456 - time (sec): 28.43 - samples/sec: 2375.92 - lr: 0.000005 - momentum: 0.000000
2023-10-17 21:09:25,871 epoch 9 - iter 735/1476 - loss 0.01023225 - time (sec): 35.47 - samples/sec: 2344.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 21:09:33,062 epoch 9 - iter 882/1476 - loss 0.00966054 - time (sec): 42.66 - samples/sec: 2357.99 - lr: 0.000005 - momentum: 0.000000
2023-10-17 21:09:40,162 epoch 9 - iter 1029/1476 - loss 0.01008829 - time (sec): 49.76 - samples/sec: 2357.56 - lr: 0.000004 - momentum: 0.000000
2023-10-17 21:09:47,248 epoch 9 - iter 1176/1476 - loss 0.01025552 - time (sec): 56.85 - samples/sec: 2370.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 21:09:54,306 epoch 9 - iter 1323/1476 - loss 0.00963061 - time (sec): 63.91 - samples/sec: 2359.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 21:10:01,693 epoch 9 - iter 1470/1476 - loss 0.00908092 - time (sec): 71.29 - samples/sec: 2327.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 21:10:01,964 ----------------------------------------------------------------------------------------------------
2023-10-17 21:10:01,964 EPOCH 9 done: loss 0.0091 - lr: 0.000003
2023-10-17 21:10:13,701 DEV : loss 0.2243947833776474 - f1-score (micro avg) 0.8475
2023-10-17 21:10:13,751 ----------------------------------------------------------------------------------------------------
2023-10-17 21:10:21,899 epoch 10 - iter 147/1476 - loss 0.00275969 - time (sec): 8.15 - samples/sec: 2031.82 - lr: 0.000003 - momentum: 0.000000
2023-10-17 21:10:30,507 epoch 10 - iter 294/1476 - loss 0.00846906 - time (sec): 16.75 - samples/sec: 2097.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 21:10:38,195 epoch 10 - iter 441/1476 - loss 0.00806670 - time (sec): 24.44 - samples/sec: 2080.66 - lr: 0.000002 - momentum: 0.000000
2023-10-17 21:10:45,362 epoch 10 - iter 588/1476 - loss 0.00753225 - time (sec): 31.61 - samples/sec: 2141.43 - lr: 0.000002 - momentum: 0.000000
2023-10-17 21:10:52,221 epoch 10 - iter 735/1476 - loss 0.00635976 - time (sec): 38.47 - samples/sec: 2173.07 - lr: 0.000002 - momentum: 0.000000
2023-10-17 21:10:59,111 epoch 10 - iter 882/1476 - loss 0.00703310 - time (sec): 45.36 - samples/sec: 2187.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 21:11:06,354 epoch 10 - iter 1029/1476 - loss 0.00633039 - time (sec): 52.60 - samples/sec: 2193.56 - lr: 0.000001 - momentum: 0.000000
2023-10-17 21:11:13,330 epoch 10 - iter 1176/1476 - loss 0.00586313 - time (sec): 59.58 - samples/sec: 2215.09 - lr: 0.000001 - momentum: 0.000000
2023-10-17 21:11:20,199 epoch 10 - iter 1323/1476 - loss 0.00585113 - time (sec): 66.45 - samples/sec: 2232.50 - lr: 0.000000 - momentum: 0.000000
2023-10-17 21:11:27,533 epoch 10 - iter 1470/1476 - loss 0.00538743 - time (sec): 73.78 - samples/sec: 2248.65 - lr: 0.000000 - momentum: 0.000000
2023-10-17 21:11:27,799 ----------------------------------------------------------------------------------------------------
2023-10-17 21:11:27,800 EPOCH 10 done: loss 0.0054 - lr: 0.000000
2023-10-17 21:11:39,245 DEV : loss 0.22376643121242523 - f1-score (micro avg) 0.8539
2023-10-17 21:11:39,633 ----------------------------------------------------------------------------------------------------
2023-10-17 21:11:39,634 Loading model from best epoch ...
2023-10-17 21:11:41,018 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 21:11:47,214 Results:
- F-score (micro) 0.7986
- F-score (macro) 0.7013
- Accuracy 0.6844

By class:
              precision    recall  f1-score   support

         loc     0.8515    0.8753    0.8632       858
        pers     0.7638    0.8007    0.7818       537
         org     0.5605    0.6667    0.6090       132
        time     0.5410    0.6111    0.5739        54
        prod     0.7451    0.6230    0.6786        61

   micro avg     0.7818    0.8161    0.7986      1642
   macro avg     0.6924    0.7154    0.7013      1642
weighted avg     0.7852    0.8161    0.7998      1642

2023-10-17 21:11:47,214 ----------------------------------------------------------------------------------------------------
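
Note: the test scores above are computed with the best-model.pt checkpoint, which was selected on dev micro F1 (epoch 6, 0.8556). A hypothetical usage sketch for that checkpoint follows; the path is taken from the logged training base path, the example sentence is illustrative only, and the label type is assumed to be "ner".

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint saved as best-model.pt under the training base path.
    tagger = SequenceTagger.load(
        "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
    )

    # Tag a French sentence; predictions use the BIOES scheme over
    # loc / pers / org / time / prod, as listed in the tag dictionary above.
    sentence = Sentence("Lettre adressée de Genève à Victor Hugo en 1862 .")
    tagger.predict(sentence)
    for span in sentence.get_spans("ner"):
        print(span)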