2023-10-17 20:57:44,502 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
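The module print above is a Hugging Face ELECTRA encoder wrapped by Flair's TransformerWordEmbeddings, followed by locked dropout and a 21-way linear tag head. A minimal sketch of instantiating the embedding part is shown below; the hub id is inferred from the training base path logged further down ("teams-base-historic-multilingual-discriminator") and should be treated as an assumption.

from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings

# Assumption: hub id inferred from the model training base path in this log.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",               # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

sentence = Sentence("Le Conseil fédéral siège à Berne .")
embeddings.embed(sentence)
print(sentence[0].embedding.shape)  # torch.Size([768]), matching the 768-dim encoder above
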
2023-10-17 20:57:44,503 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 Train: 5901 sentences
2023-10-17 20:57:44,503 (train_with_dev=False, train_with_test=False)
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
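A sketch of loading the corpus summarized above with Flair's built-in HIPE-2022 loader follows; the keyword names are assumptions and may differ between Flair versions (the cache path above suggests hipe2020, French, v2.1, with document separators).

from flair.datasets import NER_HIPE_2022

# Assumed keyword names; check the NER_HIPE_2022 signature of your Flair version.
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")
print(corpus)  # expected: 5901 train + 1287 dev + 1505 test sentences

label_dictionary = corpus.make_label_dictionary(label_type="ner")
print(label_dictionary)
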
2023-10-17 20:57:44,503 Training Params:
2023-10-17 20:57:44,503 - learning_rate: "3e-05"
2023-10-17 20:57:44,503 - mini_batch_size: "4"
2023-10-17 20:57:44,503 - max_epochs: "10"
2023-10-17 20:57:44,503 - shuffle: "True"
2023-10-17 20:57:44,503 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,503 Plugins:
2023-10-17 20:57:44,503 - TensorboardLogger
2023-10-17 20:57:44,504 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,504 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:57:44,504 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,504 Computation:
2023-10-17 20:57:44,504 - compute on device: cuda:0
2023-10-17 20:57:44,504 - embedding storage: none
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,504 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
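The parameters above can be reproduced with Flair's fine-tuning entry point. The following is a minimal, self-contained sketch under the assumptions already noted (hub id, loader keywords), not necessarily the exact hmbench training script; ModelTrainer.fine_tune applies a linear warmup schedule by default, which corresponds to the LinearScheduler plugin (warmup_fraction 0.1) logged above.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")  # assumed keywords
label_dictionary = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed hub id
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,              # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)
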
2023-10-17 20:57:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:44,504 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 20:57:51,411 epoch 1 - iter 147/1476 - loss 3.24267368 - time (sec): 6.91 - samples/sec: 2332.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:57:58,675 epoch 1 - iter 294/1476 - loss 1.82965327 - time (sec): 14.17 - samples/sec: 2490.20 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:58:05,460 epoch 1 - iter 441/1476 - loss 1.42129099 - time (sec): 20.96 - samples/sec: 2394.03 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:58:12,685 epoch 1 - iter 588/1476 - loss 1.16200279 - time (sec): 28.18 - samples/sec: 2378.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:58:19,657 epoch 1 - iter 735/1476 - loss 0.99879079 - time (sec): 35.15 - samples/sec: 2365.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:58:26,769 epoch 1 - iter 882/1476 - loss 0.87692872 - time (sec): 42.26 - samples/sec: 2358.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:58:33,988 epoch 1 - iter 1029/1476 - loss 0.78427069 - time (sec): 49.48 - samples/sec: 2354.88 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:58:40,899 epoch 1 - iter 1176/1476 - loss 0.71213649 - time (sec): 56.39 - samples/sec: 2343.92 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:58:47,798 epoch 1 - iter 1323/1476 - loss 0.65488116 - time (sec): 63.29 - samples/sec: 2353.75 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:58:55,056 epoch 1 - iter 1470/1476 - loss 0.60740596 - time (sec): 70.55 - samples/sec: 2348.91 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:58:55,367 ----------------------------------------------------------------------------------------------------
2023-10-17 20:58:55,367 EPOCH 1 done: loss 0.6056 - lr: 0.000030
2023-10-17 20:59:02,301 DEV : loss 0.14250747859477997 - f1-score (micro avg) 0.7374
2023-10-17 20:59:02,336 saving best model
2023-10-17 20:59:02,722 ----------------------------------------------------------------------------------------------------
2023-10-17 20:59:09,759 epoch 2 - iter 147/1476 - loss 0.12242529 - time (sec): 7.04 - samples/sec: 2131.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:59:17,375 epoch 2 - iter 294/1476 - loss 0.13734809 - time (sec): 14.65 - samples/sec: 2278.80 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:59:24,597 epoch 2 - iter 441/1476 - loss 0.14198341 - time (sec): 21.87 - samples/sec: 2309.55 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:59:32,115 epoch 2 - iter 588/1476 - loss 0.14279736 - time (sec): 29.39 - samples/sec: 2333.99 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:59:39,293 epoch 2 - iter 735/1476 - loss 0.14080823 - time (sec): 36.57 - samples/sec: 2363.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:59:46,457 epoch 2 - iter 882/1476 - loss 0.13838234 - time (sec): 43.73 - samples/sec: 2356.46 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:59:53,143 epoch 2 - iter 1029/1476 - loss 0.13757509 - time (sec): 50.42 - samples/sec: 2345.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 21:00:00,041 epoch 2 - iter 1176/1476 - loss 0.13702648 - time (sec): 57.32 - samples/sec: 2328.66 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:00:07,044 epoch 2 - iter 1323/1476 - loss 0.13591803 - time (sec): 64.32 - samples/sec: 2326.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:00:13,996 epoch 2 - iter 1470/1476 - loss 0.13420153 - time (sec): 71.27 - samples/sec: 2326.32 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:00:14,273 ----------------------------------------------------------------------------------------------------
2023-10-17 21:00:14,273 EPOCH 2 done: loss 0.1341 - lr: 0.000027
2023-10-17 21:00:25,898 DEV : loss 0.12737122178077698 - f1-score (micro avg) 0.8259
2023-10-17 21:00:25,943 saving best model
2023-10-17 21:00:26,419 ----------------------------------------------------------------------------------------------------
2023-10-17 21:00:33,428 epoch 3 - iter 147/1476 - loss 0.07934414 - time (sec): 7.01 - samples/sec: 2159.00 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:00:40,523 epoch 3 - iter 294/1476 - loss 0.09609900 - time (sec): 14.10 - samples/sec: 2172.43 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:00:47,796 epoch 3 - iter 441/1476 - loss 0.09164912 - time (sec): 21.37 - samples/sec: 2234.61 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:00:55,154 epoch 3 - iter 588/1476 - loss 0.09089120 - time (sec): 28.73 - samples/sec: 2268.48 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:01:02,256 epoch 3 - iter 735/1476 - loss 0.08339538 - time (sec): 35.84 - samples/sec: 2250.66 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:01:09,257 epoch 3 - iter 882/1476 - loss 0.08902747 - time (sec): 42.84 - samples/sec: 2262.89 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:01:16,813 epoch 3 - iter 1029/1476 - loss 0.08671360 - time (sec): 50.39 - samples/sec: 2294.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:01:23,764 epoch 3 - iter 1176/1476 - loss 0.08659255 - time (sec): 57.34 - samples/sec: 2313.83 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:01:30,803 epoch 3 - iter 1323/1476 - loss 0.08351285 - time (sec): 64.38 - samples/sec: 2321.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:01:37,894 epoch 3 - iter 1470/1476 - loss 0.08434125 - time (sec): 71.47 - samples/sec: 2319.81 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:01:38,165 ----------------------------------------------------------------------------------------------------
2023-10-17 21:01:38,166 EPOCH 3 done: loss 0.0846 - lr: 0.000023
2023-10-17 21:01:49,589 DEV : loss 0.15964345633983612 - f1-score (micro avg) 0.8312
2023-10-17 21:01:49,622 saving best model
2023-10-17 21:01:50,120 ----------------------------------------------------------------------------------------------------
2023-10-17 21:01:57,454 epoch 4 - iter 147/1476 - loss 0.04444733 - time (sec): 7.33 - samples/sec: 2324.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:02:04,946 epoch 4 - iter 294/1476 - loss 0.05274934 - time (sec): 14.82 - samples/sec: 2382.82 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:02:12,021 epoch 4 - iter 441/1476 - loss 0.05372551 - time (sec): 21.90 - samples/sec: 2369.97 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:02:19,177 epoch 4 - iter 588/1476 - loss 0.05933968 - time (sec): 29.06 - samples/sec: 2359.93 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:02:26,456 epoch 4 - iter 735/1476 - loss 0.06212474 - time (sec): 36.33 - samples/sec: 2312.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:02:33,665 epoch 4 - iter 882/1476 - loss 0.06178211 - time (sec): 43.54 - samples/sec: 2307.36 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:02:40,494 epoch 4 - iter 1029/1476 - loss 0.06100964 - time (sec): 50.37 - samples/sec: 2303.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:02:47,522 epoch 4 - iter 1176/1476 - loss 0.05940440 - time (sec): 57.40 - samples/sec: 2313.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:02:54,841 epoch 4 - iter 1323/1476 - loss 0.05790407 - time (sec): 64.72 - samples/sec: 2331.95 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:03:01,613 epoch 4 - iter 1470/1476 - loss 0.05665612 - time (sec): 71.49 - samples/sec: 2319.81 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:03:01,876 ----------------------------------------------------------------------------------------------------
2023-10-17 21:03:01,876 EPOCH 4 done: loss 0.0567 - lr: 0.000020
2023-10-17 21:03:13,306 DEV : loss 0.17311468720436096 - f1-score (micro avg) 0.843
2023-10-17 21:03:13,340 saving best model
2023-10-17 21:03:13,833 ----------------------------------------------------------------------------------------------------
2023-10-17 21:03:21,330 epoch 5 - iter 147/1476 - loss 0.04282800 - time (sec): 7.49 - samples/sec: 2344.93 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:03:28,482 epoch 5 - iter 294/1476 - loss 0.04004777 - time (sec): 14.65 - samples/sec: 2342.70 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:03:36,115 epoch 5 - iter 441/1476 - loss 0.03965920 - time (sec): 22.28 - samples/sec: 2354.71 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:03:43,279 epoch 5 - iter 588/1476 - loss 0.04080899 - time (sec): 29.44 - samples/sec: 2343.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:03:50,569 epoch 5 - iter 735/1476 - loss 0.03827681 - time (sec): 36.73 - samples/sec: 2337.14 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:03:57,752 epoch 5 - iter 882/1476 - loss 0.03780090 - time (sec): 43.92 - samples/sec: 2346.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:04:04,597 epoch 5 - iter 1029/1476 - loss 0.03994119 - time (sec): 50.76 - samples/sec: 2320.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:04:11,642 epoch 5 - iter 1176/1476 - loss 0.04187246 - time (sec): 57.81 - samples/sec: 2298.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:04:18,894 epoch 5 - iter 1323/1476 - loss 0.04077018 - time (sec): 65.06 - samples/sec: 2295.16 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:04:25,992 epoch 5 - iter 1470/1476 - loss 0.04145277 - time (sec): 72.16 - samples/sec: 2299.03 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:04:26,259 ----------------------------------------------------------------------------------------------------
2023-10-17 21:04:26,259 EPOCH 5 done: loss 0.0413 - lr: 0.000017
2023-10-17 21:04:37,935 DEV : loss 0.17236292362213135 - f1-score (micro avg) 0.8485
2023-10-17 21:04:37,966 saving best model
2023-10-17 21:04:38,435 ----------------------------------------------------------------------------------------------------
2023-10-17 21:04:45,721 epoch 6 - iter 147/1476 - loss 0.02321970 - time (sec): 7.28 - samples/sec: 2247.92 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:04:52,756 epoch 6 - iter 294/1476 - loss 0.02613525 - time (sec): 14.32 - samples/sec: 2259.85 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:04:59,642 epoch 6 - iter 441/1476 - loss 0.02587034 - time (sec): 21.20 - samples/sec: 2235.49 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:05:06,890 epoch 6 - iter 588/1476 - loss 0.02499643 - time (sec): 28.45 - samples/sec: 2252.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:05:14,135 epoch 6 - iter 735/1476 - loss 0.02538178 - time (sec): 35.70 - samples/sec: 2259.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:05:21,161 epoch 6 - iter 882/1476 - loss 0.02381428 - time (sec): 42.72 - samples/sec: 2261.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:05:28,064 epoch 6 - iter 1029/1476 - loss 0.02435506 - time (sec): 49.62 - samples/sec: 2263.86 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:05:35,139 epoch 6 - iter 1176/1476 - loss 0.02457862 - time (sec): 56.70 - samples/sec: 2255.84 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:05:43,291 epoch 6 - iter 1323/1476 - loss 0.02668096 - time (sec): 64.85 - samples/sec: 2291.48 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:05:51,405 epoch 6 - iter 1470/1476 - loss 0.02593758 - time (sec): 72.96 - samples/sec: 2262.54 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:05:51,855 ----------------------------------------------------------------------------------------------------
2023-10-17 21:05:51,855 EPOCH 6 done: loss 0.0264 - lr: 0.000013
2023-10-17 21:06:03,471 DEV : loss 0.19792184233665466 - f1-score (micro avg) 0.8556
2023-10-17 21:06:03,503 saving best model
2023-10-17 21:06:03,987 ----------------------------------------------------------------------------------------------------
2023-10-17 21:06:11,091 epoch 7 - iter 147/1476 - loss 0.01811113 - time (sec): 7.10 - samples/sec: 2159.54 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:06:17,921 epoch 7 - iter 294/1476 - loss 0.01490523 - time (sec): 13.93 - samples/sec: 2252.91 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:06:25,612 epoch 7 - iter 441/1476 - loss 0.01387370 - time (sec): 21.62 - samples/sec: 2127.64 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:06:32,793 epoch 7 - iter 588/1476 - loss 0.01461327 - time (sec): 28.80 - samples/sec: 2167.02 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:06:40,082 epoch 7 - iter 735/1476 - loss 0.01603147 - time (sec): 36.09 - samples/sec: 2180.74 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:06:47,260 epoch 7 - iter 882/1476 - loss 0.01529605 - time (sec): 43.27 - samples/sec: 2231.67 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:06:54,953 epoch 7 - iter 1029/1476 - loss 0.01843807 - time (sec): 50.96 - samples/sec: 2301.89 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:07:01,899 epoch 7 - iter 1176/1476 - loss 0.01924003 - time (sec): 57.91 - samples/sec: 2299.36 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:07:08,912 epoch 7 - iter 1323/1476 - loss 0.01951725 - time (sec): 64.92 - samples/sec: 2305.74 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:07:16,073 epoch 7 - iter 1470/1476 - loss 0.01948374 - time (sec): 72.08 - samples/sec: 2303.28 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:07:16,337 ----------------------------------------------------------------------------------------------------
2023-10-17 21:07:16,337 EPOCH 7 done: loss 0.0195 - lr: 0.000010
2023-10-17 21:07:27,797 DEV : loss 0.19168895483016968 - f1-score (micro avg) 0.8477
2023-10-17 21:07:27,827 ----------------------------------------------------------------------------------------------------
2023-10-17 21:07:34,880 epoch 8 - iter 147/1476 - loss 0.00934227 - time (sec): 7.05 - samples/sec: 2231.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:07:41,878 epoch 8 - iter 294/1476 - loss 0.01061936 - time (sec): 14.05 - samples/sec: 2252.58 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:07:48,874 epoch 8 - iter 441/1476 - loss 0.00958806 - time (sec): 21.05 - samples/sec: 2265.87 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:07:55,968 epoch 8 - iter 588/1476 - loss 0.00935601 - time (sec): 28.14 - samples/sec: 2248.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:08:03,591 epoch 8 - iter 735/1476 - loss 0.01281493 - time (sec): 35.76 - samples/sec: 2346.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:08:11,023 epoch 8 - iter 882/1476 - loss 0.01188403 - time (sec): 43.19 - samples/sec: 2368.03 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:08:17,939 epoch 8 - iter 1029/1476 - loss 0.01101362 - time (sec): 50.11 - samples/sec: 2359.18 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:08:25,003 epoch 8 - iter 1176/1476 - loss 0.01167092 - time (sec): 57.17 - samples/sec: 2360.59 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:08:31,942 epoch 8 - iter 1323/1476 - loss 0.01217457 - time (sec): 64.11 - samples/sec: 2345.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:08:38,561 epoch 8 - iter 1470/1476 - loss 0.01223428 - time (sec): 70.73 - samples/sec: 2341.08 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:08:38,858 ----------------------------------------------------------------------------------------------------
2023-10-17 21:08:38,859 EPOCH 8 done: loss 0.0122 - lr: 0.000007
2023-10-17 21:08:50,366 DEV : loss 0.21457839012145996 - f1-score (micro avg) 0.8464
2023-10-17 21:08:50,399 ----------------------------------------------------------------------------------------------------
2023-10-17 21:08:57,588 epoch 9 - iter 147/1476 - loss 0.01782838 - time (sec): 7.19 - samples/sec: 2506.00 - lr: 0.000006 - momentum: 0.000000
2023-10-17 21:09:04,554 epoch 9 - iter 294/1476 - loss 0.01151611 - time (sec): 14.15 - samples/sec: 2447.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 21:09:11,392 epoch 9 - iter 441/1476 - loss 0.00978648 - time (sec): 20.99 - samples/sec: 2392.69 - lr: 0.000006 - momentum: 0.000000
2023-10-17 21:09:18,834 epoch 9 - iter 588/1476 - loss 0.00991456 - time (sec): 28.43 - samples/sec: 2375.92 - lr: 0.000005 - momentum: 0.000000
2023-10-17 21:09:25,871 epoch 9 - iter 735/1476 - loss 0.01023225 - time (sec): 35.47 - samples/sec: 2344.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 21:09:33,062 epoch 9 - iter 882/1476 - loss 0.00966054 - time (sec): 42.66 - samples/sec: 2357.99 - lr: 0.000005 - momentum: 0.000000
2023-10-17 21:09:40,162 epoch 9 - iter 1029/1476 - loss 0.01008829 - time (sec): 49.76 - samples/sec: 2357.56 - lr: 0.000004 - momentum: 0.000000
2023-10-17 21:09:47,248 epoch 9 - iter 1176/1476 - loss 0.01025552 - time (sec): 56.85 - samples/sec: 2370.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 21:09:54,306 epoch 9 - iter 1323/1476 - loss 0.00963061 - time (sec): 63.91 - samples/sec: 2359.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 21:10:01,693 epoch 9 - iter 1470/1476 - loss 0.00908092 - time (sec): 71.29 - samples/sec: 2327.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 21:10:01,964 ----------------------------------------------------------------------------------------------------
2023-10-17 21:10:01,964 EPOCH 9 done: loss 0.0091 - lr: 0.000003
2023-10-17 21:10:13,701 DEV : loss 0.2243947833776474 - f1-score (micro avg) 0.8475
2023-10-17 21:10:13,751 ----------------------------------------------------------------------------------------------------
2023-10-17 21:10:21,899 epoch 10 - iter 147/1476 - loss 0.00275969 - time (sec): 8.15 - samples/sec: 2031.82 - lr: 0.000003 - momentum: 0.000000
2023-10-17 21:10:30,507 epoch 10 - iter 294/1476 - loss 0.00846906 - time (sec): 16.75 - samples/sec: 2097.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 21:10:38,195 epoch 10 - iter 441/1476 - loss 0.00806670 - time (sec): 24.44 - samples/sec: 2080.66 - lr: 0.000002 - momentum: 0.000000
2023-10-17 21:10:45,362 epoch 10 - iter 588/1476 - loss 0.00753225 - time (sec): 31.61 - samples/sec: 2141.43 - lr: 0.000002 - momentum: 0.000000
2023-10-17 21:10:52,221 epoch 10 - iter 735/1476 - loss 0.00635976 - time (sec): 38.47 - samples/sec: 2173.07 - lr: 0.000002 - momentum: 0.000000
2023-10-17 21:10:59,111 epoch 10 - iter 882/1476 - loss 0.00703310 - time (sec): 45.36 - samples/sec: 2187.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 21:11:06,354 epoch 10 - iter 1029/1476 - loss 0.00633039 - time (sec): 52.60 - samples/sec: 2193.56 - lr: 0.000001 - momentum: 0.000000
2023-10-17 21:11:13,330 epoch 10 - iter 1176/1476 - loss 0.00586313 - time (sec): 59.58 - samples/sec: 2215.09 - lr: 0.000001 - momentum: 0.000000
2023-10-17 21:11:20,199 epoch 10 - iter 1323/1476 - loss 0.00585113 - time (sec): 66.45 - samples/sec: 2232.50 - lr: 0.000000 - momentum: 0.000000
2023-10-17 21:11:27,533 epoch 10 - iter 1470/1476 - loss 0.00538743 - time (sec): 73.78 - samples/sec: 2248.65 - lr: 0.000000 - momentum: 0.000000
2023-10-17 21:11:27,799 ----------------------------------------------------------------------------------------------------
2023-10-17 21:11:27,800 EPOCH 10 done: loss 0.0054 - lr: 0.000000
2023-10-17 21:11:39,245 DEV : loss 0.22376643121242523 - f1-score (micro avg) 0.8539
2023-10-17 21:11:39,633 ----------------------------------------------------------------------------------------------------
2023-10-17 21:11:39,634 Loading model from best epoch ...
2023-10-17 21:11:41,018 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 21:11:47,214
Results:
- F-score (micro) 0.7986
- F-score (macro) 0.7013
- Accuracy 0.6844
By class:
              precision    recall  f1-score   support

         loc     0.8515    0.8753    0.8632       858
        pers     0.7638    0.8007    0.7818       537
         org     0.5605    0.6667    0.6090       132
        time     0.5410    0.6111    0.5739        54
        prod     0.7451    0.6230    0.6786        61

   micro avg     0.7818    0.8161    0.7986      1642
   macro avg     0.6924    0.7154    0.7013      1642
weighted avg     0.7852    0.8161    0.7998      1642
2023-10-17 21:11:47,214 ----------------------------------------------------------------------------------------------------
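
The best checkpoint (micro F1 0.7986 on the test split above) can be loaded for inference as sketched below; the local path is an assumption and should point to wherever best-model.pt (or final-model.pt) was written.

from flair.data import Sentence
from flair.models import SequenceTagger

# Assumption: checkpoint path under the training base path logged above.
tagger = SequenceTagger.load(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("Le général Dufour est né à Constance .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)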