2023-10-17 22:40:09,832 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,833 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,834 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,834 Train: 5901 sentences
2023-10-17 22:40:09,834 (train_with_dev=False, train_with_test=False)
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
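For reference, here is a minimal sketch of how a corpus and model matching the printouts above could be assembled with Flair. The checkpoint name and the embedding flags (first-subtoken pooling, last layer only, no CRF) are inferred from the training base path logged further below, so treat this as an assumption-laden reconstruction rather than the original training script.

# Hedged reconstruction sketch, not the original training script.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2022 hipe2020/fr corpus with document separators, as logged above
# (the add_document_separator flag name is assumed from recent Flair versions)
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")  # 21 BIOES tags incl. O

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed checkpoint
    layers="-1",               # last ELECTRA layer only
    subtoken_pooling="first",  # first-subtoken pooling
    fine_tune=True,            # transformer weights are updated during training
)

tagger = SequenceTagger(
    hidden_size=256,            # only relevant when an RNN is used; none here
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # plain Linear(768 -> 21) + CrossEntropyLoss head, as printed above
    use_rnn=False,
    reproject_embeddings=False,
)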
2023-10-17 22:40:09,834 Training Params:
2023-10-17 22:40:09,834 - learning_rate: "3e-05"
2023-10-17 22:40:09,834 - mini_batch_size: "4"
2023-10-17 22:40:09,834 - max_epochs: "10"
2023-10-17 22:40:09,834 - shuffle: "True"
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,834 Plugins:
2023-10-17 22:40:09,834 - TensorboardLogger
2023-10-17 22:40:09,834 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,834 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 22:40:09,834 - metric: "('micro avg', 'f1-score')"
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,834 Computation:
2023-10-17 22:40:09,834 - compute on device: cuda:0
2023-10-17 22:40:09,834 - embedding storage: none
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,834 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
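A hedged sketch of the corresponding fine-tuning call with the hyperparameters logged above; corpus and tagger are taken from the sketch earlier in this log. In recent Flair versions, fine_tune() applies a linear learning-rate schedule with warmup (fraction 0.1) by default, which matches the LinearScheduler plugin listed above; the TensorboardLogger plugin is assumed to have been attached through the trainer's plugin mechanism.

# Hedged sketch of the fine-tuning call matching the logged training parameters.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)  # tagger/corpus from the earlier sketch

trainer.fine_tune(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-5,   # "3e-05" above
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
)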
2023-10-17 22:40:09,834 ----------------------------------------------------------------------------------------------------
2023-10-17 22:40:09,835 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 22:40:17,519 epoch 1 - iter 147/1476 - loss 3.07750723 - time (sec): 7.68 - samples/sec: 2206.71 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:40:24,336 epoch 1 - iter 294/1476 - loss 1.95459894 - time (sec): 14.50 - samples/sec: 2196.26 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:40:31,314 epoch 1 - iter 441/1476 - loss 1.47363471 - time (sec): 21.48 - samples/sec: 2197.04 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:40:38,542 epoch 1 - iter 588/1476 - loss 1.18167349 - time (sec): 28.71 - samples/sec: 2238.56 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:40:46,134 epoch 1 - iter 735/1476 - loss 0.98564526 - time (sec): 36.30 - samples/sec: 2278.43 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:40:53,423 epoch 1 - iter 882/1476 - loss 0.86421617 - time (sec): 43.59 - samples/sec: 2289.40 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:41:00,627 epoch 1 - iter 1029/1476 - loss 0.77931638 - time (sec): 50.79 - samples/sec: 2268.73 - lr: 0.000021 - momentum: 0.000000
2023-10-17 22:41:07,707 epoch 1 - iter 1176/1476 - loss 0.70386191 - time (sec): 57.87 - samples/sec: 2277.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:41:15,069 epoch 1 - iter 1323/1476 - loss 0.64445259 - time (sec): 65.23 - samples/sec: 2276.79 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:41:22,297 epoch 1 - iter 1470/1476 - loss 0.59470038 - time (sec): 72.46 - samples/sec: 2291.42 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:41:22,569 ----------------------------------------------------------------------------------------------------
2023-10-17 22:41:22,570 EPOCH 1 done: loss 0.5940 - lr: 0.000030
2023-10-17 22:41:29,319 DEV : loss 0.16735684871673584 - f1-score (micro avg) 0.7058
2023-10-17 22:41:29,348 saving best model
2023-10-17 22:41:29,779 ----------------------------------------------------------------------------------------------------
2023-10-17 22:41:37,266 epoch 2 - iter 147/1476 - loss 0.15072771 - time (sec): 7.49 - samples/sec: 2469.71 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:41:44,582 epoch 2 - iter 294/1476 - loss 0.14746518 - time (sec): 14.80 - samples/sec: 2331.23 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:41:51,851 epoch 2 - iter 441/1476 - loss 0.13894507 - time (sec): 22.07 - samples/sec: 2291.64 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:41:59,061 epoch 2 - iter 588/1476 - loss 0.13021866 - time (sec): 29.28 - samples/sec: 2291.07 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:42:05,987 epoch 2 - iter 735/1476 - loss 0.12984065 - time (sec): 36.21 - samples/sec: 2253.87 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:42:13,348 epoch 2 - iter 882/1476 - loss 0.12804986 - time (sec): 43.57 - samples/sec: 2245.49 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:42:20,395 epoch 2 - iter 1029/1476 - loss 0.12990719 - time (sec): 50.61 - samples/sec: 2238.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:42:27,576 epoch 2 - iter 1176/1476 - loss 0.12828340 - time (sec): 57.80 - samples/sec: 2238.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:42:35,053 epoch 2 - iter 1323/1476 - loss 0.12942787 - time (sec): 65.27 - samples/sec: 2237.59 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:42:42,936 epoch 2 - iter 1470/1476 - loss 0.12860067 - time (sec): 73.16 - samples/sec: 2244.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:42:43,494 ----------------------------------------------------------------------------------------------------
2023-10-17 22:42:43,495 EPOCH 2 done: loss 0.1275 - lr: 0.000027
2023-10-17 22:42:55,310 DEV : loss 0.12050662934780121 - f1-score (micro avg) 0.8191
2023-10-17 22:42:55,343 saving best model
2023-10-17 22:42:55,872 ----------------------------------------------------------------------------------------------------
2023-10-17 22:43:03,355 epoch 3 - iter 147/1476 - loss 0.07089900 - time (sec): 7.48 - samples/sec: 2486.71 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:43:10,357 epoch 3 - iter 294/1476 - loss 0.07506524 - time (sec): 14.48 - samples/sec: 2406.72 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:43:17,277 epoch 3 - iter 441/1476 - loss 0.07835628 - time (sec): 21.40 - samples/sec: 2319.71 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:43:24,328 epoch 3 - iter 588/1476 - loss 0.07833308 - time (sec): 28.45 - samples/sec: 2315.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:43:31,759 epoch 3 - iter 735/1476 - loss 0.07691365 - time (sec): 35.88 - samples/sec: 2265.75 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:43:39,308 epoch 3 - iter 882/1476 - loss 0.07517658 - time (sec): 43.43 - samples/sec: 2271.90 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:43:46,575 epoch 3 - iter 1029/1476 - loss 0.07815255 - time (sec): 50.70 - samples/sec: 2295.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:43:53,550 epoch 3 - iter 1176/1476 - loss 0.07835946 - time (sec): 57.68 - samples/sec: 2282.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:44:00,946 epoch 3 - iter 1323/1476 - loss 0.07697543 - time (sec): 65.07 - samples/sec: 2273.40 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:44:08,761 epoch 3 - iter 1470/1476 - loss 0.07545738 - time (sec): 72.89 - samples/sec: 2274.74 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:44:09,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:44:09,031 EPOCH 3 done: loss 0.0753 - lr: 0.000023
2023-10-17 22:44:20,781 DEV : loss 0.15762658417224884 - f1-score (micro avg) 0.8294
2023-10-17 22:44:20,813 saving best model
2023-10-17 22:44:21,236 ----------------------------------------------------------------------------------------------------
2023-10-17 22:44:28,279 epoch 4 - iter 147/1476 - loss 0.04836565 - time (sec): 7.04 - samples/sec: 2203.22 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:44:35,324 epoch 4 - iter 294/1476 - loss 0.04815048 - time (sec): 14.09 - samples/sec: 2226.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:44:42,611 epoch 4 - iter 441/1476 - loss 0.04592847 - time (sec): 21.37 - samples/sec: 2218.10 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:44:50,431 epoch 4 - iter 588/1476 - loss 0.04935296 - time (sec): 29.19 - samples/sec: 2290.53 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:44:57,584 epoch 4 - iter 735/1476 - loss 0.05072599 - time (sec): 36.35 - samples/sec: 2307.10 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:45:04,847 epoch 4 - iter 882/1476 - loss 0.05252931 - time (sec): 43.61 - samples/sec: 2269.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 22:45:11,951 epoch 4 - iter 1029/1476 - loss 0.05279802 - time (sec): 50.71 - samples/sec: 2271.93 - lr: 0.000021 - momentum: 0.000000
2023-10-17 22:45:19,180 epoch 4 - iter 1176/1476 - loss 0.05225997 - time (sec): 57.94 - samples/sec: 2263.76 - lr: 0.000021 - momentum: 0.000000
2023-10-17 22:45:26,359 epoch 4 - iter 1323/1476 - loss 0.05271834 - time (sec): 65.12 - samples/sec: 2260.40 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:45:34,049 epoch 4 - iter 1470/1476 - loss 0.05452585 - time (sec): 72.81 - samples/sec: 2278.00 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:45:34,354 ----------------------------------------------------------------------------------------------------
2023-10-17 22:45:34,354 EPOCH 4 done: loss 0.0544 - lr: 0.000020
2023-10-17 22:45:45,923 DEV : loss 0.1822829395532608 - f1-score (micro avg) 0.8332
2023-10-17 22:45:45,954 saving best model
2023-10-17 22:45:46,477 ----------------------------------------------------------------------------------------------------
2023-10-17 22:45:53,531 epoch 5 - iter 147/1476 - loss 0.03087592 - time (sec): 7.05 - samples/sec: 2257.42 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:46:00,812 epoch 5 - iter 294/1476 - loss 0.03510467 - time (sec): 14.33 - samples/sec: 2199.75 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:46:08,156 epoch 5 - iter 441/1476 - loss 0.03365358 - time (sec): 21.67 - samples/sec: 2227.65 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:46:15,353 epoch 5 - iter 588/1476 - loss 0.03434837 - time (sec): 28.87 - samples/sec: 2274.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:46:22,756 epoch 5 - iter 735/1476 - loss 0.03685440 - time (sec): 36.27 - samples/sec: 2283.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:46:30,304 epoch 5 - iter 882/1476 - loss 0.03825794 - time (sec): 43.82 - samples/sec: 2271.52 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:46:37,448 epoch 5 - iter 1029/1476 - loss 0.03822057 - time (sec): 50.97 - samples/sec: 2265.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:46:45,497 epoch 5 - iter 1176/1476 - loss 0.03924243 - time (sec): 59.02 - samples/sec: 2293.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:46:52,518 epoch 5 - iter 1323/1476 - loss 0.03934298 - time (sec): 66.04 - samples/sec: 2280.42 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:46:59,295 epoch 5 - iter 1470/1476 - loss 0.03882360 - time (sec): 72.81 - samples/sec: 2278.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:46:59,561 ----------------------------------------------------------------------------------------------------
2023-10-17 22:46:59,561 EPOCH 5 done: loss 0.0388 - lr: 0.000017
2023-10-17 22:47:11,162 DEV : loss 0.18621014058589935 - f1-score (micro avg) 0.8379
2023-10-17 22:47:11,197 saving best model
2023-10-17 22:47:11,723 ----------------------------------------------------------------------------------------------------
2023-10-17 22:47:18,892 epoch 6 - iter 147/1476 - loss 0.02770414 - time (sec): 7.17 - samples/sec: 2264.60 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:47:25,711 epoch 6 - iter 294/1476 - loss 0.02851307 - time (sec): 13.99 - samples/sec: 2323.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:47:33,069 epoch 6 - iter 441/1476 - loss 0.02486202 - time (sec): 21.34 - samples/sec: 2347.40 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:47:39,956 epoch 6 - iter 588/1476 - loss 0.02524968 - time (sec): 28.23 - samples/sec: 2325.82 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:47:47,038 epoch 6 - iter 735/1476 - loss 0.02636018 - time (sec): 35.31 - samples/sec: 2305.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:47:54,645 epoch 6 - iter 882/1476 - loss 0.02920486 - time (sec): 42.92 - samples/sec: 2311.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:48:01,797 epoch 6 - iter 1029/1476 - loss 0.02934640 - time (sec): 50.07 - samples/sec: 2289.23 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:48:09,059 epoch 6 - iter 1176/1476 - loss 0.02885883 - time (sec): 57.33 - samples/sec: 2292.72 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:48:16,703 epoch 6 - iter 1323/1476 - loss 0.02769546 - time (sec): 64.98 - samples/sec: 2304.69 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:48:23,801 epoch 6 - iter 1470/1476 - loss 0.02636714 - time (sec): 72.08 - samples/sec: 2300.74 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:48:24,067 ----------------------------------------------------------------------------------------------------
2023-10-17 22:48:24,067 EPOCH 6 done: loss 0.0263 - lr: 0.000013
2023-10-17 22:48:35,658 DEV : loss 0.20406724512577057 - f1-score (micro avg) 0.8511
2023-10-17 22:48:35,689 saving best model
2023-10-17 22:48:36,271 ----------------------------------------------------------------------------------------------------
2023-10-17 22:48:43,803 epoch 7 - iter 147/1476 - loss 0.02205793 - time (sec): 7.53 - samples/sec: 2524.67 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:48:50,958 epoch 7 - iter 294/1476 - loss 0.02143969 - time (sec): 14.69 - samples/sec: 2419.88 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:48:57,854 epoch 7 - iter 441/1476 - loss 0.02028760 - time (sec): 21.58 - samples/sec: 2377.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:49:05,154 epoch 7 - iter 588/1476 - loss 0.01741992 - time (sec): 28.88 - samples/sec: 2373.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:49:11,967 epoch 7 - iter 735/1476 - loss 0.01641821 - time (sec): 35.69 - samples/sec: 2342.40 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:49:19,144 epoch 7 - iter 882/1476 - loss 0.01806466 - time (sec): 42.87 - samples/sec: 2332.93 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:49:26,261 epoch 7 - iter 1029/1476 - loss 0.01822509 - time (sec): 49.99 - samples/sec: 2333.26 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:49:33,408 epoch 7 - iter 1176/1476 - loss 0.01920481 - time (sec): 57.14 - samples/sec: 2326.48 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:49:40,792 epoch 7 - iter 1323/1476 - loss 0.01923333 - time (sec): 64.52 - samples/sec: 2314.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:49:47,921 epoch 7 - iter 1470/1476 - loss 0.01855287 - time (sec): 71.65 - samples/sec: 2312.81 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:49:48,208 ----------------------------------------------------------------------------------------------------
2023-10-17 22:49:48,208 EPOCH 7 done: loss 0.0185 - lr: 0.000010
2023-10-17 22:50:00,146 DEV : loss 0.2245376855134964 - f1-score (micro avg) 0.8306
2023-10-17 22:50:00,178 ----------------------------------------------------------------------------------------------------
2023-10-17 22:50:07,857 epoch 8 - iter 147/1476 - loss 0.01592872 - time (sec): 7.68 - samples/sec: 2498.55 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:50:15,185 epoch 8 - iter 294/1476 - loss 0.01652058 - time (sec): 15.01 - samples/sec: 2318.44 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:50:23,579 epoch 8 - iter 441/1476 - loss 0.01518762 - time (sec): 23.40 - samples/sec: 2257.87 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:50:31,110 epoch 8 - iter 588/1476 - loss 0.01343599 - time (sec): 30.93 - samples/sec: 2264.61 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:50:38,254 epoch 8 - iter 735/1476 - loss 0.01329444 - time (sec): 38.07 - samples/sec: 2262.48 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:50:45,519 epoch 8 - iter 882/1476 - loss 0.01390326 - time (sec): 45.34 - samples/sec: 2239.01 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:50:52,644 epoch 8 - iter 1029/1476 - loss 0.01339450 - time (sec): 52.46 - samples/sec: 2242.96 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:50:59,897 epoch 8 - iter 1176/1476 - loss 0.01415263 - time (sec): 59.72 - samples/sec: 2249.11 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:51:07,040 epoch 8 - iter 1323/1476 - loss 0.01348761 - time (sec): 66.86 - samples/sec: 2248.82 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:51:14,049 epoch 8 - iter 1470/1476 - loss 0.01308090 - time (sec): 73.87 - samples/sec: 2245.12 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:51:14,314 ----------------------------------------------------------------------------------------------------
2023-10-17 22:51:14,314 EPOCH 8 done: loss 0.0130 - lr: 0.000007
2023-10-17 22:51:25,969 DEV : loss 0.21830753982067108 - f1-score (micro avg) 0.8335
2023-10-17 22:51:26,000 ----------------------------------------------------------------------------------------------------
2023-10-17 22:51:33,103 epoch 9 - iter 147/1476 - loss 0.00986709 - time (sec): 7.10 - samples/sec: 2119.93 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:51:40,262 epoch 9 - iter 294/1476 - loss 0.00760639 - time (sec): 14.26 - samples/sec: 2184.15 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:51:48,382 epoch 9 - iter 441/1476 - loss 0.01190210 - time (sec): 22.38 - samples/sec: 2319.69 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:51:55,409 epoch 9 - iter 588/1476 - loss 0.01052885 - time (sec): 29.41 - samples/sec: 2261.66 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:52:02,664 epoch 9 - iter 735/1476 - loss 0.00969757 - time (sec): 36.66 - samples/sec: 2243.39 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:52:10,084 epoch 9 - iter 882/1476 - loss 0.00968684 - time (sec): 44.08 - samples/sec: 2280.73 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:52:17,065 epoch 9 - iter 1029/1476 - loss 0.00877958 - time (sec): 51.06 - samples/sec: 2263.45 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:52:24,448 epoch 9 - iter 1176/1476 - loss 0.01011816 - time (sec): 58.45 - samples/sec: 2271.94 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:52:31,842 epoch 9 - iter 1323/1476 - loss 0.00976704 - time (sec): 65.84 - samples/sec: 2272.64 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:52:38,763 epoch 9 - iter 1470/1476 - loss 0.00911628 - time (sec): 72.76 - samples/sec: 2275.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:52:39,069 ----------------------------------------------------------------------------------------------------
2023-10-17 22:52:39,069 EPOCH 9 done: loss 0.0091 - lr: 0.000003
2023-10-17 22:52:50,843 DEV : loss 0.22712911665439606 - f1-score (micro avg) 0.843
2023-10-17 22:52:50,874 ----------------------------------------------------------------------------------------------------
2023-10-17 22:52:58,014 epoch 10 - iter 147/1476 - loss 0.00057728 - time (sec): 7.14 - samples/sec: 2074.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:53:05,235 epoch 10 - iter 294/1476 - loss 0.00458539 - time (sec): 14.36 - samples/sec: 2150.42 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:53:12,340 epoch 10 - iter 441/1476 - loss 0.00372741 - time (sec): 21.47 - samples/sec: 2177.11 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:53:19,847 epoch 10 - iter 588/1476 - loss 0.00604047 - time (sec): 28.97 - samples/sec: 2274.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:53:26,849 epoch 10 - iter 735/1476 - loss 0.00618119 - time (sec): 35.97 - samples/sec: 2261.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:53:34,080 epoch 10 - iter 882/1476 - loss 0.00787261 - time (sec): 43.21 - samples/sec: 2280.58 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:53:41,061 epoch 10 - iter 1029/1476 - loss 0.00728488 - time (sec): 50.19 - samples/sec: 2282.92 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:53:48,202 epoch 10 - iter 1176/1476 - loss 0.00691182 - time (sec): 57.33 - samples/sec: 2283.09 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:53:55,920 epoch 10 - iter 1323/1476 - loss 0.00682906 - time (sec): 65.04 - samples/sec: 2298.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 22:54:03,480 epoch 10 - iter 1470/1476 - loss 0.00648025 - time (sec): 72.61 - samples/sec: 2278.46 - lr: 0.000000 - momentum: 0.000000
2023-10-17 22:54:03,815 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:03,815 EPOCH 10 done: loss 0.0064 - lr: 0.000000
2023-10-17 22:54:15,411 DEV : loss 0.2371174395084381 - f1-score (micro avg) 0.8457
2023-10-17 22:54:15,851 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:15,852 Loading model from best epoch ...
2023-10-17 22:54:17,337 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
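A short usage sketch for the saved checkpoint: load best-model.pt from the base path above and predict the BIOES span labels listed in the tag dictionary (the French example sentence is purely illustrative).

# Hedged usage sketch: load the saved best model and predict NER spans.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Victor Hugo est né à Besançon en 1802 .")  # illustrative input
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 4))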
2023-10-17 22:54:23,597
Results:
- F-score (micro) 0.8158
- F-score (macro) 0.7248
- Accuracy 0.7083
By class:
              precision    recall  f1-score   support

         loc     0.8456    0.8939    0.8691       858
        pers     0.8044    0.8119    0.8082       537
         org     0.6609    0.5758    0.6154       132
        prod     0.7419    0.7541    0.7480        61
        time     0.5303    0.6481    0.5833        54

   micro avg     0.8038    0.8283    0.8158      1642
   macro avg     0.7166    0.7368    0.7248      1642
weighted avg     0.8031    0.8283    0.8149      1642
2023-10-17 22:54:23,598 ----------------------------------------------------------------------------------------------------
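The final report can be reproduced with Flair's evaluate call on the HIPE-2020 test split; a hedged sketch, assuming tagger (the loaded best model) and corpus from the earlier sketches are in scope.

# Hedged sketch: re-run the final test evaluation that produced the report above.
result = tagger.evaluate(corpus.test, gold_label_type="ner", mini_batch_size=4)
print(result.detailed_results)  # per-class precision/recall/F1, as in the table above
print(result.main_score)        # micro-avg F1, the model-selection metric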