2023-10-17 20:20:37,487 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,488 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:20:37,488 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,488 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 20:20:37,488 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,488 Train: 5901 sentences
2023-10-17 20:20:37,488 (train_with_dev=False, train_with_test=False)
2023-10-17 20:20:37,488 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,488 Training Params:
2023-10-17 20:20:37,488 - learning_rate: "5e-05"
2023-10-17 20:20:37,489 - mini_batch_size: "4"
2023-10-17 20:20:37,489 - max_epochs: "10"
2023-10-17 20:20:37,489 - shuffle: "True"
2023-10-17 20:20:37,489 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,489 Plugins:
2023-10-17 20:20:37,489 - TensorboardLogger
2023-10-17 20:20:37,489 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:20:37,489 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,489 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:20:37,489 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:20:37,489 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,489 Computation:
2023-10-17 20:20:37,489 - compute on device: cuda:0
2023-10-17 20:20:37,489 - embedding storage: none
2023-10-17 20:20:37,489 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,489 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 20:20:37,489 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,489 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:37,489 Logging anything other than scalars to TensorBoard is currently not supported.
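The configuration dumped above (ELECTRA-style encoder, first-subtoken pooling, last layer only, no CRF, lr 5e-05, batch size 4, 10 epochs, linear schedule with 10% warmup) can be approximated with a short Flair fine-tuning script. The following is a minimal sketch only, assuming Flair >= 0.12, that the base checkpoint is hmteams/teams-base-historic-multilingual-discriminator (inferred from the training base path), and that the HIPE-2020 French split is loaded via Flair's NER_HIPE_2022 loader; exact constructor arguments may differ between Flair versions.

```python
# Sketch of the training setup logged above. Hyperparameters are taken from the log;
# the model ID and dataset-loader arguments are assumptions, not guaranteed to match the original run.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2020 French corpus (5901 train / 1287 dev / 1505 test sentences in the log above)
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")

label_type = "ner"
label_dict = corpus.make_label_dictionary(label_type=label_type)

# Historic multilingual ELECTRA encoder, last layer only, first-subtoken pooling
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Linear classifier head on top of the transformer (no RNN, no CRF), as in the printed architecture
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type=label_type,
    use_rnn=False,
    use_crf=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear schedule and warmup (warmup_fraction 0.1 by default),
# matching the LinearScheduler plugin shown in the log
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
)
```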
"hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" 2023-10-17 20:20:37,489 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:20:37,489 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:20:37,489 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 20:20:45,045 epoch 1 - iter 147/1476 - loss 2.45075682 - time (sec): 7.55 - samples/sec: 2345.15 - lr: 0.000005 - momentum: 0.000000 2023-10-17 20:20:52,026 epoch 1 - iter 294/1476 - loss 1.54365509 - time (sec): 14.54 - samples/sec: 2280.56 - lr: 0.000010 - momentum: 0.000000 2023-10-17 20:20:59,562 epoch 1 - iter 441/1476 - loss 1.13854388 - time (sec): 22.07 - samples/sec: 2333.15 - lr: 0.000015 - momentum: 0.000000 2023-10-17 20:21:06,932 epoch 1 - iter 588/1476 - loss 0.91730690 - time (sec): 29.44 - samples/sec: 2358.65 - lr: 0.000020 - momentum: 0.000000 2023-10-17 20:21:14,098 epoch 1 - iter 735/1476 - loss 0.78832461 - time (sec): 36.61 - samples/sec: 2336.33 - lr: 0.000025 - momentum: 0.000000 2023-10-17 20:21:21,137 epoch 1 - iter 882/1476 - loss 0.70576861 - time (sec): 43.65 - samples/sec: 2298.06 - lr: 0.000030 - momentum: 0.000000 2023-10-17 20:21:28,234 epoch 1 - iter 1029/1476 - loss 0.64071654 - time (sec): 50.74 - samples/sec: 2286.04 - lr: 0.000035 - momentum: 0.000000 2023-10-17 20:21:35,283 epoch 1 - iter 1176/1476 - loss 0.58480153 - time (sec): 57.79 - samples/sec: 2286.47 - lr: 0.000040 - momentum: 0.000000 2023-10-17 20:21:42,496 epoch 1 - iter 1323/1476 - loss 0.54130591 - time (sec): 65.01 - samples/sec: 2285.85 - lr: 0.000045 - momentum: 0.000000 2023-10-17 20:21:49,635 epoch 1 - iter 1470/1476 - loss 0.50188471 - time (sec): 72.14 - samples/sec: 2299.29 - lr: 0.000050 - momentum: 0.000000 2023-10-17 20:21:49,886 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:21:49,886 EPOCH 1 done: loss 0.5008 - lr: 0.000050 2023-10-17 20:21:56,210 DEV : loss 0.12269438803195953 - f1-score (micro avg) 0.7543 2023-10-17 20:21:56,243 saving best model 2023-10-17 20:21:56,600 ---------------------------------------------------------------------------------------------------- 2023-10-17 20:22:03,964 epoch 2 - iter 147/1476 - loss 0.15141280 - time (sec): 7.36 - samples/sec: 2270.51 - lr: 0.000049 - momentum: 0.000000 2023-10-17 20:22:11,323 epoch 2 - iter 294/1476 - loss 0.15112657 - time (sec): 14.72 - samples/sec: 2373.19 - lr: 0.000049 - momentum: 0.000000 2023-10-17 20:22:18,876 epoch 2 - iter 441/1476 - loss 0.14768430 - time (sec): 22.27 - samples/sec: 2351.50 - lr: 0.000048 - momentum: 0.000000 2023-10-17 20:22:25,980 epoch 2 - iter 588/1476 - loss 0.14500357 - time (sec): 29.38 - samples/sec: 2312.57 - lr: 0.000048 - momentum: 0.000000 2023-10-17 20:22:33,091 epoch 2 - iter 735/1476 - loss 0.14249498 - time (sec): 36.49 - samples/sec: 2254.48 - lr: 0.000047 - momentum: 0.000000 2023-10-17 20:22:40,077 epoch 2 - iter 882/1476 - loss 0.14525703 - time (sec): 43.48 - samples/sec: 2246.59 - lr: 0.000047 - momentum: 0.000000 2023-10-17 20:22:47,573 epoch 2 - iter 1029/1476 - loss 0.14238917 - time (sec): 50.97 - samples/sec: 2239.06 - lr: 0.000046 - momentum: 0.000000 2023-10-17 20:22:54,845 epoch 2 - iter 1176/1476 - loss 0.14182713 - time (sec): 58.24 - samples/sec: 2239.66 - lr: 0.000046 - momentum: 0.000000 2023-10-17 
2023-10-17 20:23:02,268 epoch 2 - iter 1323/1476 - loss 0.14190065 - time (sec): 65.67 - samples/sec: 2261.46 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:23:09,457 epoch 2 - iter 1470/1476 - loss 0.14112657 - time (sec): 72.86 - samples/sec: 2276.21 - lr: 0.000044 - momentum: 0.000000
2023-10-17 20:23:09,710 ----------------------------------------------------------------------------------------------------
2023-10-17 20:23:09,710 EPOCH 2 done: loss 0.1411 - lr: 0.000044
2023-10-17 20:23:21,174 DEV : loss 0.16049356758594513 - f1-score (micro avg) 0.7019
2023-10-17 20:23:21,207 ----------------------------------------------------------------------------------------------------
2023-10-17 20:23:28,726 epoch 3 - iter 147/1476 - loss 0.08022691 - time (sec): 7.52 - samples/sec: 2350.39 - lr: 0.000044 - momentum: 0.000000
2023-10-17 20:23:36,021 epoch 3 - iter 294/1476 - loss 0.08406285 - time (sec): 14.81 - samples/sec: 2373.14 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:23:43,025 epoch 3 - iter 441/1476 - loss 0.08796088 - time (sec): 21.82 - samples/sec: 2379.86 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:23:50,369 epoch 3 - iter 588/1476 - loss 0.09119106 - time (sec): 29.16 - samples/sec: 2330.63 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:23:57,561 epoch 3 - iter 735/1476 - loss 0.09523907 - time (sec): 36.35 - samples/sec: 2322.17 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:24:04,678 epoch 3 - iter 882/1476 - loss 0.09396738 - time (sec): 43.47 - samples/sec: 2293.44 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:24:12,835 epoch 3 - iter 1029/1476 - loss 0.09373264 - time (sec): 51.63 - samples/sec: 2278.70 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:24:20,129 epoch 3 - iter 1176/1476 - loss 0.09469625 - time (sec): 58.92 - samples/sec: 2270.69 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:24:27,320 epoch 3 - iter 1323/1476 - loss 0.09368396 - time (sec): 66.11 - samples/sec: 2267.77 - lr: 0.000039 - momentum: 0.000000
2023-10-17 20:24:34,665 epoch 3 - iter 1470/1476 - loss 0.09312445 - time (sec): 73.46 - samples/sec: 2259.50 - lr: 0.000039 - momentum: 0.000000
2023-10-17 20:24:34,937 ----------------------------------------------------------------------------------------------------
2023-10-17 20:24:34,938 EPOCH 3 done: loss 0.0932 - lr: 0.000039
2023-10-17 20:24:46,497 DEV : loss 0.13214744627475739 - f1-score (micro avg) 0.7977
2023-10-17 20:24:46,530 saving best model
2023-10-17 20:24:47,001 ----------------------------------------------------------------------------------------------------
2023-10-17 20:24:54,116 epoch 4 - iter 147/1476 - loss 0.07305392 - time (sec): 7.11 - samples/sec: 2239.81 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:25:01,367 epoch 4 - iter 294/1476 - loss 0.06289770 - time (sec): 14.36 - samples/sec: 2348.53 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:25:08,480 epoch 4 - iter 441/1476 - loss 0.06776105 - time (sec): 21.48 - samples/sec: 2284.26 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:25:16,121 epoch 4 - iter 588/1476 - loss 0.06873963 - time (sec): 29.12 - samples/sec: 2252.25 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:25:23,375 epoch 4 - iter 735/1476 - loss 0.07185754 - time (sec): 36.37 - samples/sec: 2207.32 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:25:30,807 epoch 4 - iter 882/1476 - loss 0.06900936 - time (sec): 43.80 - samples/sec: 2216.41 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:25:37,796 epoch 4 - iter 1029/1476 - loss 0.06636325 - time (sec): 50.79 - samples/sec: 2221.93 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:25:45,174 epoch 4 - iter 1176/1476 - loss 0.06586245 - time (sec): 58.17 - samples/sec: 2252.10 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:25:52,240 epoch 4 - iter 1323/1476 - loss 0.06697787 - time (sec): 65.24 - samples/sec: 2255.01 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:25:59,983 epoch 4 - iter 1470/1476 - loss 0.06808951 - time (sec): 72.98 - samples/sec: 2271.08 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:26:00,291 ----------------------------------------------------------------------------------------------------
2023-10-17 20:26:00,292 EPOCH 4 done: loss 0.0681 - lr: 0.000033
2023-10-17 20:26:11,758 DEV : loss 0.1754840463399887 - f1-score (micro avg) 0.827
2023-10-17 20:26:11,791 saving best model
2023-10-17 20:26:12,273 ----------------------------------------------------------------------------------------------------
2023-10-17 20:26:19,589 epoch 5 - iter 147/1476 - loss 0.04825016 - time (sec): 7.31 - samples/sec: 2436.31 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:26:26,872 epoch 5 - iter 294/1476 - loss 0.04930976 - time (sec): 14.59 - samples/sec: 2309.14 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:26:34,166 epoch 5 - iter 441/1476 - loss 0.04770019 - time (sec): 21.89 - samples/sec: 2307.01 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:26:41,138 epoch 5 - iter 588/1476 - loss 0.04688871 - time (sec): 28.86 - samples/sec: 2317.40 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:26:48,383 epoch 5 - iter 735/1476 - loss 0.04601401 - time (sec): 36.10 - samples/sec: 2323.11 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:26:55,592 epoch 5 - iter 882/1476 - loss 0.04384687 - time (sec): 43.31 - samples/sec: 2318.19 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:27:03,155 epoch 5 - iter 1029/1476 - loss 0.04285307 - time (sec): 50.87 - samples/sec: 2285.79 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:27:10,270 epoch 5 - iter 1176/1476 - loss 0.04658069 - time (sec): 57.99 - samples/sec: 2269.90 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:27:17,743 epoch 5 - iter 1323/1476 - loss 0.04712647 - time (sec): 65.46 - samples/sec: 2293.93 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:27:25,424 epoch 5 - iter 1470/1476 - loss 0.04713805 - time (sec): 73.14 - samples/sec: 2268.84 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:27:25,686 ----------------------------------------------------------------------------------------------------
2023-10-17 20:27:25,686 EPOCH 5 done: loss 0.0476 - lr: 0.000028
2023-10-17 20:27:37,323 DEV : loss 0.21042108535766602 - f1-score (micro avg) 0.8307
2023-10-17 20:27:37,354 saving best model
2023-10-17 20:27:37,817 ----------------------------------------------------------------------------------------------------
2023-10-17 20:27:44,810 epoch 6 - iter 147/1476 - loss 0.03453912 - time (sec): 6.99 - samples/sec: 2265.05 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:27:51,917 epoch 6 - iter 294/1476 - loss 0.03647166 - time (sec): 14.10 - samples/sec: 2343.34 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:27:58,988 epoch 6 - iter 441/1476 - loss 0.03049749 - time (sec): 21.17 - samples/sec: 2351.10 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:28:06,214 epoch 6 - iter 588/1476 - loss 0.03256111 - time (sec): 28.39 - samples/sec: 2317.20 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:28:13,245 epoch 6 - iter 735/1476 - loss 0.03378667 - time (sec): 35.42 - samples/sec: 2312.00 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:28:20,160 epoch 6 - iter 882/1476 - loss 0.03314087 - time (sec): 42.34 - samples/sec: 2304.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:28:27,651 epoch 6 - iter 1029/1476 - loss 0.03089634 - time (sec): 49.83 - samples/sec: 2307.34 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:28:34,770 epoch 6 - iter 1176/1476 - loss 0.03003096 - time (sec): 56.95 - samples/sec: 2309.76 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:28:41,901 epoch 6 - iter 1323/1476 - loss 0.03187601 - time (sec): 64.08 - samples/sec: 2305.63 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:28:49,320 epoch 6 - iter 1470/1476 - loss 0.03330083 - time (sec): 71.50 - samples/sec: 2319.18 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:28:49,614 ----------------------------------------------------------------------------------------------------
2023-10-17 20:28:49,614 EPOCH 6 done: loss 0.0332 - lr: 0.000022
2023-10-17 20:29:01,261 DEV : loss 0.18915601074695587 - f1-score (micro avg) 0.8285
2023-10-17 20:29:01,298 ----------------------------------------------------------------------------------------------------
2023-10-17 20:29:08,867 epoch 7 - iter 147/1476 - loss 0.01653898 - time (sec): 7.57 - samples/sec: 2231.80 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:29:17,104 epoch 7 - iter 294/1476 - loss 0.02599015 - time (sec): 15.80 - samples/sec: 2102.79 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:29:24,982 epoch 7 - iter 441/1476 - loss 0.02861579 - time (sec): 23.68 - samples/sec: 2159.44 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:29:32,237 epoch 7 - iter 588/1476 - loss 0.02528070 - time (sec): 30.94 - samples/sec: 2219.64 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:29:39,342 epoch 7 - iter 735/1476 - loss 0.02570093 - time (sec): 38.04 - samples/sec: 2218.06 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:29:46,643 epoch 7 - iter 882/1476 - loss 0.02442225 - time (sec): 45.34 - samples/sec: 2231.31 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:29:53,635 epoch 7 - iter 1029/1476 - loss 0.02453850 - time (sec): 52.34 - samples/sec: 2235.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:30:01,791 epoch 7 - iter 1176/1476 - loss 0.02386131 - time (sec): 60.49 - samples/sec: 2217.70 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:30:09,880 epoch 7 - iter 1323/1476 - loss 0.02311452 - time (sec): 68.58 - samples/sec: 2204.29 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:30:17,172 epoch 7 - iter 1470/1476 - loss 0.02213326 - time (sec): 75.87 - samples/sec: 2186.09 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:30:17,451 ----------------------------------------------------------------------------------------------------
2023-10-17 20:30:17,451 EPOCH 7 done: loss 0.0222 - lr: 0.000017
2023-10-17 20:30:29,432 DEV : loss 0.22237005829811096 - f1-score (micro avg) 0.8375
2023-10-17 20:30:29,467 saving best model
2023-10-17 20:30:29,942 ----------------------------------------------------------------------------------------------------
2023-10-17 20:30:37,188 epoch 8 - iter 147/1476 - loss 0.00842618 - time (sec): 7.24 - samples/sec: 2253.24 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:30:44,981 epoch 8 - iter 294/1476 - loss 0.00786122 - time (sec): 15.04 - samples/sec: 2269.72 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:30:52,418 epoch 8 - iter 441/1476 - loss 0.01083371 - time (sec): 22.47 - samples/sec: 2214.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:30:59,408 epoch 8 - iter 588/1476 - loss 0.01158928 - time (sec): 29.46 - samples/sec: 2230.67 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:31:06,510 epoch 8 - iter 735/1476 - loss 0.01381677 - time (sec): 36.57 - samples/sec: 2262.93 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:31:13,291 epoch 8 - iter 882/1476 - loss 0.01326944 - time (sec): 43.35 - samples/sec: 2266.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:31:21,217 epoch 8 - iter 1029/1476 - loss 0.01584565 - time (sec): 51.27 - samples/sec: 2295.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:31:28,174 epoch 8 - iter 1176/1476 - loss 0.01500981 - time (sec): 58.23 - samples/sec: 2292.30 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:31:35,385 epoch 8 - iter 1323/1476 - loss 0.01457328 - time (sec): 65.44 - samples/sec: 2296.33 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:31:42,554 epoch 8 - iter 1470/1476 - loss 0.01473348 - time (sec): 72.61 - samples/sec: 2278.84 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:31:42,937 ----------------------------------------------------------------------------------------------------
2023-10-17 20:31:42,937 EPOCH 8 done: loss 0.0147 - lr: 0.000011
2023-10-17 20:31:55,009 DEV : loss 0.2160811871290207 - f1-score (micro avg) 0.839
2023-10-17 20:31:55,060 saving best model
2023-10-17 20:31:55,642 ----------------------------------------------------------------------------------------------------
2023-10-17 20:32:04,050 epoch 9 - iter 147/1476 - loss 0.00655270 - time (sec): 8.41 - samples/sec: 2143.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:32:12,106 epoch 9 - iter 294/1476 - loss 0.00851486 - time (sec): 16.46 - samples/sec: 2211.15 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:32:20,382 epoch 9 - iter 441/1476 - loss 0.00841270 - time (sec): 24.74 - samples/sec: 2210.55 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:32:28,491 epoch 9 - iter 588/1476 - loss 0.00821208 - time (sec): 32.85 - samples/sec: 2128.04 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:32:36,654 epoch 9 - iter 735/1476 - loss 0.00783197 - time (sec): 41.01 - samples/sec: 2096.88 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:32:44,502 epoch 9 - iter 882/1476 - loss 0.00844606 - time (sec): 48.86 - samples/sec: 2093.27 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:32:52,126 epoch 9 - iter 1029/1476 - loss 0.00913720 - time (sec): 56.48 - samples/sec: 2077.93 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:32:59,984 epoch 9 - iter 1176/1476 - loss 0.00836341 - time (sec): 64.34 - samples/sec: 2065.61 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:33:08,375 epoch 9 - iter 1323/1476 - loss 0.00807285 - time (sec): 72.73 - samples/sec: 2079.22 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:33:15,939 epoch 9 - iter 1470/1476 - loss 0.00815447 - time (sec): 80.30 - samples/sec: 2063.52 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:33:16,264 ----------------------------------------------------------------------------------------------------
2023-10-17 20:33:16,264 EPOCH 9 done: loss 0.0081 - lr: 0.000006
2023-10-17 20:33:28,004 DEV : loss 0.22555013000965118 - f1-score (micro avg) 0.8369
2023-10-17 20:33:28,043 ----------------------------------------------------------------------------------------------------
2023-10-17 20:33:36,574 epoch 10 - iter 147/1476 - loss 0.00592010 - time (sec): 8.53 - samples/sec: 2314.54 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:33:45,359 epoch 10 - iter 294/1476 - loss 0.00415630 - time (sec): 17.31 - samples/sec: 2137.17 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:33:53,166 epoch 10 - iter 441/1476 - loss 0.00430526 - time (sec): 25.12 - samples/sec: 2120.30 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:34:00,904 epoch 10 - iter 588/1476 - loss 0.00403158 - time (sec): 32.86 - samples/sec: 2070.07 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:34:08,706 epoch 10 - iter 735/1476 - loss 0.00450434 - time (sec): 40.66 - samples/sec: 2059.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:34:16,327 epoch 10 - iter 882/1476 - loss 0.00429572 - time (sec): 48.28 - samples/sec: 2066.70 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:34:24,451 epoch 10 - iter 1029/1476 - loss 0.00479129 - time (sec): 56.41 - samples/sec: 2075.91 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:34:32,203 epoch 10 - iter 1176/1476 - loss 0.00542076 - time (sec): 64.16 - samples/sec: 2065.92 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:34:39,990 epoch 10 - iter 1323/1476 - loss 0.00558378 - time (sec): 71.95 - samples/sec: 2061.86 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:34:47,979 epoch 10 - iter 1470/1476 - loss 0.00606797 - time (sec): 79.93 - samples/sec: 2075.49 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:34:48,280 ----------------------------------------------------------------------------------------------------
2023-10-17 20:34:48,280 EPOCH 10 done: loss 0.0061 - lr: 0.000000
2023-10-17 20:34:59,857 DEV : loss 0.2355625480413437 - f1-score (micro avg) 0.8344
2023-10-17 20:35:00,410 ----------------------------------------------------------------------------------------------------
2023-10-17 20:35:00,411 Loading model from best epoch ...
2023-10-17 20:35:02,129 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 20:35:08,299 Results:
- F-score (micro) 0.8028
- F-score (macro) 0.7089
- Accuracy 0.691

By class:
              precision    recall  f1-score   support

         loc     0.8573    0.8683    0.8628       858
        pers     0.7960    0.8063    0.8011       537
         org     0.5260    0.6136    0.5664       132
        prod     0.7167    0.7049    0.7107        61
        time     0.5645    0.6481    0.6034        54

   micro avg     0.7916    0.8143    0.8028      1642
   macro avg     0.6921    0.7283    0.7089      1642
weighted avg     0.7958    0.8143    0.8046      1642

2023-10-17 20:35:08,299 ----------------------------------------------------------------------------------------------------
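The best checkpoint (best-model.pt, dev micro-F1 0.839 at epoch 8) can be loaded back into Flair for tagging with the standard SequenceTagger.load / predict API. A minimal sketch follows; the checkpoint path is assumed to live under the training base path shown above, and the French example sentence is hypothetical.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written at the "saving best model" steps above (path assumed from the log's base path)
tagger = SequenceTagger.load(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

# Hypothetical French example; the tagger predicts BIOES spans over the loc/pers/org/prod/time labels listed above
sentence = Sentence("Victor Hugo est né à Besançon en 1802.")
tagger.predict(sentence)

# Print the recognized entity spans with their labels and confidence scores
for entity in sentence.get_spans("ner"):
    print(entity)
```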