2023-10-17 10:43:23,425 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,426 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:43:23,426 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,426 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:43:23,426 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,426 Train: 966 sentences
2023-10-17 10:43:23,426 (train_with_dev=False, train_with_test=False)
2023-10-17 10:43:23,426 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,426 Training Params:
2023-10-17 10:43:23,426  - learning_rate: "3e-05"
2023-10-17 10:43:23,426  - mini_batch_size: "4"
2023-10-17 10:43:23,426  - max_epochs: "10"
2023-10-17 10:43:23,426  - shuffle: "True"
2023-10-17 10:43:23,426 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,426 Plugins:
2023-10-17 10:43:23,426  - TensorboardLogger
2023-10-17 10:43:23,426  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:43:23,426 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,426 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:43:23,426  - metric: "('micro avg', 'f1-score')"
2023-10-17 10:43:23,426 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,426 Computation:
2023-10-17 10:43:23,426  - compute on device: cuda:0
2023-10-17 10:43:23,426  - embedding storage: none
2023-10-17 10:43:23,427 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,427 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 10:43:23,427 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,427 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:23,427 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:43:24,504 epoch 1 - iter 24/242 - loss 4.50633463 - time (sec): 1.08 - samples/sec: 2245.92 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:43:25,651 epoch 1 - iter 48/242 - loss 3.82310468 - time (sec): 2.22 - samples/sec: 2295.52 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:43:26,730 epoch 1 - iter 72/242 - loss 3.02011488 - time (sec): 3.30 - samples/sec: 2301.42 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:43:27,826 epoch 1 - iter 96/242 - loss 2.43709632 - time (sec): 4.40 - samples/sec: 2291.30 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:43:28,939 epoch 1 - iter 120/242 - loss 2.00850212 - time (sec): 5.51 - samples/sec: 2293.59 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:43:30,047 epoch 1 - iter 144/242 - loss 1.75672900 - time (sec): 6.62 - samples/sec: 2273.18 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:43:31,137 epoch 1 - iter 168/242 - loss 1.57371072 - time (sec): 7.71 - samples/sec: 2254.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:43:32,228 epoch 1 - iter 192/242 - loss 1.42447660 - time (sec): 8.80 - samples/sec: 2249.37 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:43:33,324 epoch 1 - iter 216/242 - loss 1.30854786 - time (sec): 9.90 - samples/sec: 2233.32 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:43:34,442 epoch 1 - iter 240/242 - loss 1.20350225 - time (sec): 11.01 - samples/sec: 2232.46 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:43:34,538 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:34,539 EPOCH 1 done: loss 1.1965 - lr: 0.000030
2023-10-17 10:43:35,140 DEV : loss 0.20349003374576569 - f1-score (micro avg) 0.5952
2023-10-17 10:43:35,145 saving best model
2023-10-17 10:43:35,589 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:36,737 epoch 2 - iter 24/242 - loss 0.23724173 - time (sec): 1.14 - samples/sec: 2280.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:43:37,830 epoch 2 - iter 48/242 - loss 0.24614221 - time (sec): 2.24 - samples/sec: 2212.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:43:38,930 epoch 2 - iter 72/242 - loss 0.22188577 - time (sec): 3.34 - samples/sec: 2234.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:43:40,028 epoch 2 - iter 96/242 - loss 0.20664918 - time (sec): 4.44 - samples/sec: 2205.45 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:43:41,124 epoch 2 - iter 120/242 - loss 0.20358552 - time (sec): 5.53 - samples/sec: 2171.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:43:42,221 epoch 2 - iter 144/242 - loss 0.20094148 - time (sec): 6.63 - samples/sec: 2194.89 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:43:43,305 epoch 2 - iter 168/242 - loss 0.19473548 - time (sec): 7.71 - samples/sec: 2179.83 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:43:44,416 epoch 2 - iter 192/242 - loss 0.18447914 - time (sec): 8.82 - samples/sec: 2231.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:43:45,511 epoch 2 - iter 216/242 - loss 0.17895642 - time (sec): 9.92 - samples/sec: 2224.42 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:43:46,628 epoch 2 - iter 240/242 - loss 0.17710668 - time (sec): 11.04 - samples/sec: 2226.50 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:43:46,715 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:46,716 EPOCH 2 done: loss 0.1767 - lr: 0.000027
2023-10-17 10:43:47,643 DEV : loss 0.14694465696811676 - f1-score (micro avg) 0.7711
2023-10-17 10:43:47,648 saving best model
2023-10-17 10:43:48,226 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:49,323 epoch 3 - iter 24/242 - loss 0.15723834 - time (sec): 1.10 - samples/sec: 2158.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:43:50,455 epoch 3 - iter 48/242 - loss 0.11376720 - time (sec): 2.23 - samples/sec: 2196.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:43:51,571 epoch 3 - iter 72/242 - loss 0.09861329 - time (sec): 3.34 - samples/sec: 2154.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:43:52,673 epoch 3 - iter 96/242 - loss 0.09545921 - time (sec): 4.45 - samples/sec: 2176.77 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:43:53,769 epoch 3 - iter 120/242 - loss 0.10069060 - time (sec): 5.54 - samples/sec: 2160.74 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:43:54,864 epoch 3 - iter 144/242 - loss 0.10195434 - time (sec): 6.64 - samples/sec: 2201.25 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:43:55,988 epoch 3 - iter 168/242 - loss 0.10207902 - time (sec): 7.76 - samples/sec: 2216.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:43:57,103 epoch 3 - iter 192/242 - loss 0.10383210 - time (sec): 8.88 - samples/sec: 2208.96 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:43:58,208 epoch 3 - iter 216/242 - loss 0.10017263 - time (sec): 9.98 - samples/sec: 2200.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:43:59,315 epoch 3 - iter 240/242 - loss 0.09984191 - time (sec): 11.09 - samples/sec: 2219.09 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:43:59,417 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:59,418 EPOCH 3 done: loss 0.0993 - lr: 0.000023
2023-10-17 10:44:00,199 DEV : loss 0.1559617966413498 - f1-score (micro avg) 0.808
2023-10-17 10:44:00,205 saving best model
2023-10-17 10:44:00,702 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:01,847 epoch 4 - iter 24/242 - loss 0.06933295 - time (sec): 1.14 - samples/sec: 2093.52 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:44:02,996 epoch 4 - iter 48/242 - loss 0.05508186 - time (sec): 2.29 - samples/sec: 2189.39 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:44:04,091 epoch 4 - iter 72/242 - loss 0.06080892 - time (sec): 3.39 - samples/sec: 2199.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:44:05,201 epoch 4 - iter 96/242 - loss 0.07338365 - time (sec): 4.50 - samples/sec: 2215.13 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:44:06,288 epoch 4 - iter 120/242 - loss 0.07371388 - time (sec): 5.58 - samples/sec: 2198.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:44:07,403 epoch 4 - iter 144/242 - loss 0.07076303 - time (sec): 6.70 - samples/sec: 2226.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:44:08,493 epoch 4 - iter 168/242 - loss 0.07489325 - time (sec): 7.79 - samples/sec: 2214.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:44:09,626 epoch 4 - iter 192/242 - loss 0.07280727 - time (sec): 8.92 - samples/sec: 2225.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:44:10,720 epoch 4 - iter 216/242 - loss 0.07819685 - time (sec): 10.02 - samples/sec: 2208.60 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:44:11,826 epoch 4 - iter 240/242 - loss 0.07332647 - time (sec): 11.12 - samples/sec: 2200.38 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:44:11,935 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:11,935 EPOCH 4 done: loss 0.0733 - lr: 0.000020
2023-10-17 10:44:12,722 DEV : loss 0.19135209918022156 - f1-score (micro avg) 0.8256
2023-10-17 10:44:12,728 saving best model
2023-10-17 10:44:13,211 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:14,356 epoch 5 - iter 24/242 - loss 0.08121044 - time (sec): 1.14 - samples/sec: 2319.35 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:44:15,513 epoch 5 - iter 48/242 - loss 0.06790236 - time (sec): 2.30 - samples/sec: 2213.89 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:44:16,650 epoch 5 - iter 72/242 - loss 0.06190861 - time (sec): 3.44 - samples/sec: 2182.35 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:44:17,799 epoch 5 - iter 96/242 - loss 0.06890864 - time (sec): 4.58 - samples/sec: 2199.39 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:44:18,927 epoch 5 - iter 120/242 - loss 0.06897976 - time (sec): 5.71 - samples/sec: 2178.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:44:20,031 epoch 5 - iter 144/242 - loss 0.06540872 - time (sec): 6.82 - samples/sec: 2194.58 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:44:21,139 epoch 5 - iter 168/242 - loss 0.06264734 - time (sec): 7.92 - samples/sec: 2210.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:44:22,255 epoch 5 - iter 192/242 - loss 0.06071230 - time (sec): 9.04 - samples/sec: 2189.02 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:44:23,381 epoch 5 - iter 216/242 - loss 0.05963512 - time (sec): 10.17 - samples/sec: 2183.75 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:44:24,528 epoch 5 - iter 240/242 - loss 0.05793629 - time (sec): 11.31 - samples/sec: 2167.55 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:44:24,627 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:24,627 EPOCH 5 done: loss 0.0582 - lr: 0.000017
2023-10-17 10:44:25,430 DEV : loss 0.1789170801639557 - f1-score (micro avg) 0.8348
2023-10-17 10:44:25,436 saving best model
2023-10-17 10:44:25,926 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:27,075 epoch 6 - iter 24/242 - loss 0.04985398 - time (sec): 1.15 - samples/sec: 2147.85 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:44:28,198 epoch 6 - iter 48/242 - loss 0.03267616 - time (sec): 2.27 - samples/sec: 2109.00 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:44:29,370 epoch 6 - iter 72/242 - loss 0.03996354 - time (sec): 3.44 - samples/sec: 2086.36 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:44:30,560 epoch 6 - iter 96/242 - loss 0.03958975 - time (sec): 4.63 - samples/sec: 2109.87 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:44:31,670 epoch 6 - iter 120/242 - loss 0.03543666 - time (sec): 5.74 - samples/sec: 2136.44 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:44:32,780 epoch 6 - iter 144/242 - loss 0.03336359 - time (sec): 6.85 - samples/sec: 2177.54 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:44:33,898 epoch 6 - iter 168/242 - loss 0.03902790 - time (sec): 7.97 - samples/sec: 2188.57 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:44:34,994 epoch 6 - iter 192/242 - loss 0.03982655 - time (sec): 9.07 - samples/sec: 2176.95 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:44:36,101 epoch 6 - iter 216/242 - loss 0.03921389 - time (sec): 10.17 - samples/sec: 2177.46 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:44:37,228 epoch 6 - iter 240/242 - loss 0.04014048 - time (sec): 11.30 - samples/sec: 2180.87 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:44:37,313 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:37,313 EPOCH 6 done: loss 0.0400 - lr: 0.000013
2023-10-17 10:44:38,136 DEV : loss 0.20801085233688354 - f1-score (micro avg) 0.8342
2023-10-17 10:44:38,143 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:39,307 epoch 7 - iter 24/242 - loss 0.04453521 - time (sec): 1.16 - samples/sec: 2337.69 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:44:40,400 epoch 7 - iter 48/242 - loss 0.03762355 - time (sec): 2.26 - samples/sec: 2123.43 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:44:41,491 epoch 7 - iter 72/242 - loss 0.03749958 - time (sec): 3.35 - samples/sec: 2164.38 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:44:42,687 epoch 7 - iter 96/242 - loss 0.03188376 - time (sec): 4.54 - samples/sec: 2117.00 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:44:43,804 epoch 7 - iter 120/242 - loss 0.03191117 - time (sec): 5.66 - samples/sec: 2155.71 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:44:44,972 epoch 7 - iter 144/242 - loss 0.03400510 - time (sec): 6.83 - samples/sec: 2200.24 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:44:46,096 epoch 7 - iter 168/242 - loss 0.03160299 - time (sec): 7.95 - samples/sec: 2167.96 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:44:47,235 epoch 7 - iter 192/242 - loss 0.03142760 - time (sec): 9.09 - samples/sec: 2191.91 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:44:48,362 epoch 7 - iter 216/242 - loss 0.03004264 - time (sec): 10.22 - samples/sec: 2167.76 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:44:49,462 epoch 7 - iter 240/242 - loss 0.03077950 - time (sec): 11.32 - samples/sec: 2170.15 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:44:49,553 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:49,553 EPOCH 7 done: loss 0.0306 - lr: 0.000010
2023-10-17 10:44:50,380 DEV : loss 0.21018549799919128 - f1-score (micro avg) 0.8596
2023-10-17 10:44:50,388 saving best model
2023-10-17 10:44:50,906 ----------------------------------------------------------------------------------------------------
2023-10-17 10:44:52,246 epoch 8 - iter 24/242 - loss 0.03256163 - time (sec): 1.34 - samples/sec: 1716.83 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:44:53,602 epoch 8 - iter 48/242 - loss 0.02384796 - time (sec): 2.69 - samples/sec: 1865.43 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:44:54,863 epoch 8 - iter 72/242 - loss 0.02107325 - time (sec): 3.95 - samples/sec: 1862.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:44:56,067 epoch 8 - iter 96/242 - loss 0.02937280 - time (sec): 5.16 - samples/sec: 1936.11 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:44:57,271 epoch 8 - iter 120/242 - loss 0.02524872 - time (sec): 6.36 - samples/sec: 1942.50 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:44:58,417 epoch 8 - iter 144/242 - loss 0.02384867 - time (sec): 7.51 - samples/sec: 1984.35 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:44:59,497 epoch 8 - iter 168/242 - loss 0.02335409 - time (sec): 8.59 - samples/sec: 2032.03 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:45:00,568 epoch 8 - iter 192/242 - loss 0.02365502 - time (sec): 9.66 - samples/sec: 2043.22 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:45:01,658 epoch 8 - iter 216/242 - loss 0.02187485 - time (sec): 10.75 - samples/sec: 2061.17 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:45:02,795 epoch 8 - iter 240/242 - loss 0.02079913 - time (sec): 11.89 - samples/sec: 2072.08 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:45:02,888 ----------------------------------------------------------------------------------------------------
2023-10-17 10:45:02,888 EPOCH 8 done: loss 0.0207 - lr: 0.000007
2023-10-17 10:45:03,719 DEV : loss 0.23265871405601501 - f1-score (micro avg) 0.8421
2023-10-17 10:45:03,725 ----------------------------------------------------------------------------------------------------
2023-10-17 10:45:04,829 epoch 9 - iter 24/242 - loss 0.00814737 - time (sec): 1.10 - samples/sec: 2295.02 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:45:05,961 epoch 9 - iter 48/242 - loss 0.01322277 - time (sec): 2.24 - samples/sec: 2286.61 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:45:07,144 epoch 9 - iter 72/242 - loss 0.01888777 - time (sec): 3.42 - samples/sec: 2221.38 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:45:08,312 epoch 9 - iter 96/242 - loss 0.01857381 - time (sec): 4.59 - samples/sec: 2157.28 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:45:09,457 epoch 9 - iter 120/242 - loss 0.02081802 - time (sec): 5.73 - samples/sec: 2137.30 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:45:10,609 epoch 9 - iter 144/242 - loss 0.01826304 - time (sec): 6.88 - samples/sec: 2095.25 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:45:11,768 epoch 9 - iter 168/242 - loss 0.01783901 - time (sec): 8.04 - samples/sec: 2117.99 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:45:12,874 epoch 9 - iter 192/242 - loss 0.01661624 - time (sec): 9.15 - samples/sec: 2137.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:45:14,034 epoch 9 - iter 216/242 - loss 0.01632579 - time (sec): 10.31 - samples/sec: 2154.43 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:45:15,123 epoch 9 - iter 240/242 - loss 0.01577096 - time (sec): 11.40 - samples/sec: 2154.03 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:45:15,213 ----------------------------------------------------------------------------------------------------
2023-10-17 10:45:15,213 EPOCH 9 done: loss 0.0157 - lr: 0.000003
2023-10-17 10:45:16,000 DEV : loss 0.24005292356014252 - f1-score (micro avg) 0.8385
2023-10-17 10:45:16,006 ----------------------------------------------------------------------------------------------------
2023-10-17 10:45:17,158 epoch 10 - iter 24/242 - loss 0.00587990 - time (sec): 1.15 - samples/sec: 2034.35 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:45:18,345 epoch 10 - iter 48/242 - loss 0.01906675 - time (sec): 2.34 - samples/sec: 2077.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:45:19,515 epoch 10 - iter 72/242 - loss 0.01409409 - time (sec): 3.51 - samples/sec: 2049.65 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:45:20,732 epoch 10 - iter 96/242 - loss 0.01134298 - time (sec): 4.72 - samples/sec: 2092.42 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:45:21,907 epoch 10 - iter 120/242 - loss 0.01067227 - time (sec): 5.90 - samples/sec: 2136.29 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:45:22,994 epoch 10 - iter 144/242 - loss 0.01308109 - time (sec): 6.99 - samples/sec: 2138.42 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:45:24,108 epoch 10 - iter 168/242 - loss 0.01406185 - time (sec): 8.10 - samples/sec: 2151.51 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:45:25,212 epoch 10 - iter 192/242 - loss 0.01385244 - time (sec): 9.20 - samples/sec: 2172.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:45:26,320 epoch 10 - iter 216/242 - loss 0.01384161 - time (sec): 10.31 - samples/sec: 2161.57 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:45:27,431 epoch 10 - iter 240/242 - loss 0.01381030 - time (sec): 11.42 - samples/sec: 2157.96 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:45:27,516 ----------------------------------------------------------------------------------------------------
2023-10-17 10:45:27,516 EPOCH 10 done: loss 0.0138 - lr: 0.000000
2023-10-17 10:45:28,314 DEV : loss 0.24683794379234314 - f1-score (micro avg) 0.8315
2023-10-17 10:45:28,746 ----------------------------------------------------------------------------------------------------
2023-10-17 10:45:28,748 Loading model from best epoch ...
2023-10-17 10:45:30,261 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 10:45:31,127 Results:
- F-score (micro) 0.8168
- F-score (macro) 0.5547
- Accuracy 0.7066

By class:
              precision    recall  f1-score   support

        pers     0.8803    0.8993    0.8897       139
       scope     0.8175    0.8682    0.8421       129
        work     0.6526    0.7750    0.7086        80
         loc     0.6667    0.2222    0.3333         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7984    0.8361    0.8168       360
   macro avg     0.6034    0.5529    0.5547       360
weighted avg     0.7945    0.8361    0.8111       360

2023-10-17 10:45:31,127 ----------------------------------------------------------------------------------------------------
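The summary rows of the "By class" table above can be cross-checked from the per-class rows. The following is a minimal sketch, not part of the logged run: the numbers are copied from the table, while the names `ROWS`, `f1`, `macro`, and `weighted` are hypothetical helpers introduced only for this illustration. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted class mean, weighted as the support-weighted mean).

```python
# Per-class rows copied from the "By class" table:
# label -> (precision, recall, f1-score, support)
ROWS = {
    "pers":  (0.8803, 0.8993, 0.8897, 139),
    "scope": (0.8175, 0.8682, 0.8421, 129),
    "work":  (0.6526, 0.7750, 0.7086, 80),
    "loc":   (0.6667, 0.2222, 0.3333, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


def macro(col: int) -> float:
    """Unweighted mean of one column (0=precision, 1=recall, 2=f1)."""
    return sum(row[col] for row in ROWS.values()) / len(ROWS)


def weighted(col: int) -> float:
    """Support-weighted mean of one column."""
    total_support = sum(row[3] for row in ROWS.values())
    return sum(row[col] * row[3] for row in ROWS.values()) / total_support


if __name__ == "__main__":
    # Micro-average precision and recall as reported in the table.
    print("micro F1    :", round(f1(0.7984, 0.8361), 4))
    print("macro F1    :", round(macro(2), 4))
    print("weighted F1 :", round(weighted(2), 4))
    print("support sum :", sum(row[3] for row in ROWS.values()))
```

Because the inputs are already rounded to four decimals, recomputed values can differ from the logged ones in the last digit. The gap between micro F1 and macro F1 comes mostly from the `loc` and `date` classes, which have only 9 and 3 test entities and pull the unweighted mean down.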