2023-10-23 18:05:18,170 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,172 Model: "SequenceTagger( (embeddings): TransformerWordEmbeddings( (model): BertModel( (embeddings): BertEmbeddings( (word_embeddings): Embedding(64001, 768) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) (encoder): BertEncoder( (layer): ModuleList( (0-11): 12 x BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) (intermediate_act_fn): GELUActivation() ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) ) ) (pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) ) ) (locked_dropout): LockedDropout(p=0.5) (linear): Linear(in_features=768, out_features=25, bias=True) (loss_function): CrossEntropyLoss() )" 2023-10-23 18:05:18,172 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,172 MultiCorpus: 1214 train + 266 dev + 251 test sentences - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator 2023-10-23 
18:05:18,172 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,172 Train: 1214 sentences 2023-10-23 18:05:18,172 (train_with_dev=False, train_with_test=False) 2023-10-23 18:05:18,172 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,172 Training Params: 2023-10-23 18:05:18,172 - learning_rate: "5e-05" 2023-10-23 18:05:18,172 - mini_batch_size: "4" 2023-10-23 18:05:18,172 - max_epochs: "10" 2023-10-23 18:05:18,172 - shuffle: "True" 2023-10-23 18:05:18,172 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,172 Plugins: 2023-10-23 18:05:18,173 - TensorboardLogger 2023-10-23 18:05:18,173 - LinearScheduler | warmup_fraction: '0.1' 2023-10-23 18:05:18,173 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,173 Final evaluation on model from best epoch (best-model.pt) 2023-10-23 18:05:18,173 - metric: "('micro avg', 'f1-score')" 2023-10-23 18:05:18,173 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,173 Computation: 2023-10-23 18:05:18,173 - compute on device: cuda:0 2023-10-23 18:05:18,173 - embedding storage: none 2023-10-23 18:05:18,173 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,173 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" 2023-10-23 18:05:18,173 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,173 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:18,173 Logging anything other than scalars to 
TensorBoard is currently not supported. 2023-10-23 18:05:19,738 epoch 1 - iter 30/304 - loss 3.02351216 - time (sec): 1.56 - samples/sec: 2092.12 - lr: 0.000005 - momentum: 0.000000 2023-10-23 18:05:21,366 epoch 1 - iter 60/304 - loss 2.16279826 - time (sec): 3.19 - samples/sec: 1911.82 - lr: 0.000010 - momentum: 0.000000 2023-10-23 18:05:23,007 epoch 1 - iter 90/304 - loss 1.64732925 - time (sec): 4.83 - samples/sec: 1911.64 - lr: 0.000015 - momentum: 0.000000 2023-10-23 18:05:24,648 epoch 1 - iter 120/304 - loss 1.30376680 - time (sec): 6.47 - samples/sec: 1966.46 - lr: 0.000020 - momentum: 0.000000 2023-10-23 18:05:26,288 epoch 1 - iter 150/304 - loss 1.11797657 - time (sec): 8.11 - samples/sec: 1933.49 - lr: 0.000025 - momentum: 0.000000 2023-10-23 18:05:27,926 epoch 1 - iter 180/304 - loss 0.98825947 - time (sec): 9.75 - samples/sec: 1913.77 - lr: 0.000029 - momentum: 0.000000 2023-10-23 18:05:29,564 epoch 1 - iter 210/304 - loss 0.88981727 - time (sec): 11.39 - samples/sec: 1897.65 - lr: 0.000034 - momentum: 0.000000 2023-10-23 18:05:31,197 epoch 1 - iter 240/304 - loss 0.80330881 - time (sec): 13.02 - samples/sec: 1882.05 - lr: 0.000039 - momentum: 0.000000 2023-10-23 18:05:32,835 epoch 1 - iter 270/304 - loss 0.74394110 - time (sec): 14.66 - samples/sec: 1872.07 - lr: 0.000044 - momentum: 0.000000 2023-10-23 18:05:34,471 epoch 1 - iter 300/304 - loss 0.68489685 - time (sec): 16.30 - samples/sec: 1880.48 - lr: 0.000049 - momentum: 0.000000 2023-10-23 18:05:34,685 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:34,685 EPOCH 1 done: loss 0.6805 - lr: 0.000049 2023-10-23 18:05:35,473 DEV : loss 0.15541432797908783 - f1-score (micro avg) 0.726 2023-10-23 18:05:35,481 saving best model 2023-10-23 18:05:35,857 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:37,496 epoch 2 - iter 30/304 - loss 0.19423395 - time (sec): 1.64 - 
samples/sec: 1971.04 - lr: 0.000049 - momentum: 0.000000 2023-10-23 18:05:39,135 epoch 2 - iter 60/304 - loss 0.15037454 - time (sec): 3.28 - samples/sec: 1867.22 - lr: 0.000049 - momentum: 0.000000 2023-10-23 18:05:40,779 epoch 2 - iter 90/304 - loss 0.14114291 - time (sec): 4.92 - samples/sec: 1855.21 - lr: 0.000048 - momentum: 0.000000 2023-10-23 18:05:42,426 epoch 2 - iter 120/304 - loss 0.13326513 - time (sec): 6.57 - samples/sec: 1870.75 - lr: 0.000048 - momentum: 0.000000 2023-10-23 18:05:44,070 epoch 2 - iter 150/304 - loss 0.12683531 - time (sec): 8.21 - samples/sec: 1878.75 - lr: 0.000047 - momentum: 0.000000 2023-10-23 18:05:45,710 epoch 2 - iter 180/304 - loss 0.12499108 - time (sec): 9.85 - samples/sec: 1863.27 - lr: 0.000047 - momentum: 0.000000 2023-10-23 18:05:47,357 epoch 2 - iter 210/304 - loss 0.12849916 - time (sec): 11.50 - samples/sec: 1857.08 - lr: 0.000046 - momentum: 0.000000 2023-10-23 18:05:49,009 epoch 2 - iter 240/304 - loss 0.13158152 - time (sec): 13.15 - samples/sec: 1864.61 - lr: 0.000046 - momentum: 0.000000 2023-10-23 18:05:50,657 epoch 2 - iter 270/304 - loss 0.12718118 - time (sec): 14.80 - samples/sec: 1861.95 - lr: 0.000045 - momentum: 0.000000 2023-10-23 18:05:52,291 epoch 2 - iter 300/304 - loss 0.12696736 - time (sec): 16.43 - samples/sec: 1865.31 - lr: 0.000045 - momentum: 0.000000 2023-10-23 18:05:52,504 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:52,504 EPOCH 2 done: loss 0.1293 - lr: 0.000045 2023-10-23 18:05:53,359 DEV : loss 0.17078670859336853 - f1-score (micro avg) 0.76 2023-10-23 18:05:53,367 saving best model 2023-10-23 18:05:53,888 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:05:55,526 epoch 3 - iter 30/304 - loss 0.08939435 - time (sec): 1.64 - samples/sec: 1863.49 - lr: 0.000044 - momentum: 0.000000 2023-10-23 18:05:57,173 epoch 3 - iter 60/304 - loss 0.07897526 - time 
(sec): 3.28 - samples/sec: 1808.72 - lr: 0.000043 - momentum: 0.000000 2023-10-23 18:05:58,810 epoch 3 - iter 90/304 - loss 0.08706321 - time (sec): 4.92 - samples/sec: 1820.30 - lr: 0.000043 - momentum: 0.000000 2023-10-23 18:06:00,456 epoch 3 - iter 120/304 - loss 0.07668656 - time (sec): 6.57 - samples/sec: 1832.75 - lr: 0.000042 - momentum: 0.000000 2023-10-23 18:06:02,105 epoch 3 - iter 150/304 - loss 0.07354657 - time (sec): 8.21 - samples/sec: 1826.17 - lr: 0.000042 - momentum: 0.000000 2023-10-23 18:06:03,754 epoch 3 - iter 180/304 - loss 0.08189361 - time (sec): 9.86 - samples/sec: 1846.34 - lr: 0.000041 - momentum: 0.000000 2023-10-23 18:06:05,403 epoch 3 - iter 210/304 - loss 0.07732853 - time (sec): 11.51 - samples/sec: 1871.12 - lr: 0.000041 - momentum: 0.000000 2023-10-23 18:06:07,047 epoch 3 - iter 240/304 - loss 0.08312185 - time (sec): 13.16 - samples/sec: 1868.54 - lr: 0.000040 - momentum: 0.000000 2023-10-23 18:06:08,681 epoch 3 - iter 270/304 - loss 0.08169576 - time (sec): 14.79 - samples/sec: 1877.51 - lr: 0.000040 - momentum: 0.000000 2023-10-23 18:06:10,296 epoch 3 - iter 300/304 - loss 0.08673653 - time (sec): 16.41 - samples/sec: 1870.86 - lr: 0.000039 - momentum: 0.000000 2023-10-23 18:06:10,506 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:06:10,507 EPOCH 3 done: loss 0.0861 - lr: 0.000039 2023-10-23 18:06:11,363 DEV : loss 0.17484019696712494 - f1-score (micro avg) 0.8167 2023-10-23 18:06:11,371 saving best model 2023-10-23 18:06:11,897 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:06:13,537 epoch 4 - iter 30/304 - loss 0.04368220 - time (sec): 1.64 - samples/sec: 1658.87 - lr: 0.000038 - momentum: 0.000000 2023-10-23 18:06:15,181 epoch 4 - iter 60/304 - loss 0.05730760 - time (sec): 3.28 - samples/sec: 1825.30 - lr: 0.000038 - momentum: 0.000000 2023-10-23 18:06:16,788 epoch 4 - iter 90/304 - loss 
0.06114841 - time (sec): 4.89 - samples/sec: 1849.70 - lr: 0.000037 - momentum: 0.000000 2023-10-23 18:06:18,420 epoch 4 - iter 120/304 - loss 0.05553906 - time (sec): 6.52 - samples/sec: 1850.44 - lr: 0.000037 - momentum: 0.000000 2023-10-23 18:06:20,054 epoch 4 - iter 150/304 - loss 0.06034393 - time (sec): 8.15 - samples/sec: 1885.51 - lr: 0.000036 - momentum: 0.000000 2023-10-23 18:06:21,679 epoch 4 - iter 180/304 - loss 0.06426993 - time (sec): 9.78 - samples/sec: 1865.59 - lr: 0.000036 - momentum: 0.000000 2023-10-23 18:06:23,330 epoch 4 - iter 210/304 - loss 0.06643169 - time (sec): 11.43 - samples/sec: 1891.08 - lr: 0.000035 - momentum: 0.000000 2023-10-23 18:06:24,979 epoch 4 - iter 240/304 - loss 0.06216304 - time (sec): 13.08 - samples/sec: 1900.70 - lr: 0.000035 - momentum: 0.000000 2023-10-23 18:06:26,611 epoch 4 - iter 270/304 - loss 0.06119980 - time (sec): 14.71 - samples/sec: 1877.13 - lr: 0.000034 - momentum: 0.000000 2023-10-23 18:06:28,247 epoch 4 - iter 300/304 - loss 0.05807122 - time (sec): 16.35 - samples/sec: 1874.32 - lr: 0.000033 - momentum: 0.000000 2023-10-23 18:06:28,461 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:06:28,461 EPOCH 4 done: loss 0.0590 - lr: 0.000033 2023-10-23 18:06:29,313 DEV : loss 0.20702120661735535 - f1-score (micro avg) 0.8028 2023-10-23 18:06:29,320 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:06:30,953 epoch 5 - iter 30/304 - loss 0.07189814 - time (sec): 1.63 - samples/sec: 1711.36 - lr: 0.000033 - momentum: 0.000000 2023-10-23 18:06:32,593 epoch 5 - iter 60/304 - loss 0.04765151 - time (sec): 3.27 - samples/sec: 1787.29 - lr: 0.000032 - momentum: 0.000000 2023-10-23 18:06:34,231 epoch 5 - iter 90/304 - loss 0.03673195 - time (sec): 4.91 - samples/sec: 1826.48 - lr: 0.000032 - momentum: 0.000000 2023-10-23 18:06:35,868 epoch 5 - iter 120/304 - loss 0.04243436 - time (sec): 
6.55 - samples/sec: 1839.20 - lr: 0.000031 - momentum: 0.000000 2023-10-23 18:06:37,507 epoch 5 - iter 150/304 - loss 0.04100820 - time (sec): 8.19 - samples/sec: 1862.05 - lr: 0.000031 - momentum: 0.000000 2023-10-23 18:06:39,146 epoch 5 - iter 180/304 - loss 0.04005715 - time (sec): 9.82 - samples/sec: 1876.76 - lr: 0.000030 - momentum: 0.000000 2023-10-23 18:06:40,785 epoch 5 - iter 210/304 - loss 0.04081348 - time (sec): 11.46 - samples/sec: 1890.67 - lr: 0.000030 - momentum: 0.000000 2023-10-23 18:06:42,432 epoch 5 - iter 240/304 - loss 0.04483258 - time (sec): 13.11 - samples/sec: 1886.03 - lr: 0.000029 - momentum: 0.000000 2023-10-23 18:06:44,078 epoch 5 - iter 270/304 - loss 0.04381410 - time (sec): 14.76 - samples/sec: 1888.61 - lr: 0.000028 - momentum: 0.000000 2023-10-23 18:06:45,716 epoch 5 - iter 300/304 - loss 0.04177830 - time (sec): 16.39 - samples/sec: 1868.39 - lr: 0.000028 - momentum: 0.000000 2023-10-23 18:06:45,928 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:06:45,929 EPOCH 5 done: loss 0.0418 - lr: 0.000028 2023-10-23 18:06:46,772 DEV : loss 0.19316518306732178 - f1-score (micro avg) 0.8464 2023-10-23 18:06:46,779 saving best model 2023-10-23 18:06:47,338 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:06:48,973 epoch 6 - iter 30/304 - loss 0.05467245 - time (sec): 1.63 - samples/sec: 1886.52 - lr: 0.000027 - momentum: 0.000000 2023-10-23 18:06:50,612 epoch 6 - iter 60/304 - loss 0.03021958 - time (sec): 3.27 - samples/sec: 1888.73 - lr: 0.000027 - momentum: 0.000000 2023-10-23 18:06:52,244 epoch 6 - iter 90/304 - loss 0.02852577 - time (sec): 4.90 - samples/sec: 1844.84 - lr: 0.000026 - momentum: 0.000000 2023-10-23 18:06:53,884 epoch 6 - iter 120/304 - loss 0.02827307 - time (sec): 6.54 - samples/sec: 1856.42 - lr: 0.000026 - momentum: 0.000000 2023-10-23 18:06:55,532 epoch 6 - iter 150/304 - loss 
0.03479194 - time (sec): 8.19 - samples/sec: 1873.89 - lr: 0.000025 - momentum: 0.000000 2023-10-23 18:06:57,167 epoch 6 - iter 180/304 - loss 0.03172006 - time (sec): 9.83 - samples/sec: 1845.55 - lr: 0.000025 - momentum: 0.000000 2023-10-23 18:06:58,806 epoch 6 - iter 210/304 - loss 0.03968617 - time (sec): 11.47 - samples/sec: 1847.34 - lr: 0.000024 - momentum: 0.000000 2023-10-23 18:07:00,442 epoch 6 - iter 240/304 - loss 0.04188832 - time (sec): 13.10 - samples/sec: 1848.07 - lr: 0.000023 - momentum: 0.000000 2023-10-23 18:07:02,083 epoch 6 - iter 270/304 - loss 0.04332220 - time (sec): 14.74 - samples/sec: 1846.32 - lr: 0.000023 - momentum: 0.000000 2023-10-23 18:07:03,721 epoch 6 - iter 300/304 - loss 0.05048982 - time (sec): 16.38 - samples/sec: 1873.09 - lr: 0.000022 - momentum: 0.000000 2023-10-23 18:07:03,936 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:03,936 EPOCH 6 done: loss 0.0501 - lr: 0.000022 2023-10-23 18:07:04,785 DEV : loss 0.22458811104297638 - f1-score (micro avg) 0.6659 2023-10-23 18:07:04,792 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:06,425 epoch 7 - iter 30/304 - loss 0.04567402 - time (sec): 1.63 - samples/sec: 1846.80 - lr: 0.000022 - momentum: 0.000000 2023-10-23 18:07:08,068 epoch 7 - iter 60/304 - loss 0.04013202 - time (sec): 3.27 - samples/sec: 1887.08 - lr: 0.000021 - momentum: 0.000000 2023-10-23 18:07:09,712 epoch 7 - iter 90/304 - loss 0.04103430 - time (sec): 4.92 - samples/sec: 1867.22 - lr: 0.000021 - momentum: 0.000000 2023-10-23 18:07:11,354 epoch 7 - iter 120/304 - loss 0.04689034 - time (sec): 6.56 - samples/sec: 1879.38 - lr: 0.000020 - momentum: 0.000000 2023-10-23 18:07:13,007 epoch 7 - iter 150/304 - loss 0.04243080 - time (sec): 8.21 - samples/sec: 1887.35 - lr: 0.000020 - momentum: 0.000000 2023-10-23 18:07:14,626 epoch 7 - iter 180/304 - loss 0.03739778 - time (sec): 
9.83 - samples/sec: 1884.23 - lr: 0.000019 - momentum: 0.000000 2023-10-23 18:07:16,244 epoch 7 - iter 210/304 - loss 0.03572045 - time (sec): 11.45 - samples/sec: 1889.73 - lr: 0.000018 - momentum: 0.000000 2023-10-23 18:07:17,860 epoch 7 - iter 240/304 - loss 0.03268610 - time (sec): 13.07 - samples/sec: 1891.94 - lr: 0.000018 - momentum: 0.000000 2023-10-23 18:07:19,499 epoch 7 - iter 270/304 - loss 0.03137129 - time (sec): 14.71 - samples/sec: 1870.60 - lr: 0.000017 - momentum: 0.000000 2023-10-23 18:07:21,141 epoch 7 - iter 300/304 - loss 0.03089265 - time (sec): 16.35 - samples/sec: 1873.73 - lr: 0.000017 - momentum: 0.000000 2023-10-23 18:07:21,356 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:21,356 EPOCH 7 done: loss 0.0305 - lr: 0.000017 2023-10-23 18:07:22,378 DEV : loss 0.19062967598438263 - f1-score (micro avg) 0.8435 2023-10-23 18:07:22,386 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:24,031 epoch 8 - iter 30/304 - loss 0.03284706 - time (sec): 1.64 - samples/sec: 1796.18 - lr: 0.000016 - momentum: 0.000000 2023-10-23 18:07:25,666 epoch 8 - iter 60/304 - loss 0.02779091 - time (sec): 3.28 - samples/sec: 1744.32 - lr: 0.000016 - momentum: 0.000000 2023-10-23 18:07:27,302 epoch 8 - iter 90/304 - loss 0.01855685 - time (sec): 4.91 - samples/sec: 1746.35 - lr: 0.000015 - momentum: 0.000000 2023-10-23 18:07:28,948 epoch 8 - iter 120/304 - loss 0.01484052 - time (sec): 6.56 - samples/sec: 1789.62 - lr: 0.000015 - momentum: 0.000000 2023-10-23 18:07:30,589 epoch 8 - iter 150/304 - loss 0.01229122 - time (sec): 8.20 - samples/sec: 1841.21 - lr: 0.000014 - momentum: 0.000000 2023-10-23 18:07:32,228 epoch 8 - iter 180/304 - loss 0.01257633 - time (sec): 9.84 - samples/sec: 1836.27 - lr: 0.000013 - momentum: 0.000000 2023-10-23 18:07:33,865 epoch 8 - iter 210/304 - loss 0.01408353 - time (sec): 11.48 - samples/sec: 
1834.05 - lr: 0.000013 - momentum: 0.000000 2023-10-23 18:07:35,504 epoch 8 - iter 240/304 - loss 0.01384160 - time (sec): 13.12 - samples/sec: 1836.24 - lr: 0.000012 - momentum: 0.000000 2023-10-23 18:07:37,149 epoch 8 - iter 270/304 - loss 0.01387053 - time (sec): 14.76 - samples/sec: 1853.15 - lr: 0.000012 - momentum: 0.000000 2023-10-23 18:07:38,791 epoch 8 - iter 300/304 - loss 0.01359108 - time (sec): 16.40 - samples/sec: 1866.58 - lr: 0.000011 - momentum: 0.000000 2023-10-23 18:07:39,007 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:39,007 EPOCH 8 done: loss 0.0134 - lr: 0.000011 2023-10-23 18:07:39,856 DEV : loss 0.2041168510913849 - f1-score (micro avg) 0.852 2023-10-23 18:07:39,863 saving best model 2023-10-23 18:07:40,387 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:42,032 epoch 9 - iter 30/304 - loss 0.00600479 - time (sec): 1.64 - samples/sec: 1838.36 - lr: 0.000011 - momentum: 0.000000 2023-10-23 18:07:43,670 epoch 9 - iter 60/304 - loss 0.00346556 - time (sec): 3.28 - samples/sec: 1817.14 - lr: 0.000010 - momentum: 0.000000 2023-10-23 18:07:45,307 epoch 9 - iter 90/304 - loss 0.00522093 - time (sec): 4.92 - samples/sec: 1831.14 - lr: 0.000010 - momentum: 0.000000 2023-10-23 18:07:46,946 epoch 9 - iter 120/304 - loss 0.00635893 - time (sec): 6.56 - samples/sec: 1857.20 - lr: 0.000009 - momentum: 0.000000 2023-10-23 18:07:48,561 epoch 9 - iter 150/304 - loss 0.00829404 - time (sec): 8.17 - samples/sec: 1843.97 - lr: 0.000008 - momentum: 0.000000 2023-10-23 18:07:50,202 epoch 9 - iter 180/304 - loss 0.01123978 - time (sec): 9.81 - samples/sec: 1863.28 - lr: 0.000008 - momentum: 0.000000 2023-10-23 18:07:51,829 epoch 9 - iter 210/304 - loss 0.00958663 - time (sec): 11.44 - samples/sec: 1875.46 - lr: 0.000007 - momentum: 0.000000 2023-10-23 18:07:53,451 epoch 9 - iter 240/304 - loss 0.00999350 - time (sec): 13.06 
- samples/sec: 1858.64 - lr: 0.000007 - momentum: 0.000000 2023-10-23 18:07:55,085 epoch 9 - iter 270/304 - loss 0.00938350 - time (sec): 14.70 - samples/sec: 1862.46 - lr: 0.000006 - momentum: 0.000000 2023-10-23 18:07:56,735 epoch 9 - iter 300/304 - loss 0.00917477 - time (sec): 16.35 - samples/sec: 1877.87 - lr: 0.000006 - momentum: 0.000000 2023-10-23 18:07:56,950 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:56,950 EPOCH 9 done: loss 0.0091 - lr: 0.000006 2023-10-23 18:07:57,812 DEV : loss 0.2051309198141098 - f1-score (micro avg) 0.8456 2023-10-23 18:07:57,820 ---------------------------------------------------------------------------------------------------- 2023-10-23 18:07:59,462 epoch 10 - iter 30/304 - loss 0.00983964 - time (sec): 1.64 - samples/sec: 1911.88 - lr: 0.000005 - momentum: 0.000000 2023-10-23 18:08:01,094 epoch 10 - iter 60/304 - loss 0.01371081 - time (sec): 3.27 - samples/sec: 1855.77 - lr: 0.000005 - momentum: 0.000000 2023-10-23 18:08:02,731 epoch 10 - iter 90/304 - loss 0.00942075 - time (sec): 4.91 - samples/sec: 1815.96 - lr: 0.000004 - momentum: 0.000000 2023-10-23 18:08:04,367 epoch 10 - iter 120/304 - loss 0.00928705 - time (sec): 6.55 - samples/sec: 1844.97 - lr: 0.000003 - momentum: 0.000000 2023-10-23 18:08:06,004 epoch 10 - iter 150/304 - loss 0.00814334 - time (sec): 8.18 - samples/sec: 1827.76 - lr: 0.000003 - momentum: 0.000000 2023-10-23 18:08:07,642 epoch 10 - iter 180/304 - loss 0.00692152 - time (sec): 9.82 - samples/sec: 1856.28 - lr: 0.000002 - momentum: 0.000000 2023-10-23 18:08:09,282 epoch 10 - iter 210/304 - loss 0.00802509 - time (sec): 11.46 - samples/sec: 1878.13 - lr: 0.000002 - momentum: 0.000000 2023-10-23 18:08:10,927 epoch 10 - iter 240/304 - loss 0.00720115 - time (sec): 13.11 - samples/sec: 1888.06 - lr: 0.000001 - momentum: 0.000000 2023-10-23 18:08:12,565 epoch 10 - iter 270/304 - loss 0.00642390 - time (sec): 14.74 - samples/sec: 
1885.17 - lr: 0.000001 - momentum: 0.000000
2023-10-23 18:08:14,194 epoch 10 - iter 300/304 - loss 0.00718668 - time (sec): 16.37 - samples/sec: 1875.88 - lr: 0.000000 - momentum: 0.000000
2023-10-23 18:08:14,406 ----------------------------------------------------------------------------------------------------
2023-10-23 18:08:14,406 EPOCH 10 done: loss 0.0071 - lr: 0.000000
2023-10-23 18:08:15,325 DEV : loss 0.20948979258537292 - f1-score (micro avg) 0.8476
2023-10-23 18:08:15,746 ----------------------------------------------------------------------------------------------------
2023-10-23 18:08:15,747 Loading model from best epoch ...
2023-10-23 18:08:17,540 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-23 18:08:18,429 
Results:
- F-score (micro) 0.8016
- F-score (macro) 0.6217
- Accuracy 0.6781

By class:
              precision    recall  f1-score   support

       scope     0.7546    0.8146    0.7834       151
        work     0.7368    0.8842    0.8038        95
        pers     0.8000    0.9167    0.8544        96
         loc     0.6667    0.6667    0.6667         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7557    0.8534    0.8016       348
   macro avg     0.5916    0.6564    0.6217       348
weighted avg     0.7550    0.8534    0.8008       348

2023-10-23 18:08:18,429 ----------------------------------------------------------------------------------------------------