2023-10-17 08:39:39,345 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,346 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:39:39,346 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,346 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:39:39,346 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,346 Train: 1100 sentences
2023-10-17 08:39:39,346 (train_with_dev=False, train_with_test=False)
2023-10-17 08:39:39,347 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,347 Training Params:
2023-10-17 08:39:39,347 - learning_rate: "5e-05"
2023-10-17 08:39:39,347 - mini_batch_size: "4"
2023-10-17 08:39:39,347 - max_epochs: "10"
2023-10-17 08:39:39,347 - shuffle: "True"
2023-10-17 08:39:39,347 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,347 Plugins:
2023-10-17 08:39:39,347 - TensorboardLogger
2023-10-17 08:39:39,347 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:39:39,347 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,347 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:39:39,347 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:39:39,347 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,347 Computation:
2023-10-17 08:39:39,347 - compute on device: cuda:0
2023-10-17 08:39:39,347 - embedding storage: none
2023-10-17 08:39:39,347 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,347 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 08:39:39,347 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,347 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:39,347 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:39:40,651 epoch 1 - iter 27/275 - loss 4.03878656 - time (sec): 1.30 - samples/sec: 1602.15 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:39:41,984 epoch 1 - iter 54/275 - loss 3.09313257 - time (sec): 2.64 - samples/sec: 1652.71 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:39:43,312 epoch 1 - iter 81/275 - loss 2.32201658 - time (sec): 3.96 - samples/sec: 1708.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:39:44,621 epoch 1 - iter 108/275 - loss 1.96172135 - time (sec): 5.27 - samples/sec: 1660.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:39:45,831 epoch 1 - iter 135/275 - loss 1.65705331 - time (sec): 6.48 - samples/sec: 1707.48 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:39:47,070 epoch 1 - iter 162/275 - loss 1.43741272 - time (sec): 7.72 - samples/sec: 1732.13 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:39:48,277 epoch 1 - iter 189/275 - loss 1.28970992 - time (sec): 8.93 - samples/sec: 1737.39 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:39:49,497 epoch 1 - iter 216/275 - loss 1.16239932 - time (sec): 10.15 - samples/sec: 1750.43 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:39:50,739 epoch 1 - iter 243/275 - loss 1.05732942 - time (sec): 11.39 - samples/sec: 1767.15 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:39:51,981 epoch 1 - iter 270/275 - loss 0.97265547 - time (sec): 12.63 - samples/sec: 1772.94 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:39:52,200 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:52,200 EPOCH 1 done: loss 0.9602 - lr: 0.000049
2023-10-17 08:39:52,734 DEV : loss 0.18448342382907867 - f1-score (micro avg) 0.7457
2023-10-17 08:39:52,738 saving best model
2023-10-17 08:39:53,085 ----------------------------------------------------------------------------------------------------
2023-10-17 08:39:54,298 epoch 2 - iter 27/275 - loss 0.27765461 - time (sec): 1.21 - samples/sec: 1847.98 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:39:55,529 epoch 2 - iter 54/275 - loss 0.20491030 - time (sec): 2.44 - samples/sec: 1847.56 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:39:56,740 epoch 2 - iter 81/275 - loss 0.18857691 - time (sec): 3.65 - samples/sec: 1881.68 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:39:57,971 epoch 2 - iter 108/275 - loss 0.18384314 - time (sec): 4.88 - samples/sec: 1863.64 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:39:59,237 epoch 2 - iter 135/275 - loss 0.19785182 - time (sec): 6.15 - samples/sec: 1883.70 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:40:00,460 epoch 2 - iter 162/275 - loss 0.18989803 - time (sec): 7.37 - samples/sec: 1867.37 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:40:01,679 epoch 2 - iter 189/275 - loss 0.17923154 - time (sec): 8.59 - samples/sec: 1851.17 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:40:02,886 epoch 2 - iter 216/275 - loss 0.16990514 - time (sec): 9.80 - samples/sec: 1844.09 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:40:04,108 epoch 2 - iter 243/275 - loss 0.16846483 - time (sec): 11.02 - samples/sec: 1828.04 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:40:05,323 epoch 2 - iter 270/275 - loss 0.16677599 - time (sec): 12.24 - samples/sec: 1833.14 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:40:05,547 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:05,547 EPOCH 2 done: loss 0.1650 - lr: 0.000045
2023-10-17 08:40:06,199 DEV : loss 0.18307897448539734 - f1-score (micro avg) 0.7929
2023-10-17 08:40:06,204 saving best model
2023-10-17 08:40:06,658 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:07,957 epoch 3 - iter 27/275 - loss 0.10069768 - time (sec): 1.30 - samples/sec: 1916.69 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:40:09,194 epoch 3 - iter 54/275 - loss 0.10050973 - time (sec): 2.53 - samples/sec: 1883.93 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:40:10,426 epoch 3 - iter 81/275 - loss 0.09155482 - time (sec): 3.77 - samples/sec: 1866.60 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:40:11,659 epoch 3 - iter 108/275 - loss 0.08904290 - time (sec): 5.00 - samples/sec: 1897.14 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:40:12,873 epoch 3 - iter 135/275 - loss 0.08844808 - time (sec): 6.21 - samples/sec: 1846.47 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:40:14,099 epoch 3 - iter 162/275 - loss 0.09007275 - time (sec): 7.44 - samples/sec: 1837.24 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:40:15,314 epoch 3 - iter 189/275 - loss 0.09894026 - time (sec): 8.65 - samples/sec: 1838.92 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:40:16,541 epoch 3 - iter 216/275 - loss 0.10190647 - time (sec): 9.88 - samples/sec: 1840.60 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:40:17,747 epoch 3 - iter 243/275 - loss 0.10769110 - time (sec): 11.09 - samples/sec: 1835.25 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:40:18,964 epoch 3 - iter 270/275 - loss 0.10706840 - time (sec): 12.30 - samples/sec: 1823.91 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:40:19,187 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:19,188 EPOCH 3 done: loss 0.1065 - lr: 0.000039
2023-10-17 08:40:19,998 DEV : loss 0.18310178816318512 - f1-score (micro avg) 0.8514
2023-10-17 08:40:20,002 saving best model
2023-10-17 08:40:20,437 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:21,645 epoch 4 - iter 27/275 - loss 0.05756256 - time (sec): 1.21 - samples/sec: 1794.76 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:40:22,858 epoch 4 - iter 54/275 - loss 0.05243097 - time (sec): 2.42 - samples/sec: 1833.72 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:40:24,061 epoch 4 - iter 81/275 - loss 0.05875408 - time (sec): 3.62 - samples/sec: 1831.26 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:40:25,273 epoch 4 - iter 108/275 - loss 0.05543076 - time (sec): 4.83 - samples/sec: 1783.41 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:40:26,480 epoch 4 - iter 135/275 - loss 0.06963521 - time (sec): 6.04 - samples/sec: 1806.84 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:40:27,712 epoch 4 - iter 162/275 - loss 0.07078516 - time (sec): 7.27 - samples/sec: 1816.65 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:40:28,939 epoch 4 - iter 189/275 - loss 0.07629535 - time (sec): 8.50 - samples/sec: 1793.65 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:40:30,163 epoch 4 - iter 216/275 - loss 0.08775086 - time (sec): 9.72 - samples/sec: 1822.76 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:40:31,360 epoch 4 - iter 243/275 - loss 0.08837665 - time (sec): 10.92 - samples/sec: 1825.54 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:40:32,593 epoch 4 - iter 270/275 - loss 0.08667772 - time (sec): 12.15 - samples/sec: 1836.66 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:40:32,816 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:32,816 EPOCH 4 done: loss 0.0867 - lr: 0.000034
2023-10-17 08:40:33,475 DEV : loss 0.21343372762203217 - f1-score (micro avg) 0.8759
2023-10-17 08:40:33,479 saving best model
2023-10-17 08:40:33,906 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:35,101 epoch 5 - iter 27/275 - loss 0.09904212 - time (sec): 1.19 - samples/sec: 1928.97 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:40:36,312 epoch 5 - iter 54/275 - loss 0.08332440 - time (sec): 2.40 - samples/sec: 1929.05 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:40:37,498 epoch 5 - iter 81/275 - loss 0.07371610 - time (sec): 3.59 - samples/sec: 1906.32 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:40:38,715 epoch 5 - iter 108/275 - loss 0.08587081 - time (sec): 4.81 - samples/sec: 1876.77 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:40:39,919 epoch 5 - iter 135/275 - loss 0.08650298 - time (sec): 6.01 - samples/sec: 1876.23 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:40:41,117 epoch 5 - iter 162/275 - loss 0.08560863 - time (sec): 7.21 - samples/sec: 1867.28 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:40:42,352 epoch 5 - iter 189/275 - loss 0.08186266 - time (sec): 8.44 - samples/sec: 1869.53 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:40:43,582 epoch 5 - iter 216/275 - loss 0.07453600 - time (sec): 9.67 - samples/sec: 1870.31 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:40:44,819 epoch 5 - iter 243/275 - loss 0.07028006 - time (sec): 10.91 - samples/sec: 1859.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:40:46,043 epoch 5 - iter 270/275 - loss 0.06854325 - time (sec): 12.14 - samples/sec: 1843.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:40:46,271 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:46,271 EPOCH 5 done: loss 0.0673 - lr: 0.000028
2023-10-17 08:40:46,938 DEV : loss 0.20947176218032837 - f1-score (micro avg) 0.8504
2023-10-17 08:40:46,943 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:48,187 epoch 6 - iter 27/275 - loss 0.05511466 - time (sec): 1.24 - samples/sec: 1842.84 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:40:49,440 epoch 6 - iter 54/275 - loss 0.05118589 - time (sec): 2.50 - samples/sec: 1787.99 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:40:50,650 epoch 6 - iter 81/275 - loss 0.05692708 - time (sec): 3.71 - samples/sec: 1763.66 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:40:51,852 epoch 6 - iter 108/275 - loss 0.06210658 - time (sec): 4.91 - samples/sec: 1755.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:40:53,063 epoch 6 - iter 135/275 - loss 0.05927054 - time (sec): 6.12 - samples/sec: 1813.61 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:40:54,220 epoch 6 - iter 162/275 - loss 0.05347616 - time (sec): 7.28 - samples/sec: 1801.60 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:40:55,399 epoch 6 - iter 189/275 - loss 0.05304045 - time (sec): 8.46 - samples/sec: 1844.48 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:40:56,562 epoch 6 - iter 216/275 - loss 0.05186256 - time (sec): 9.62 - samples/sec: 1849.76 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:40:57,712 epoch 6 - iter 243/275 - loss 0.04991913 - time (sec): 10.77 - samples/sec: 1848.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:40:58,869 epoch 6 - iter 270/275 - loss 0.04731578 - time (sec): 11.93 - samples/sec: 1866.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:40:59,090 ----------------------------------------------------------------------------------------------------
2023-10-17 08:40:59,090 EPOCH 6 done: loss 0.0469 - lr: 0.000022
2023-10-17 08:40:59,729 DEV : loss 0.17257489264011383 - f1-score (micro avg) 0.8764
2023-10-17 08:40:59,734 saving best model
2023-10-17 08:41:00,182 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:01,421 epoch 7 - iter 27/275 - loss 0.01858912 - time (sec): 1.24 - samples/sec: 1872.12 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:41:02,633 epoch 7 - iter 54/275 - loss 0.01466161 - time (sec): 2.45 - samples/sec: 1827.37 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:41:03,852 epoch 7 - iter 81/275 - loss 0.01630909 - time (sec): 3.67 - samples/sec: 1800.60 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:41:05,148 epoch 7 - iter 108/275 - loss 0.02794978 - time (sec): 4.96 - samples/sec: 1740.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:41:06,405 epoch 7 - iter 135/275 - loss 0.02556503 - time (sec): 6.22 - samples/sec: 1794.75 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:41:07,676 epoch 7 - iter 162/275 - loss 0.03038199 - time (sec): 7.49 - samples/sec: 1806.56 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:41:08,890 epoch 7 - iter 189/275 - loss 0.03380410 - time (sec): 8.71 - samples/sec: 1799.57 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:41:10,121 epoch 7 - iter 216/275 - loss 0.03086095 - time (sec): 9.94 - samples/sec: 1810.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:41:11,366 epoch 7 - iter 243/275 - loss 0.03220375 - time (sec): 11.18 - samples/sec: 1806.21 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:41:12,567 epoch 7 - iter 270/275 - loss 0.03117750 - time (sec): 12.38 - samples/sec: 1815.30 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:41:12,788 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:12,788 EPOCH 7 done: loss 0.0328 - lr: 0.000017
2023-10-17 08:41:13,421 DEV : loss 0.16961951553821564 - f1-score (micro avg) 0.8696
2023-10-17 08:41:13,426 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:14,692 epoch 8 - iter 27/275 - loss 0.03404871 - time (sec): 1.26 - samples/sec: 1755.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:41:15,915 epoch 8 - iter 54/275 - loss 0.04007446 - time (sec): 2.49 - samples/sec: 1762.18 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:41:17,187 epoch 8 - iter 81/275 - loss 0.03058524 - time (sec): 3.76 - samples/sec: 1822.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:41:18,407 epoch 8 - iter 108/275 - loss 0.03511864 - time (sec): 4.98 - samples/sec: 1835.28 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:41:19,619 epoch 8 - iter 135/275 - loss 0.03051247 - time (sec): 6.19 - samples/sec: 1832.90 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:41:20,817 epoch 8 - iter 162/275 - loss 0.02693145 - time (sec): 7.39 - samples/sec: 1798.00 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:41:22,089 epoch 8 - iter 189/275 - loss 0.02624564 - time (sec): 8.66 - samples/sec: 1798.56 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:41:23,318 epoch 8 - iter 216/275 - loss 0.02675046 - time (sec): 9.89 - samples/sec: 1803.44 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:41:24,550 epoch 8 - iter 243/275 - loss 0.02456897 - time (sec): 11.12 - samples/sec: 1803.77 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:41:25,712 epoch 8 - iter 270/275 - loss 0.02792934 - time (sec): 12.29 - samples/sec: 1822.32 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:41:25,926 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:25,926 EPOCH 8 done: loss 0.0275 - lr: 0.000011
2023-10-17 08:41:26,560 DEV : loss 0.18767307698726654 - f1-score (micro avg) 0.8819
2023-10-17 08:41:26,565 saving best model
2023-10-17 08:41:26,999 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:28,308 epoch 9 - iter 27/275 - loss 0.01847685 - time (sec): 1.31 - samples/sec: 1706.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:41:29,553 epoch 9 - iter 54/275 - loss 0.02177958 - time (sec): 2.55 - samples/sec: 1832.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:41:30,790 epoch 9 - iter 81/275 - loss 0.01693967 - time (sec): 3.79 - samples/sec: 1789.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:41:32,050 epoch 9 - iter 108/275 - loss 0.01773225 - time (sec): 5.05 - samples/sec: 1776.39 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:41:33,286 epoch 9 - iter 135/275 - loss 0.01737827 - time (sec): 6.28 - samples/sec: 1794.29 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:41:34,507 epoch 9 - iter 162/275 - loss 0.01537384 - time (sec): 7.50 - samples/sec: 1796.61 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:41:35,743 epoch 9 - iter 189/275 - loss 0.01914939 - time (sec): 8.74 - samples/sec: 1794.42 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:41:36,975 epoch 9 - iter 216/275 - loss 0.01928300 - time (sec): 9.97 - samples/sec: 1800.75 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:41:38,189 epoch 9 - iter 243/275 - loss 0.01786580 - time (sec): 11.19 - samples/sec: 1788.11 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:41:39,426 epoch 9 - iter 270/275 - loss 0.01731168 - time (sec): 12.42 - samples/sec: 1796.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:41:39,653 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:39,653 EPOCH 9 done: loss 0.0170 - lr: 0.000006
2023-10-17 08:41:40,301 DEV : loss 0.1904018521308899 - f1-score (micro avg) 0.8759
2023-10-17 08:41:40,306 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:41,562 epoch 10 - iter 27/275 - loss 0.03813907 - time (sec): 1.25 - samples/sec: 2061.13 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:41:42,792 epoch 10 - iter 54/275 - loss 0.02883733 - time (sec): 2.48 - samples/sec: 2002.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:41:44,014 epoch 10 - iter 81/275 - loss 0.02220392 - time (sec): 3.71 - samples/sec: 1992.70 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:41:45,249 epoch 10 - iter 108/275 - loss 0.02022173 - time (sec): 4.94 - samples/sec: 1911.67 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:41:46,451 epoch 10 - iter 135/275 - loss 0.01751670 - time (sec): 6.14 - samples/sec: 1884.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:41:47,677 epoch 10 - iter 162/275 - loss 0.01868292 - time (sec): 7.37 - samples/sec: 1861.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:41:48,901 epoch 10 - iter 189/275 - loss 0.01721044 - time (sec): 8.59 - samples/sec: 1841.39 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:41:50,124 epoch 10 - iter 216/275 - loss 0.01701337 - time (sec): 9.82 - samples/sec: 1826.79 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:41:51,282 epoch 10 - iter 243/275 - loss 0.01725403 - time (sec): 10.97 - samples/sec: 1844.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:41:52,480 epoch 10 - iter 270/275 - loss 0.01637470 - time (sec): 12.17 - samples/sec: 1837.19 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:41:52,705 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:52,705 EPOCH 10 done: loss 0.0164 - lr: 0.000000
2023-10-17 08:41:53,342 DEV : loss 0.1948593109846115 - f1-score (micro avg) 0.8703
2023-10-17 08:41:53,711 ----------------------------------------------------------------------------------------------------
2023-10-17 08:41:53,712 Loading model from best epoch ...
2023-10-17 08:41:55,054 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:41:55,885 Results:
- F-score (micro) 0.8924
- F-score (macro) 0.8645
- Accuracy 0.8175

By class:
              precision    recall  f1-score   support

       scope     0.8895    0.8693    0.8793       176
        pers     0.9754    0.9297    0.9520       128
        work     0.8243    0.8243    0.8243        74
      object     1.0000    1.0000    1.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9057    0.8796    0.8924       382
   macro avg     0.9379    0.8247    0.8645       382
weighted avg     0.9068    0.8796    0.8925       382

2023-10-17 08:41:55,885 ----------------------------------------------------------------------------------------------------