2023-10-17 08:55:30,338 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,339 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:55:30,339 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,339 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:55:30,339
----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,339 Train: 1100 sentences
2023-10-17 08:55:30,339 (train_with_dev=False, train_with_test=False)
2023-10-17 08:55:30,339 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,339 Training Params:
2023-10-17 08:55:30,339 - learning_rate: "3e-05"
2023-10-17 08:55:30,339 - mini_batch_size: "4"
2023-10-17 08:55:30,340 - max_epochs: "10"
2023-10-17 08:55:30,340 - shuffle: "True"
2023-10-17 08:55:30,340 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,340 Plugins:
2023-10-17 08:55:30,340 - TensorboardLogger
2023-10-17 08:55:30,340 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:55:30,340 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,340 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:55:30,340 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:55:30,340 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,340 Computation:
2023-10-17 08:55:30,340 - compute on device: cuda:0
2023-10-17 08:55:30,340 - embedding storage: none
2023-10-17 08:55:30,340 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,340 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 08:55:30,340 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,340 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:30,340 Logging anything other than scalars to TensorBoard is
currently not supported.
2023-10-17 08:55:31,518 epoch 1 - iter 27/275 - loss 4.20966829 - time (sec): 1.18 - samples/sec: 2005.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:55:32,725 epoch 1 - iter 54/275 - loss 3.75589086 - time (sec): 2.38 - samples/sec: 1968.25 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:55:33,959 epoch 1 - iter 81/275 - loss 2.96292735 - time (sec): 3.62 - samples/sec: 1902.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:55:35,225 epoch 1 - iter 108/275 - loss 2.48582220 - time (sec): 4.88 - samples/sec: 1844.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:55:36,460 epoch 1 - iter 135/275 - loss 2.09988698 - time (sec): 6.12 - samples/sec: 1858.90 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:55:37,680 epoch 1 - iter 162/275 - loss 1.82713499 - time (sec): 7.34 - samples/sec: 1855.76 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:55:38,920 epoch 1 - iter 189/275 - loss 1.62242540 - time (sec): 8.58 - samples/sec: 1872.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:55:40,167 epoch 1 - iter 216/275 - loss 1.47095023 - time (sec): 9.83 - samples/sec: 1873.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:55:41,392 epoch 1 - iter 243/275 - loss 1.36038447 - time (sec): 11.05 - samples/sec: 1840.09 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:55:42,653 epoch 1 - iter 270/275 - loss 1.26654641 - time (sec): 12.31 - samples/sec: 1814.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:55:42,872 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:42,873 EPOCH 1 done: loss 1.2457 - lr: 0.000029
2023-10-17 08:55:43,558 DEV : loss 0.22156471014022827 - f1-score (micro avg) 0.6929
2023-10-17 08:55:43,563 saving best model
2023-10-17 08:55:43,910 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:45,062 epoch 2 - iter 27/275 - loss 0.21548090 - time (sec): 1.15 - samples/sec:
1936.90 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:55:46,221 epoch 2 - iter 54/275 - loss 0.24741000 - time (sec): 2.31 - samples/sec: 1823.11 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:55:47,381 epoch 2 - iter 81/275 - loss 0.21806936 - time (sec): 3.47 - samples/sec: 1874.12 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:55:48,593 epoch 2 - iter 108/275 - loss 0.22056142 - time (sec): 4.68 - samples/sec: 1854.73 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:55:49,836 epoch 2 - iter 135/275 - loss 0.21012019 - time (sec): 5.92 - samples/sec: 1820.46 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:55:51,118 epoch 2 - iter 162/275 - loss 0.20415720 - time (sec): 7.21 - samples/sec: 1801.91 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:55:52,371 epoch 2 - iter 189/275 - loss 0.19846404 - time (sec): 8.46 - samples/sec: 1800.33 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:55:53,663 epoch 2 - iter 216/275 - loss 0.19250304 - time (sec): 9.75 - samples/sec: 1803.22 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:55:54,911 epoch 2 - iter 243/275 - loss 0.18789230 - time (sec): 11.00 - samples/sec: 1814.77 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:55:56,141 epoch 2 - iter 270/275 - loss 0.18821809 - time (sec): 12.23 - samples/sec: 1825.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:55:56,365 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:56,365 EPOCH 2 done: loss 0.1866 - lr: 0.000027
2023-10-17 08:55:57,007 DEV : loss 0.17130924761295319 - f1-score (micro avg) 0.8119
2023-10-17 08:55:57,011 saving best model
2023-10-17 08:55:57,451 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:58,720 epoch 3 - iter 27/275 - loss 0.13594230 - time (sec): 1.27 - samples/sec: 1920.69 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:55:59,968 epoch 3 - iter 54/275 - loss 0.11616941 - time (sec): 2.52 -
samples/sec: 1809.88 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:56:01,242 epoch 3 - iter 81/275 - loss 0.10852886 - time (sec): 3.79 - samples/sec: 1839.76 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:56:02,495 epoch 3 - iter 108/275 - loss 0.10609183 - time (sec): 5.04 - samples/sec: 1801.75 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:56:03,728 epoch 3 - iter 135/275 - loss 0.09439168 - time (sec): 6.28 - samples/sec: 1819.37 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:56:04,950 epoch 3 - iter 162/275 - loss 0.09864309 - time (sec): 7.50 - samples/sec: 1794.72 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:56:06,207 epoch 3 - iter 189/275 - loss 0.10337472 - time (sec): 8.75 - samples/sec: 1784.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:56:07,465 epoch 3 - iter 216/275 - loss 0.10537503 - time (sec): 10.01 - samples/sec: 1793.50 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:56:08,703 epoch 3 - iter 243/275 - loss 0.10358980 - time (sec): 11.25 - samples/sec: 1806.04 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:56:09,933 epoch 3 - iter 270/275 - loss 0.10680948 - time (sec): 12.48 - samples/sec: 1794.31 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:56:10,170 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:10,171 EPOCH 3 done: loss 0.1057 - lr: 0.000023
2023-10-17 08:56:10,805 DEV : loss 0.16759812831878662 - f1-score (micro avg) 0.8473
2023-10-17 08:56:10,809 saving best model
2023-10-17 08:56:11,254 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:12,511 epoch 4 - iter 27/275 - loss 0.06795575 - time (sec): 1.26 - samples/sec: 1548.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:56:13,740 epoch 4 - iter 54/275 - loss 0.06004520 - time (sec): 2.49 - samples/sec: 1638.19 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:56:14,977 epoch 4 - iter 81/275 - loss 0.07936595 - time
(sec): 3.72 - samples/sec: 1753.38 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:56:16,206 epoch 4 - iter 108/275 - loss 0.07707577 - time (sec): 4.95 - samples/sec: 1786.60 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:56:17,441 epoch 4 - iter 135/275 - loss 0.07154381 - time (sec): 6.19 - samples/sec: 1824.33 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:56:18,688 epoch 4 - iter 162/275 - loss 0.07892132 - time (sec): 7.43 - samples/sec: 1810.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:56:19,898 epoch 4 - iter 189/275 - loss 0.08645319 - time (sec): 8.64 - samples/sec: 1817.76 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:56:21,178 epoch 4 - iter 216/275 - loss 0.08316034 - time (sec): 9.92 - samples/sec: 1832.55 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:56:22,404 epoch 4 - iter 243/275 - loss 0.08256366 - time (sec): 11.15 - samples/sec: 1840.96 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:56:23,627 epoch 4 - iter 270/275 - loss 0.07860632 - time (sec): 12.37 - samples/sec: 1812.42 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:56:23,850 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:23,851 EPOCH 4 done: loss 0.0797 - lr: 0.000020
2023-10-17 08:56:24,496 DEV : loss 0.17673690617084503 - f1-score (micro avg) 0.8596
2023-10-17 08:56:24,501 saving best model
2023-10-17 08:56:24,926 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:26,109 epoch 5 - iter 27/275 - loss 0.04629782 - time (sec): 1.18 - samples/sec: 1905.41 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:56:27,278 epoch 5 - iter 54/275 - loss 0.08299096 - time (sec): 2.35 - samples/sec: 1949.30 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:56:28,452 epoch 5 - iter 81/275 - loss 0.07130247 - time (sec): 3.52 - samples/sec: 2028.98 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:56:29,701 epoch 5 - iter 108/275 - loss
0.06222362 - time (sec): 4.77 - samples/sec: 1983.43 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:56:30,900 epoch 5 - iter 135/275 - loss 0.05801080 - time (sec): 5.97 - samples/sec: 1950.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:56:32,127 epoch 5 - iter 162/275 - loss 0.06186701 - time (sec): 7.20 - samples/sec: 1930.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:56:33,334 epoch 5 - iter 189/275 - loss 0.06133609 - time (sec): 8.41 - samples/sec: 1905.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:56:34,603 epoch 5 - iter 216/275 - loss 0.06357754 - time (sec): 9.67 - samples/sec: 1875.19 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:56:35,801 epoch 5 - iter 243/275 - loss 0.06492384 - time (sec): 10.87 - samples/sec: 1851.30 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:56:37,044 epoch 5 - iter 270/275 - loss 0.06238790 - time (sec): 12.12 - samples/sec: 1840.80 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:56:37,272 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:37,272 EPOCH 5 done: loss 0.0611 - lr: 0.000017
2023-10-17 08:56:37,903 DEV : loss 0.17739932239055634 - f1-score (micro avg) 0.8606
2023-10-17 08:56:37,908 saving best model
2023-10-17 08:56:38,336 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:39,597 epoch 6 - iter 27/275 - loss 0.04004423 - time (sec): 1.26 - samples/sec: 1778.61 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:56:40,845 epoch 6 - iter 54/275 - loss 0.04725373 - time (sec): 2.51 - samples/sec: 1872.03 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:56:42,082 epoch 6 - iter 81/275 - loss 0.06302954 - time (sec): 3.74 - samples/sec: 1855.70 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:56:43,298 epoch 6 - iter 108/275 - loss 0.06106043 - time (sec): 4.96 - samples/sec: 1816.09 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:56:44,594 epoch 6 - iter
135/275 - loss 0.05511360 - time (sec): 6.25 - samples/sec: 1782.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:56:45,855 epoch 6 - iter 162/275 - loss 0.05191944 - time (sec): 7.52 - samples/sec: 1794.09 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:56:47,123 epoch 6 - iter 189/275 - loss 0.05519380 - time (sec): 8.78 - samples/sec: 1807.71 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:56:48,340 epoch 6 - iter 216/275 - loss 0.05494369 - time (sec): 10.00 - samples/sec: 1810.20 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:56:49,567 epoch 6 - iter 243/275 - loss 0.05671698 - time (sec): 11.23 - samples/sec: 1798.93 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:56:50,807 epoch 6 - iter 270/275 - loss 0.05658233 - time (sec): 12.47 - samples/sec: 1796.82 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:56:51,041 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:51,041 EPOCH 6 done: loss 0.0560 - lr: 0.000013
2023-10-17 08:56:51,731 DEV : loss 0.17116226255893707 - f1-score (micro avg) 0.872
2023-10-17 08:56:51,737 saving best model
2023-10-17 08:56:52,179 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:53,481 epoch 7 - iter 27/275 - loss 0.01671396 - time (sec): 1.30 - samples/sec: 1707.70 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:56:54,727 epoch 7 - iter 54/275 - loss 0.04459572 - time (sec): 2.54 - samples/sec: 1730.73 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:56:55,952 epoch 7 - iter 81/275 - loss 0.03874362 - time (sec): 3.77 - samples/sec: 1786.92 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:56:57,214 epoch 7 - iter 108/275 - loss 0.03008703 - time (sec): 5.03 - samples/sec: 1770.20 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:56:58,436 epoch 7 - iter 135/275 - loss 0.03348542 - time (sec): 6.25 - samples/sec: 1778.93 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:56:59,703
epoch 7 - iter 162/275 - loss 0.03566052 - time (sec): 7.52 - samples/sec: 1776.73 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:57:00,947 epoch 7 - iter 189/275 - loss 0.03937889 - time (sec): 8.76 - samples/sec: 1767.34 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:57:02,194 epoch 7 - iter 216/275 - loss 0.03932326 - time (sec): 10.01 - samples/sec: 1782.63 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:57:03,474 epoch 7 - iter 243/275 - loss 0.04221997 - time (sec): 11.29 - samples/sec: 1791.61 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:57:04,785 epoch 7 - iter 270/275 - loss 0.04039839 - time (sec): 12.60 - samples/sec: 1773.53 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:57:05,024 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:05,024 EPOCH 7 done: loss 0.0397 - lr: 0.000010
2023-10-17 08:57:05,705 DEV : loss 0.17413972318172455 - f1-score (micro avg) 0.8825
2023-10-17 08:57:05,710 saving best model
2023-10-17 08:57:06,158 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:07,321 epoch 8 - iter 27/275 - loss 0.01204691 - time (sec): 1.16 - samples/sec: 2008.77 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:57:08,566 epoch 8 - iter 54/275 - loss 0.01133158 - time (sec): 2.41 - samples/sec: 1972.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:57:09,792 epoch 8 - iter 81/275 - loss 0.01608520 - time (sec): 3.63 - samples/sec: 1879.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:57:11,036 epoch 8 - iter 108/275 - loss 0.03281090 - time (sec): 4.88 - samples/sec: 1833.32 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:57:12,273 epoch 8 - iter 135/275 - loss 0.03200607 - time (sec): 6.11 - samples/sec: 1820.96 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:57:13,534 epoch 8 - iter 162/275 - loss 0.03240100 - time (sec): 7.37 - samples/sec: 1822.18 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:57:14,739 epoch 8 - iter 189/275 - loss 0.03559538 - time (sec): 8.58 - samples/sec: 1817.06 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:57:15,996 epoch 8 - iter 216/275 - loss 0.03199163 - time (sec): 9.84 - samples/sec: 1808.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:57:17,215 epoch 8 - iter 243/275 - loss 0.03381439 - time (sec): 11.06 - samples/sec: 1829.88 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:57:18,457 epoch 8 - iter 270/275 - loss 0.03197433 - time (sec): 12.30 - samples/sec: 1829.80 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:57:18,698 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:18,698 EPOCH 8 done: loss 0.0316 - lr: 0.000007
2023-10-17 08:57:19,373 DEV : loss 0.1771039515733719 - f1-score (micro avg) 0.8798
2023-10-17 08:57:19,377 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:20,628 epoch 9 - iter 27/275 - loss 0.02813971 - time (sec): 1.25 - samples/sec: 1840.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:57:21,895 epoch 9 - iter 54/275 - loss 0.02635667 - time (sec): 2.52 - samples/sec: 1763.63 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:57:23,118 epoch 9 - iter 81/275 - loss 0.02178628 - time (sec): 3.74 - samples/sec: 1761.85 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:57:24,421 epoch 9 - iter 108/275 - loss 0.02832111 - time (sec): 5.04 - samples/sec: 1754.85 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:57:25,687 epoch 9 - iter 135/275 - loss 0.02973470 - time (sec): 6.31 - samples/sec: 1746.75 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:57:26,905 epoch 9 - iter 162/275 - loss 0.02813329 - time (sec): 7.53 - samples/sec: 1750.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:57:28,193 epoch 9 - iter 189/275 - loss 0.02776991 - time (sec): 8.81 - samples/sec: 1774.05 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:57:29,485
epoch 9 - iter 216/275 - loss 0.02590608 - time (sec): 10.11 - samples/sec: 1794.45 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:57:30,729 epoch 9 - iter 243/275 - loss 0.02693269 - time (sec): 11.35 - samples/sec: 1801.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:57:31,984 epoch 9 - iter 270/275 - loss 0.02645928 - time (sec): 12.61 - samples/sec: 1776.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:57:32,203 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:32,203 EPOCH 9 done: loss 0.0260 - lr: 0.000003
2023-10-17 08:57:32,905 DEV : loss 0.18786108493804932 - f1-score (micro avg) 0.8727
2023-10-17 08:57:32,912 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:34,209 epoch 10 - iter 27/275 - loss 0.04963864 - time (sec): 1.30 - samples/sec: 1549.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:57:35,478 epoch 10 - iter 54/275 - loss 0.03193970 - time (sec): 2.56 - samples/sec: 1585.53 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:57:36,731 epoch 10 - iter 81/275 - loss 0.02484947 - time (sec): 3.82 - samples/sec: 1667.01 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:57:37,980 epoch 10 - iter 108/275 - loss 0.02099055 - time (sec): 5.07 - samples/sec: 1727.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:57:39,197 epoch 10 - iter 135/275 - loss 0.01954446 - time (sec): 6.28 - samples/sec: 1760.35 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:57:40,447 epoch 10 - iter 162/275 - loss 0.02556949 - time (sec): 7.53 - samples/sec: 1792.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:57:41,687 epoch 10 - iter 189/275 - loss 0.02406705 - time (sec): 8.77 - samples/sec: 1795.13 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:57:42,909 epoch 10 - iter 216/275 - loss 0.02111237 - time (sec): 10.00 - samples/sec: 1799.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:57:44,142 epoch 10 - iter
243/275 - loss 0.02045125 - time (sec): 11.23 - samples/sec: 1803.94 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:57:45,367 epoch 10 - iter 270/275 - loss 0.02020873 - time (sec): 12.45 - samples/sec: 1803.97 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:57:45,591 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:45,591 EPOCH 10 done: loss 0.0199 - lr: 0.000000
2023-10-17 08:57:46,243 DEV : loss 0.18549132347106934 - f1-score (micro avg) 0.8671
2023-10-17 08:57:46,603 ----------------------------------------------------------------------------------------------------
2023-10-17 08:57:46,604 Loading model from best epoch ...
2023-10-17 08:57:47,971 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:57:48,758 Results:
- F-score (micro) 0.9074
- F-score (macro) 0.6766
- Accuracy 0.8345

By class:
              precision    recall  f1-score   support

       scope     0.8750    0.9148    0.8944       176
        pers     0.9685    0.9609    0.9647       128
        work     0.8630    0.8514    0.8571        74
         loc     1.0000    0.5000    0.6667         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.9039    0.9110    0.9074       382
   macro avg     0.7413    0.6454    0.6766       382
weighted avg     0.9001    0.9110    0.9049       382

2023-10-17 08:57:48,758 ----------------------------------------------------------------------------------------------------
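The learning-rate trajectory in the log (ramping from 0.000003 up to 0.000030 over the first epoch, then decaying linearly to zero) is what the `LinearScheduler | warmup_fraction: '0.1'` plugin produces. As a minimal sketch, assuming a peak of 3e-05 and 275 iterations × 10 epochs = 2750 total steps (taken from the log), the schedule can be reproduced with plain Python:

```python
def linear_lr(step: int, peak_lr: float = 3e-05,
              total_steps: int = 2750, warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0 (sketch of LinearScheduler)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 275 steps = one epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Compare against the logged values, e.g. epoch 1 iter 27 -> lr: 0.000003
# and epoch 1 iter 270 -> lr: 0.000029.
```

The function name `linear_lr` is hypothetical; the actual plugin adjusts the optimizer's learning rate after every batch.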
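The summary rows of the final classification report follow from the per-class rows: macro F1 is the unweighted mean of the per-class F1 scores (dragged down to 0.6766 by the two-instance `loc` and `object` classes), while micro F1 is the harmonic mean of the pooled precision and recall. A quick check using the rounded values from the table:

```python
# Per-class (precision, recall) and support, copied from the report above.
per_class = {
    "scope":  (0.8750, 0.9148, 176),
    "pers":   (0.9685, 0.9609, 128),
    "work":   (0.8630, 0.8514, 74),
    "loc":    (1.0000, 0.5000, 2),
    "object": (0.0000, 0.0000, 2),
}

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Macro F1: unweighted mean over classes.
macro_f1 = sum(f1(p, r) for p, r, _ in per_class.values()) / len(per_class)

# Micro F1: from the pooled precision/recall row (0.9039 / 0.9110).
micro_f1 = f1(0.9039, 0.9110)
```

Rounding to four places recovers the reported 0.6766 (macro) and 0.9074 (micro).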