2023-10-17 11:57:40,102 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,104 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,105 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,105 Train: 6183 sentences
2023-10-17 11:57:40,105 (train_with_dev=False, train_with_test=False)
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,105 Training Params:
2023-10-17 11:57:40,105 - learning_rate: "3e-05"
2023-10-17 11:57:40,105 - mini_batch_size: "8"
2023-10-17 11:57:40,105 - max_epochs: "10"
2023-10-17 11:57:40,105 - shuffle: "True"
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,105 Plugins:
2023-10-17 11:57:40,106 - TensorboardLogger
2023-10-17 11:57:40,106 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:57:40,106 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 Computation:
2023-10-17 11:57:40,106 - compute on device: cuda:0
2023-10-17 11:57:40,106 - embedding storage: none
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 11:57:47,156 epoch 1 - iter 77/773 - loss 2.62934140 - time (sec): 7.05 - samples/sec: 1746.12 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:57:54,316 epoch 1 - iter 154/773 - loss 1.60891642 - time (sec): 14.21 - samples/sec: 1734.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:58:01,542 epoch 1 - iter 231/773 - loss 1.12039514 - time (sec): 21.43 - samples/sec: 1746.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:58:08,974 epoch 1 - iter 308/773 - loss 0.86475702 - time (sec): 28.87 - samples/sec: 1736.86 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:58:16,969 epoch 1 - iter 385/773 - loss 0.71122671 - time (sec): 36.86 - samples/sec: 1696.64 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:58:24,770 epoch 1 - iter 462/773 - loss 0.62105387 - time (sec): 44.66 - samples/sec: 1659.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:58:31,943 epoch 1 - iter 539/773 - loss 0.55330225 - time (sec): 51.83 - samples/sec: 1650.46 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:58:38,992 epoch 1 - iter 616/773 - loss 0.49625437 - time (sec): 58.88 - samples/sec: 1666.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:58:47,296 epoch 1 - iter 693/773 - loss 0.44972124 - time (sec): 67.19 - samples/sec: 1655.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:58:54,547 epoch 1 - iter 770/773 - loss 0.41433052 - time (sec): 74.44 - samples/sec: 1661.92 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:58:54,816 ----------------------------------------------------------------------------------------------------
2023-10-17 11:58:54,816 EPOCH 1 done: loss 0.4127 - lr: 0.000030
2023-10-17 11:58:57,267 DEV : loss 0.06118296831846237 - f1-score (micro avg) 0.758
2023-10-17 11:58:57,297 saving best model
2023-10-17 11:58:57,894 ----------------------------------------------------------------------------------------------------
2023-10-17 11:59:05,101 epoch 2 - iter 77/773 - loss 0.08295422 - time (sec): 7.20 - samples/sec: 1645.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:59:13,021 epoch 2 - iter 154/773 - loss 0.07699183 - time (sec): 15.12 - samples/sec: 1586.84 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:59:20,476 epoch 2 - iter 231/773 - loss 0.08084633 - time (sec): 22.58 - samples/sec: 1613.56 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:59:27,517 epoch 2 - iter 308/773 - loss 0.08317390 - time (sec): 29.62 - samples/sec: 1670.58 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:59:34,890 epoch 2 - iter 385/773 - loss 0.07884677 - time (sec): 36.99 - samples/sec: 1662.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:59:42,308 epoch 2 - iter 462/773 - loss 0.07771188 - time (sec): 44.41 - samples/sec: 1677.87 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:59:49,183 epoch 2 - iter 539/773 - loss 0.07602078 - time (sec): 51.29 - samples/sec: 1687.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:59:56,592 epoch 2 - iter 616/773 - loss 0.07570427 - time (sec): 58.70 - samples/sec: 1671.75 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:00:04,319 epoch 2 - iter 693/773 - loss 0.07508718 - time (sec): 66.42 - samples/sec: 1686.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:00:11,409 epoch 2 - iter 770/773 - loss 0.07433537 - time (sec): 73.51 - samples/sec: 1684.91 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:00:11,700 ----------------------------------------------------------------------------------------------------
2023-10-17 12:00:11,701 EPOCH 2 done: loss 0.0749 - lr: 0.000027
2023-10-17 12:00:14,996 DEV : loss 0.05837954208254814 - f1-score (micro avg) 0.7863
2023-10-17 12:00:15,033 saving best model
2023-10-17 12:00:16,970 ----------------------------------------------------------------------------------------------------
2023-10-17 12:00:24,464 epoch 3 - iter 77/773 - loss 0.04678876 - time (sec): 7.49 - samples/sec: 1585.33 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:00:31,494 epoch 3 - iter 154/773 - loss 0.04394092 - time (sec): 14.52 - samples/sec: 1587.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:00:38,728 epoch 3 - iter 231/773 - loss 0.04633664 - time (sec): 21.75 - samples/sec: 1633.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:00:46,112 epoch 3 - iter 308/773 - loss 0.05065715 - time (sec): 29.14 - samples/sec: 1663.43 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:00:53,391 epoch 3 - iter 385/773 - loss 0.04995056 - time (sec): 36.42 - samples/sec: 1682.30 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:01:00,605 epoch 3 - iter 462/773 - loss 0.04932353 - time (sec): 43.63 - samples/sec: 1689.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:01:08,272 epoch 3 - iter 539/773 - loss 0.04798072 - time (sec): 51.30 - samples/sec: 1679.29 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:01:16,193 epoch 3 - iter 616/773 - loss 0.04702234 - time (sec): 59.22 - samples/sec: 1664.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:01:23,550 epoch 3 - iter 693/773 - loss 0.04733206 - time (sec): 66.58 - samples/sec: 1670.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:01:30,536 epoch 3 - iter 770/773 - loss 0.04801480 - time (sec): 73.56 - samples/sec: 1684.20 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:01:30,791 ----------------------------------------------------------------------------------------------------
2023-10-17 12:01:30,791 EPOCH 3 done: loss 0.0480 - lr: 0.000023
2023-10-17 12:01:33,682 DEV : loss 0.06652045249938965 - f1-score (micro avg) 0.7692
2023-10-17 12:01:33,712 ----------------------------------------------------------------------------------------------------
2023-10-17 12:01:40,804 epoch 4 - iter 77/773 - loss 0.02865288 - time (sec): 7.09 - samples/sec: 1830.33 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:01:47,995 epoch 4 - iter 154/773 - loss 0.02826719 - time (sec): 14.28 - samples/sec: 1846.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:01:55,594 epoch 4 - iter 231/773 - loss 0.03000062 - time (sec): 21.88 - samples/sec: 1747.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:02:03,346 epoch 4 - iter 308/773 - loss 0.02857205 - time (sec): 29.63 - samples/sec: 1702.07 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:02:10,474 epoch 4 - iter 385/773 - loss 0.03006392 - time (sec): 36.76 - samples/sec: 1704.74 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:02:17,821 epoch 4 - iter 462/773 - loss 0.03064035 - time (sec): 44.11 - samples/sec: 1698.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:02:25,847 epoch 4 - iter 539/773 - loss 0.03032994 - time (sec): 52.13 - samples/sec: 1662.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:02:33,216 epoch 4 - iter 616/773 - loss 0.02951093 - time (sec): 59.50 - samples/sec: 1652.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:02:40,366 epoch 4 - iter 693/773 - loss 0.03013193 - time (sec): 66.65 - samples/sec: 1671.24 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:02:47,380 epoch 4 - iter 770/773 - loss 0.03053470 - time (sec): 73.67 - samples/sec: 1679.68 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:02:47,663 ----------------------------------------------------------------------------------------------------
2023-10-17 12:02:47,664 EPOCH 4 done: loss 0.0304 - lr: 0.000020
2023-10-17 12:02:50,759 DEV : loss 0.07929900288581848 - f1-score (micro avg) 0.7817
2023-10-17 12:02:50,793 ----------------------------------------------------------------------------------------------------
2023-10-17 12:02:57,890 epoch 5 - iter 77/773 - loss 0.02217696 - time (sec): 7.10 - samples/sec: 1726.68 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:03:05,723 epoch 5 - iter 154/773 - loss 0.01690662 - time (sec): 14.93 - samples/sec: 1715.44 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:03:12,550 epoch 5 - iter 231/773 - loss 0.02055394 - time (sec): 21.75 - samples/sec: 1703.10 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:03:19,440 epoch 5 - iter 308/773 - loss 0.02201265 - time (sec): 28.65 - samples/sec: 1705.46 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:03:26,685 epoch 5 - iter 385/773 - loss 0.02194969 - time (sec): 35.89 - samples/sec: 1699.08 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:03:34,360 epoch 5 - iter 462/773 - loss 0.02236795 - time (sec): 43.57 - samples/sec: 1683.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:03:42,008 epoch 5 - iter 539/773 - loss 0.02204855 - time (sec): 51.21 - samples/sec: 1682.44 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:03:49,369 epoch 5 - iter 616/773 - loss 0.02258293 - time (sec): 58.57 - samples/sec: 1688.36 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:03:56,358 epoch 5 - iter 693/773 - loss 0.02279416 - time (sec): 65.56 - samples/sec: 1697.30 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:04:03,656 epoch 5 - iter 770/773 - loss 0.02288678 - time (sec): 72.86 - samples/sec: 1695.26 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:04:03,956 ----------------------------------------------------------------------------------------------------
2023-10-17 12:04:03,957 EPOCH 5 done: loss 0.0228 - lr: 0.000017
2023-10-17 12:04:06,846 DEV : loss 0.09690136462450027 - f1-score (micro avg) 0.8083
2023-10-17 12:04:06,878 saving best model
2023-10-17 12:04:08,284 ----------------------------------------------------------------------------------------------------
2023-10-17 12:04:15,535 epoch 6 - iter 77/773 - loss 0.01574627 - time (sec): 7.25 - samples/sec: 1634.93 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:04:22,680 epoch 6 - iter 154/773 - loss 0.01503167 - time (sec): 14.39 - samples/sec: 1606.59 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:04:30,417 epoch 6 - iter 231/773 - loss 0.01525419 - time (sec): 22.13 - samples/sec: 1619.30 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:04:37,501 epoch 6 - iter 308/773 - loss 0.01604884 - time (sec): 29.21 - samples/sec: 1681.45 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:04:44,519 epoch 6 - iter 385/773 - loss 0.01647546 - time (sec): 36.23 - samples/sec: 1699.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:04:51,629 epoch 6 - iter 462/773 - loss 0.01649118 - time (sec): 43.34 - samples/sec: 1703.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:04:58,910 epoch 6 - iter 539/773 - loss 0.01640551 - time (sec): 50.62 - samples/sec: 1708.22 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:05:06,068 epoch 6 - iter 616/773 - loss 0.01616345 - time (sec): 57.78 - samples/sec: 1709.61 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:05:12,936 epoch 6 - iter 693/773 - loss 0.01661672 - time (sec): 64.65 - samples/sec: 1723.18 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:05:19,659 epoch 6 - iter 770/773 - loss 0.01623904 - time (sec): 71.37 - samples/sec: 1733.95 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:05:19,954 ----------------------------------------------------------------------------------------------------
2023-10-17 12:05:19,954 EPOCH 6 done: loss 0.0162 - lr: 0.000013
2023-10-17 12:05:22,876 DEV : loss 0.11253025382757187 - f1-score (micro avg) 0.7865
2023-10-17 12:05:22,905 ----------------------------------------------------------------------------------------------------
2023-10-17 12:05:29,670 epoch 7 - iter 77/773 - loss 0.01369959 - time (sec): 6.76 - samples/sec: 1740.54 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:05:36,767 epoch 7 - iter 154/773 - loss 0.01009396 - time (sec): 13.86 - samples/sec: 1732.16 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:05:43,890 epoch 7 - iter 231/773 - loss 0.00886952 - time (sec): 20.98 - samples/sec: 1749.26 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:05:50,813 epoch 7 - iter 308/773 - loss 0.00877763 - time (sec): 27.91 - samples/sec: 1739.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:05:57,700 epoch 7 - iter 385/773 - loss 0.00929459 - time (sec): 34.79 - samples/sec: 1733.43 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:06:04,692 epoch 7 - iter 462/773 - loss 0.00883992 - time (sec): 41.79 - samples/sec: 1743.78 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:06:12,154 epoch 7 - iter 539/773 - loss 0.00966097 - time (sec): 49.25 - samples/sec: 1769.92 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:06:19,005 epoch 7 - iter 616/773 - loss 0.00980374 - time (sec): 56.10 - samples/sec: 1767.70 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:06:25,948 epoch 7 - iter 693/773 - loss 0.01034364 - time (sec): 63.04 - samples/sec: 1764.13 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:06:33,616 epoch 7 - iter 770/773 - loss 0.01133078 - time (sec): 70.71 - samples/sec: 1747.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:06:33,975 ----------------------------------------------------------------------------------------------------
2023-10-17 12:06:33,975 EPOCH 7 done: loss 0.0114 - lr: 0.000010
2023-10-17 12:06:37,067 DEV : loss 0.11950261145830154 - f1-score (micro avg) 0.795
2023-10-17 12:06:37,096 ----------------------------------------------------------------------------------------------------
2023-10-17 12:06:44,893 epoch 8 - iter 77/773 - loss 0.00625140 - time (sec): 7.79 - samples/sec: 1497.23 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:06:51,920 epoch 8 - iter 154/773 - loss 0.00852418 - time (sec): 14.82 - samples/sec: 1633.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:06:58,955 epoch 8 - iter 231/773 - loss 0.00806622 - time (sec): 21.86 - samples/sec: 1629.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:07:05,728 epoch 8 - iter 308/773 - loss 0.00853359 - time (sec): 28.63 - samples/sec: 1652.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:07:12,918 epoch 8 - iter 385/773 - loss 0.00786367 - time (sec): 35.82 - samples/sec: 1688.05 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:07:20,139 epoch 8 - iter 462/773 - loss 0.00702458 - time (sec): 43.04 - samples/sec: 1715.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:07:27,147 epoch 8 - iter 539/773 - loss 0.00704735 - time (sec): 50.05 - samples/sec: 1722.89 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:07:34,445 epoch 8 - iter 616/773 - loss 0.00717121 - time (sec): 57.35 - samples/sec: 1727.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:07:41,580 epoch 8 - iter 693/773 - loss 0.00713439 - time (sec): 64.48 - samples/sec: 1724.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:07:49,000 epoch 8 - iter 770/773 - loss 0.00703776 - time (sec): 71.90 - samples/sec: 1721.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:07:49,276 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:49,276 EPOCH 8 done: loss 0.0070 - lr: 0.000007
2023-10-17 12:07:52,369 DEV : loss 0.12206191569566727 - f1-score (micro avg) 0.7942
2023-10-17 12:07:52,400 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:59,464 epoch 9 - iter 77/773 - loss 0.00424893 - time (sec): 7.06 - samples/sec: 1783.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:08:06,976 epoch 9 - iter 154/773 - loss 0.00355699 - time (sec): 14.57 - samples/sec: 1795.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:08:13,824 epoch 9 - iter 231/773 - loss 0.00385022 - time (sec): 21.42 - samples/sec: 1774.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:08:20,704 epoch 9 - iter 308/773 - loss 0.00363732 - time (sec): 28.30 - samples/sec: 1791.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:08:27,824 epoch 9 - iter 385/773 - loss 0.00344109 - time (sec): 35.42 - samples/sec: 1778.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:08:34,742 epoch 9 - iter 462/773 - loss 0.00372679 - time (sec): 42.34 - samples/sec: 1761.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:08:41,980 epoch 9 - iter 539/773 - loss 0.00415038 - time (sec): 49.58 - samples/sec: 1752.63 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:08:49,436 epoch 9 - iter 616/773 - loss 0.00418043 - time (sec): 57.03 - samples/sec: 1746.10 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:08:56,848 epoch 9 - iter 693/773 - loss 0.00417050 - time (sec): 64.45 - samples/sec: 1747.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:09:04,074 epoch 9 - iter 770/773 - loss 0.00447844 - time (sec): 71.67 - samples/sec: 1726.86 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:09:04,352 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:04,353 EPOCH 9 done: loss 0.0045 - lr: 0.000003
2023-10-17 12:09:07,443 DEV : loss 0.13099931180477142 - f1-score (micro avg) 0.7967
2023-10-17 12:09:07,475 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:14,596 epoch 10 - iter 77/773 - loss 0.00280890 - time (sec): 7.12 - samples/sec: 1699.23 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:09:22,633 epoch 10 - iter 154/773 - loss 0.00194984 - time (sec): 15.16 - samples/sec: 1701.26 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:09:29,624 epoch 10 - iter 231/773 - loss 0.00196051 - time (sec): 22.15 - samples/sec: 1728.94 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:09:36,424 epoch 10 - iter 308/773 - loss 0.00257133 - time (sec): 28.95 - samples/sec: 1732.49 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:09:43,620 epoch 10 - iter 385/773 - loss 0.00314051 - time (sec): 36.14 - samples/sec: 1741.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:09:50,669 epoch 10 - iter 462/773 - loss 0.00334541 - time (sec): 43.19 - samples/sec: 1755.01 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:09:57,728 epoch 10 - iter 539/773 - loss 0.00324725 - time (sec): 50.25 - samples/sec: 1753.88 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:10:04,717 epoch 10 - iter 616/773 - loss 0.00353126 - time (sec): 57.24 - samples/sec: 1728.44 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:10:11,916 epoch 10 - iter 693/773 - loss 0.00325470 - time (sec): 64.44 - samples/sec: 1736.98 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:10:19,023 epoch 10 - iter 770/773 - loss 0.00330436 - time (sec): 71.55 - samples/sec: 1730.28 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:10:19,309 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:19,310 EPOCH 10 done: loss 0.0033 - lr: 0.000000
2023-10-17 12:10:22,291 DEV : loss 0.1352890431880951 - f1-score (micro avg) 0.7844
2023-10-17 12:10:22,951 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:22,953 Loading model from best epoch ...
2023-10-17 12:10:25,332 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 12:10:33,825 Results:
- F-score (micro) 0.8057
- F-score (macro) 0.7196
- Accuracy 0.6904

By class:
              precision    recall  f1-score   support

         LOC     0.8495    0.8414    0.8455       946
    BUILDING     0.6795    0.5730    0.6217       185
      STREET     0.7255    0.6607    0.6916        56

   micro avg     0.8208    0.7911    0.8057      1187
   macro avg     0.7515    0.6917    0.7196      1187
weighted avg     0.8172    0.7911    0.8033      1187

2023-10-17 12:10:33,825 ----------------------------------------------------------------------------------------------------
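
For reference, the configuration recorded in the log above (hmteams/teams-base-historic-multilingual-discriminator backbone, last transformer layer only, first-subtoken pooling, no CRF, mini-batch size 8, 10 epochs, peak learning rate 3e-05 with linear warmup over 10% of the steps) can be approximated with Flair's standard fine-tuning API. The sketch below is a reconstruction under those assumptions, not the original training script: the NER_HIPE_2022 loader keyword names, the hidden_size value, and the example sentence are illustrative.

    # Reconstruction sketch of the logged fine-tuning setup (assumptions noted above).
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # HIPE-2022 topres19th (English); loader keyword names are assumed.
    corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
    label_dictionary = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",               # "layers-1" in the base path: last layer only
        subtoken_pooling="first",  # "poolingfirst"
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,           # unused when use_rnn=False; value assumed
        embeddings=embeddings,
        tag_dictionary=label_dictionary,
        tag_type="ner",
        use_crf=False,             # "crfFalse": plain linear head + CrossEntropyLoss
        use_rnn=False,
        reproject_embeddings=False,
    )

    base_path = (
        "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-"
        "discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        base_path,
        learning_rate=3e-05,
        mini_batch_size=8,
        max_epochs=10,
    )

    # The saved best model can then be used for prediction (standard Flair usage):
    from flair.data import Sentence

    tagger = SequenceTagger.load(f"{base_path}/best-model.pt")
    sentence = Sentence("He lived near Westminster Bridge in London .")
    tagger.predict(sentence)
    for span in sentence.get_spans("ner"):
        print(span)

The LinearScheduler plugin with warmup_fraction '0.1' shown under Plugins matches the linear warmup-and-decay schedule that fine_tune() sets up by default in recent Flair releases, which is consistent with the learning rate ramping up during epoch 1 and decaying to 0 by epoch 10, and with the logged momentum staying at 0.000000 (an AdamW-style optimizer rather than SGD with momentum).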