2023-10-17 17:18:02,748 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Train: 5777 sentences
2023-10-17 17:18:02,749 (train_with_dev=False, train_with_test=False)
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Training Params:
2023-10-17 17:18:02,749 - learning_rate: "3e-05"
2023-10-17 17:18:02,749 - mini_batch_size: "8"
2023-10-17 17:18:02,749 - max_epochs: "10"
2023-10-17 17:18:02,749 - shuffle: "True"
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Plugins:
2023-10-17 17:18:02,749 - TensorboardLogger
2023-10-17 17:18:02,749 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:18:02,750 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,750 Computation:
2023-10-17 17:18:02,750 - compute on device: cuda:0
2023-10-17 17:18:02,750 - embedding storage: none
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,750 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,750 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:18:07,968 epoch 1 - iter 72/723 - loss 2.92821673 - time (sec): 5.22 - samples/sec: 3292.93 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:18:13,496 epoch 1 - iter 144/723 - loss 1.89319940 - time (sec): 10.75 - samples/sec: 3167.75 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:18:18,433 epoch 1 - iter 216/723 - loss 1.32058159 - time (sec): 15.68 - samples/sec: 3297.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:18:23,739 epoch 1 - iter 288/723 - loss 1.03686421 - time (sec): 20.99 - samples/sec: 3311.38 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:18:29,178 epoch 1 - iter 360/723 - loss 0.84817600 - time (sec): 26.43 - samples/sec: 3337.99 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:18:34,423 epoch 1 - iter 432/723 - loss 0.72455943 - time (sec): 31.67 - samples/sec: 3362.64 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:18:39,508 epoch 1 - iter 504/723 - loss 0.63851899 - time (sec): 36.76 - samples/sec: 3367.76 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:18:44,724 epoch 1 - iter 576/723 - loss 0.57388505 - time (sec): 41.97 - samples/sec: 3368.99 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:18:49,488 epoch 1 - iter 648/723 - loss 0.52952118 - time (sec): 46.74 - samples/sec: 3359.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:18:54,674 epoch 1 - iter 720/723 - loss 0.48526860 - time (sec): 51.92 - samples/sec: 3378.76 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:18:54,959 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:54,959 EPOCH 1 done: loss 0.4834 - lr: 0.000030
2023-10-17 17:18:58,198 DEV : loss 0.09162620455026627 - f1-score (micro avg) 0.772
2023-10-17 17:18:58,220 saving best model
2023-10-17 17:18:58,619 ----------------------------------------------------------------------------------------------------
2023-10-17 17:19:03,898 epoch 2 - iter 72/723 - loss 0.08263348 - time (sec): 5.28 - samples/sec: 3288.51 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:19:09,170 epoch 2 - iter 144/723 - loss 0.09517055 - time (sec): 10.55 - samples/sec: 3297.74 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:19:14,153 epoch 2 - iter 216/723 - loss 0.09988608 - time (sec): 15.53 - samples/sec: 3333.60 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:19:19,327 epoch 2 - iter 288/723 - loss 0.09627162 - time (sec): 20.71 - samples/sec: 3336.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:19:24,981 epoch 2 - iter 360/723 - loss 0.09485556 - time (sec): 26.36 - samples/sec: 3334.24 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:19:31,183 epoch 2 - iter 432/723 - loss 0.09259966 - time (sec): 32.56 - samples/sec: 3304.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:19:36,309 epoch 2 - iter 504/723 - loss 0.09012823 - time (sec): 37.69 - samples/sec: 3323.07 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:19:41,244 epoch 2 - iter 576/723 - loss 0.08785448 - time (sec): 42.62 - samples/sec: 3334.29 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:19:46,438 epoch 2 - iter 648/723 - loss 0.08710096 - time (sec): 47.82 - samples/sec: 3318.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:19:51,687 epoch 2 - iter 720/723 - loss 0.08553578 - time (sec): 53.07 - samples/sec: 3307.76 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:19:51,866 ----------------------------------------------------------------------------------------------------
2023-10-17 17:19:51,867 EPOCH 2 done: loss 0.0855 - lr: 0.000027
2023-10-17 17:19:55,303 DEV : loss 0.07994066923856735 - f1-score (micro avg) 0.8096
2023-10-17 17:19:55,325 saving best model
2023-10-17 17:19:56,010 ----------------------------------------------------------------------------------------------------
2023-10-17 17:20:01,368 epoch 3 - iter 72/723 - loss 0.07121149 - time (sec): 5.36 - samples/sec: 3245.54 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:20:07,264 epoch 3 - iter 144/723 - loss 0.06286836 - time (sec): 11.25 - samples/sec: 3188.58 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:20:12,469 epoch 3 - iter 216/723 - loss 0.06127686 - time (sec): 16.46 - samples/sec: 3295.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:20:17,584 epoch 3 - iter 288/723 - loss 0.05764282 - time (sec): 21.57 - samples/sec: 3336.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:20:22,398 epoch 3 - iter 360/723 - loss 0.05703678 - time (sec): 26.39 - samples/sec: 3351.17 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:20:27,635 epoch 3 - iter 432/723 - loss 0.05805135 - time (sec): 31.62 - samples/sec: 3369.77 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:20:33,178 epoch 3 - iter 504/723 - loss 0.05996212 - time (sec): 37.17 - samples/sec: 3349.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:20:38,081 epoch 3 - iter 576/723 - loss 0.05992057 - time (sec): 42.07 - samples/sec: 3355.15 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:20:43,349 epoch 3 - iter 648/723 - loss 0.05949413 - time (sec): 47.34 - samples/sec: 3343.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:20:48,602 epoch 3 - iter 720/723 - loss 0.05955426 - time (sec): 52.59 - samples/sec: 3345.25 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:20:48,758 ----------------------------------------------------------------------------------------------------
2023-10-17 17:20:48,758 EPOCH 3 done: loss 0.0595 - lr: 0.000023
2023-10-17 17:20:52,178 DEV : loss 0.058894336223602295 - f1-score (micro avg) 0.8744
2023-10-17 17:20:52,208 saving best model
2023-10-17 17:20:52,731 ----------------------------------------------------------------------------------------------------
2023-10-17 17:20:57,764 epoch 4 - iter 72/723 - loss 0.04355367 - time (sec): 5.03 - samples/sec: 3320.09 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:21:03,407 epoch 4 - iter 144/723 - loss 0.04290886 - time (sec): 10.67 - samples/sec: 3246.22 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:21:08,387 epoch 4 - iter 216/723 - loss 0.04393652 - time (sec): 15.65 - samples/sec: 3306.63 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:21:13,715 epoch 4 - iter 288/723 - loss 0.04557269 - time (sec): 20.98 - samples/sec: 3300.10 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:21:18,664 epoch 4 - iter 360/723 - loss 0.04428411 - time (sec): 25.93 - samples/sec: 3328.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:21:23,819 epoch 4 - iter 432/723 - loss 0.04366172 - time (sec): 31.08 - samples/sec: 3342.24 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:21:29,071 epoch 4 - iter 504/723 - loss 0.04336719 - time (sec): 36.34 - samples/sec: 3350.79 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:21:34,728 epoch 4 - iter 576/723 - loss 0.04464133 - time (sec): 41.99 - samples/sec: 3357.04 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:21:39,750 epoch 4 - iter 648/723 - loss 0.04399677 - time (sec): 47.01 - samples/sec: 3359.57 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:21:44,913 epoch 4 - iter 720/723 - loss 0.04409540 - time (sec): 52.18 - samples/sec: 3368.25 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:21:45,083 ----------------------------------------------------------------------------------------------------
2023-10-17 17:21:45,084 EPOCH 4 done: loss 0.0441 - lr: 0.000020
2023-10-17 17:21:48,784 DEV : loss 0.05959029123187065 - f1-score (micro avg) 0.8725
2023-10-17 17:21:48,804 ----------------------------------------------------------------------------------------------------
2023-10-17 17:21:54,022 epoch 5 - iter 72/723 - loss 0.02878262 - time (sec): 5.22 - samples/sec: 3441.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:21:58,914 epoch 5 - iter 144/723 - loss 0.02578122 - time (sec): 10.11 - samples/sec: 3464.18 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:22:04,106 epoch 5 - iter 216/723 - loss 0.03035057 - time (sec): 15.30 - samples/sec: 3464.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:22:09,621 epoch 5 - iter 288/723 - loss 0.02847141 - time (sec): 20.82 - samples/sec: 3417.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:22:14,738 epoch 5 - iter 360/723 - loss 0.02912372 - time (sec): 25.93 - samples/sec: 3391.91 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:22:20,332 epoch 5 - iter 432/723 - loss 0.03092672 - time (sec): 31.53 - samples/sec: 3363.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:22:25,826 epoch 5 - iter 504/723 - loss 0.03207944 - time (sec): 37.02 - samples/sec: 3349.99 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:22:30,593 epoch 5 - iter 576/723 - loss 0.03267365 - time (sec): 41.79 - samples/sec: 3369.49 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:22:35,527 epoch 5 - iter 648/723 - loss 0.03353305 - time (sec): 46.72 - samples/sec: 3379.33 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:22:40,601 epoch 5 - iter 720/723 - loss 0.03284765 - time (sec): 51.80 - samples/sec: 3391.87 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:22:40,767 ----------------------------------------------------------------------------------------------------
2023-10-17 17:22:40,767 EPOCH 5 done: loss 0.0328 - lr: 0.000017
2023-10-17 17:22:44,271 DEV : loss 0.0892329216003418 - f1-score (micro avg) 0.8573
2023-10-17 17:22:44,302 ----------------------------------------------------------------------------------------------------
2023-10-17 17:22:49,647 epoch 6 - iter 72/723 - loss 0.02078891 - time (sec): 5.34 - samples/sec: 3508.17 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:22:55,042 epoch 6 - iter 144/723 - loss 0.01967175 - time (sec): 10.74 - samples/sec: 3329.42 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:23:00,622 epoch 6 - iter 216/723 - loss 0.02336652 - time (sec): 16.32 - samples/sec: 3328.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:23:06,281 epoch 6 - iter 288/723 - loss 0.02495221 - time (sec): 21.98 - samples/sec: 3302.47 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:23:11,871 epoch 6 - iter 360/723 - loss 0.02541744 - time (sec): 27.57 - samples/sec: 3251.44 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:23:17,063 epoch 6 - iter 432/723 - loss 0.02514231 - time (sec): 32.76 - samples/sec: 3287.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:23:22,077 epoch 6 - iter 504/723 - loss 0.02636371 - time (sec): 37.77 - samples/sec: 3294.29 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:23:27,176 epoch 6 - iter 576/723 - loss 0.02636581 - time (sec): 42.87 - samples/sec: 3312.52 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:23:32,197 epoch 6 - iter 648/723 - loss 0.02508128 - time (sec): 47.89 - samples/sec: 3312.20 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:23:37,126 epoch 6 - iter 720/723 - loss 0.02535939 - time (sec): 52.82 - samples/sec: 3327.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:23:37,304 ----------------------------------------------------------------------------------------------------
2023-10-17 17:23:37,305 EPOCH 6 done: loss 0.0254 - lr: 0.000013
2023-10-17 17:23:40,588 DEV : loss 0.10650834441184998 - f1-score (micro avg) 0.8453
2023-10-17 17:23:40,620 ----------------------------------------------------------------------------------------------------
2023-10-17 17:23:45,947 epoch 7 - iter 72/723 - loss 0.01512900 - time (sec): 5.33 - samples/sec: 3225.11 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:23:51,221 epoch 7 - iter 144/723 - loss 0.01994325 - time (sec): 10.60 - samples/sec: 3255.50 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:23:56,636 epoch 7 - iter 216/723 - loss 0.01985443 - time (sec): 16.01 - samples/sec: 3264.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:24:01,860 epoch 7 - iter 288/723 - loss 0.02211769 - time (sec): 21.24 - samples/sec: 3315.05 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:24:07,334 epoch 7 - iter 360/723 - loss 0.02057789 - time (sec): 26.71 - samples/sec: 3281.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:24:12,648 epoch 7 - iter 432/723 - loss 0.02150567 - time (sec): 32.03 - samples/sec: 3313.55 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:24:17,871 epoch 7 - iter 504/723 - loss 0.02078248 - time (sec): 37.25 - samples/sec: 3339.91 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:24:23,204 epoch 7 - iter 576/723 - loss 0.02009060 - time (sec): 42.58 - samples/sec: 3324.99 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:24:28,361 epoch 7 - iter 648/723 - loss 0.01899026 - time (sec): 47.74 - samples/sec: 3315.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:24:33,718 epoch 7 - iter 720/723 - loss 0.01860087 - time (sec): 53.10 - samples/sec: 3302.91 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:24:34,113 ----------------------------------------------------------------------------------------------------
2023-10-17 17:24:34,113 EPOCH 7 done: loss 0.0186 - lr: 0.000010
2023-10-17 17:24:38,254 DEV : loss 0.12094509601593018 - f1-score (micro avg) 0.8635
2023-10-17 17:24:38,277 ----------------------------------------------------------------------------------------------------
2023-10-17 17:24:43,290 epoch 8 - iter 72/723 - loss 0.01674001 - time (sec): 5.01 - samples/sec: 3332.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:24:48,609 epoch 8 - iter 144/723 - loss 0.01250439 - time (sec): 10.33 - samples/sec: 3286.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:24:53,757 epoch 8 - iter 216/723 - loss 0.01587244 - time (sec): 15.48 - samples/sec: 3308.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:24:59,031 epoch 8 - iter 288/723 - loss 0.01474739 - time (sec): 20.75 - samples/sec: 3322.17 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:25:04,515 epoch 8 - iter 360/723 - loss 0.01501380 - time (sec): 26.24 - samples/sec: 3299.67 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:25:09,679 epoch 8 - iter 432/723 - loss 0.01426960 - time (sec): 31.40 - samples/sec: 3309.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:25:15,004 epoch 8 - iter 504/723 - loss 0.01444987 - time (sec): 36.73 - samples/sec: 3318.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:25:20,149 epoch 8 - iter 576/723 - loss 0.01498026 - time (sec): 41.87 - samples/sec: 3332.63 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:25:25,819 epoch 8 - iter 648/723 - loss 0.01446914 - time (sec): 47.54 - samples/sec: 3344.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:25:30,904 epoch 8 - iter 720/723 - loss 0.01414308 - time (sec): 52.63 - samples/sec: 3337.05 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:25:31,090 ----------------------------------------------------------------------------------------------------
2023-10-17 17:25:31,090 EPOCH 8 done: loss 0.0142 - lr: 0.000007
2023-10-17 17:25:34,478 DEV : loss 0.11182001233100891 - f1-score (micro avg) 0.8732
2023-10-17 17:25:34,498 ----------------------------------------------------------------------------------------------------
2023-10-17 17:25:39,684 epoch 9 - iter 72/723 - loss 0.00674938 - time (sec): 5.18 - samples/sec: 3305.14 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:25:44,991 epoch 9 - iter 144/723 - loss 0.00755681 - time (sec): 10.49 - samples/sec: 3337.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:25:50,324 epoch 9 - iter 216/723 - loss 0.00853434 - time (sec): 15.82 - samples/sec: 3317.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:25:55,818 epoch 9 - iter 288/723 - loss 0.00885931 - time (sec): 21.32 - samples/sec: 3326.28 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:26:01,443 epoch 9 - iter 360/723 - loss 0.00947749 - time (sec): 26.94 - samples/sec: 3312.93 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:26:07,110 epoch 9 - iter 432/723 - loss 0.01029660 - time (sec): 32.61 - samples/sec: 3276.34 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:26:11,975 epoch 9 - iter 504/723 - loss 0.00971519 - time (sec): 37.48 - samples/sec: 3293.67 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:26:16,712 epoch 9 - iter 576/723 - loss 0.00913791 - time (sec): 42.21 - samples/sec: 3306.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:26:21,888 epoch 9 - iter 648/723 - loss 0.00920760 - time (sec): 47.39 - samples/sec: 3334.43 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:26:27,223 epoch 9 - iter 720/723 - loss 0.00937031 - time (sec): 52.72 - samples/sec: 3331.95 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:26:27,395 ----------------------------------------------------------------------------------------------------
2023-10-17 17:26:27,395 EPOCH 9 done: loss 0.0093 - lr: 0.000003
2023-10-17 17:26:30,608 DEV : loss 0.1297394335269928 - f1-score (micro avg) 0.8725
2023-10-17 17:26:30,624 ----------------------------------------------------------------------------------------------------
2023-10-17 17:26:35,888 epoch 10 - iter 72/723 - loss 0.01223344 - time (sec): 5.26 - samples/sec: 3424.14 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:26:40,837 epoch 10 - iter 144/723 - loss 0.00842767 - time (sec): 10.21 - samples/sec: 3401.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:26:46,403 epoch 10 - iter 216/723 - loss 0.00985753 - time (sec): 15.78 - samples/sec: 3366.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:26:51,833 epoch 10 - iter 288/723 - loss 0.00861439 - time (sec): 21.21 - samples/sec: 3356.16 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:26:56,978 epoch 10 - iter 360/723 - loss 0.00751018 - time (sec): 26.35 - samples/sec: 3368.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:27:02,349 epoch 10 - iter 432/723 - loss 0.00721843 - time (sec): 31.72 - samples/sec: 3352.43 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:27:07,627 epoch 10 - iter 504/723 - loss 0.00709729 - time (sec): 37.00 - samples/sec: 3336.75 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:27:13,126 epoch 10 - iter 576/723 - loss 0.00682846 - time (sec): 42.50 - samples/sec: 3317.73 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:27:18,630 epoch 10 - iter 648/723 - loss 0.00690953 - time (sec): 48.00 - samples/sec: 3309.77 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:27:23,834 epoch 10 - iter 720/723 - loss 0.00716323 - time (sec): 53.21 - samples/sec: 3304.82 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:27:23,986 ----------------------------------------------------------------------------------------------------
2023-10-17 17:27:23,986 EPOCH 10 done: loss 0.0071 - lr: 0.000000
2023-10-17 17:27:27,783 DEV : loss 0.12998518347740173 - f1-score (micro avg) 0.8688
2023-10-17 17:27:28,161 ----------------------------------------------------------------------------------------------------
2023-10-17 17:27:28,162 Loading model from best epoch ...
2023-10-17 17:27:29,632 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 17:27:32,734 Results:
- F-score (micro) 0.8457
- F-score (macro) 0.7463
- Accuracy 0.7446

By class:
              precision    recall  f1-score   support

         PER     0.7904    0.8921    0.8382       482
         LOC     0.9222    0.8799    0.9006       458
         ORG     0.5882    0.4348    0.5000        69

   micro avg     0.8362    0.8553    0.8457      1009
   macro avg     0.7670    0.7356    0.7463      1009
weighted avg     0.8364    0.8553    0.8434      1009

2023-10-17 17:27:32,735 ----------------------------------------------------------------------------------------------------
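
The aggregate rows of the "By class" table can be cross-checked against the per-class rows. The sketch below is not part of the run and is not how Flair computes its report; it only assumes the usual definitions (macro average = unweighted mean of per-class F1, weighted average = support-weighted mean, micro F1 = harmonic mean of micro precision and recall). All variable names are illustrative, and because the inputs are the 4-decimal values printed in the log, the results agree with the logged figures only up to rounding.

```python
# Per-class rows copied from the "By class" table above:
# (precision, recall, f1-score, support)
per_class = {
    "PER": (0.7904, 0.8921, 0.8382, 482),
    "LOC": (0.9222, 0.8799, 0.9006, 458),
    "ORG": (0.5882, 0.4348, 0.5000, 69),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(support for _, _, _, support in per_class.values())

# Macro average: every class counts equally, so the weak ORG class pulls it down.
macro_f1 = sum(score for _, _, score, _ in per_class.values()) / len(per_class)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = (
    sum(score * support for _, _, score, support in per_class.values())
    / total_support
)

# Micro F1, recomputed from the logged micro-average precision and recall.
micro_f1 = f1(0.8362, 0.8553)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The gap between the micro and macro scores comes almost entirely from ORG, which has only 69 of the 1009 test entities and an F1 of 0.5000.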