2023-10-17 17:18:02,748 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
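[Note, not part of the original log] A minimal Flair sketch of how the corpus and the tagger printed above could be set up. The dataset class, its language argument, the backbone name and the pooling/layer settings are inferred from the log and the training base path further below, so treat them as assumptions rather than the exact training script:

from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Dutch split of the ICDAR-Europeana NER corpus (5777 train / 722 dev / 723 test sentences above)
corpus = NER_ICDAR_EUROPEANA(language="nl")

# 13-tag BIOES label dictionary for PER/LOC/ORG, as listed near the end of the log
label_dict = corpus.make_label_dictionary(label_type="ner", add_unk=False)

# hmTEAMS discriminator backbone; "poolingfirst" and "layers-1" are read off the base path
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# linear classification head without CRF ("crfFalse" in the base path), matching the module dump
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)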
2023-10-17 17:18:02,749 Train: 5777 sentences
2023-10-17 17:18:02,749 (train_with_dev=False, train_with_test=False)
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Training Params:
2023-10-17 17:18:02,749 - learning_rate: "3e-05"
2023-10-17 17:18:02,749 - mini_batch_size: "8"
2023-10-17 17:18:02,749 - max_epochs: "10"
2023-10-17 17:18:02,749 - shuffle: "True"
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Plugins:
2023-10-17 17:18:02,749 - TensorboardLogger
2023-10-17 17:18:02,749 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:18:02,749 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,749 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:18:02,750 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,750 Computation:
2023-10-17 17:18:02,750 - compute on device: cuda:0
2023-10-17 17:18:02,750 - embedding storage: none
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,750 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
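[Note, not part of the original log] Continuing the sketch above, the hyperparameters and schedule listed in this section (lr 3e-05, mini-batch size 8, 10 epochs, linear warmup fraction 0.1) roughly correspond to a Flair fine-tuning call like the following; the output path is a placeholder and the TensorBoard plugin wiring is omitted:

from flair.trainers import ModelTrainer

# corpus and tagger come from the sketch after the corpus section above
trainer = ModelTrainer(tagger, corpus)

# fine_tune applies a linear warmup/decay schedule, which corresponds to the
# "LinearScheduler | warmup_fraction: '0.1'" plugin line in this log
trainer.fine_tune(
    "hmbench-icdar/nl-hmteams-example",  # placeholder; the real base path is logged above
    learning_rate=3e-05,
    mini_batch_size=8,
    max_epochs=10,
)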
2023-10-17 17:18:02,750 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:02,750 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:18:07,968 epoch 1 - iter 72/723 - loss 2.92821673 - time (sec): 5.22 - samples/sec: 3292.93 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:18:13,496 epoch 1 - iter 144/723 - loss 1.89319940 - time (sec): 10.75 - samples/sec: 3167.75 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:18:18,433 epoch 1 - iter 216/723 - loss 1.32058159 - time (sec): 15.68 - samples/sec: 3297.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:18:23,739 epoch 1 - iter 288/723 - loss 1.03686421 - time (sec): 20.99 - samples/sec: 3311.38 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:18:29,178 epoch 1 - iter 360/723 - loss 0.84817600 - time (sec): 26.43 - samples/sec: 3337.99 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:18:34,423 epoch 1 - iter 432/723 - loss 0.72455943 - time (sec): 31.67 - samples/sec: 3362.64 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:18:39,508 epoch 1 - iter 504/723 - loss 0.63851899 - time (sec): 36.76 - samples/sec: 3367.76 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:18:44,724 epoch 1 - iter 576/723 - loss 0.57388505 - time (sec): 41.97 - samples/sec: 3368.99 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:18:49,488 epoch 1 - iter 648/723 - loss 0.52952118 - time (sec): 46.74 - samples/sec: 3359.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:18:54,674 epoch 1 - iter 720/723 - loss 0.48526860 - time (sec): 51.92 - samples/sec: 3378.76 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:18:54,959 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:54,959 EPOCH 1 done: loss 0.4834 - lr: 0.000030
2023-10-17 17:18:58,198 DEV : loss 0.09162620455026627 - f1-score (micro avg) 0.772
2023-10-17 17:18:58,220 saving best model
2023-10-17 17:18:58,619 ----------------------------------------------------------------------------------------------------
2023-10-17 17:19:03,898 epoch 2 - iter 72/723 - loss 0.08263348 - time (sec): 5.28 - samples/sec: 3288.51 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:19:09,170 epoch 2 - iter 144/723 - loss 0.09517055 - time (sec): 10.55 - samples/sec: 3297.74 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:19:14,153 epoch 2 - iter 216/723 - loss 0.09988608 - time (sec): 15.53 - samples/sec: 3333.60 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:19:19,327 epoch 2 - iter 288/723 - loss 0.09627162 - time (sec): 20.71 - samples/sec: 3336.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:19:24,981 epoch 2 - iter 360/723 - loss 0.09485556 - time (sec): 26.36 - samples/sec: 3334.24 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:19:31,183 epoch 2 - iter 432/723 - loss 0.09259966 - time (sec): 32.56 - samples/sec: 3304.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:19:36,309 epoch 2 - iter 504/723 - loss 0.09012823 - time (sec): 37.69 - samples/sec: 3323.07 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:19:41,244 epoch 2 - iter 576/723 - loss 0.08785448 - time (sec): 42.62 - samples/sec: 3334.29 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:19:46,438 epoch 2 - iter 648/723 - loss 0.08710096 - time (sec): 47.82 - samples/sec: 3318.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:19:51,687 epoch 2 - iter 720/723 - loss 0.08553578 - time (sec): 53.07 - samples/sec: 3307.76 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:19:51,866 ----------------------------------------------------------------------------------------------------
2023-10-17 17:19:51,867 EPOCH 2 done: loss 0.0855 - lr: 0.000027
2023-10-17 17:19:55,303 DEV : loss 0.07994066923856735 - f1-score (micro avg) 0.8096
2023-10-17 17:19:55,325 saving best model
2023-10-17 17:19:56,010 ----------------------------------------------------------------------------------------------------
2023-10-17 17:20:01,368 epoch 3 - iter 72/723 - loss 0.07121149 - time (sec): 5.36 - samples/sec: 3245.54 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:20:07,264 epoch 3 - iter 144/723 - loss 0.06286836 - time (sec): 11.25 - samples/sec: 3188.58 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:20:12,469 epoch 3 - iter 216/723 - loss 0.06127686 - time (sec): 16.46 - samples/sec: 3295.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:20:17,584 epoch 3 - iter 288/723 - loss 0.05764282 - time (sec): 21.57 - samples/sec: 3336.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:20:22,398 epoch 3 - iter 360/723 - loss 0.05703678 - time (sec): 26.39 - samples/sec: 3351.17 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:20:27,635 epoch 3 - iter 432/723 - loss 0.05805135 - time (sec): 31.62 - samples/sec: 3369.77 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:20:33,178 epoch 3 - iter 504/723 - loss 0.05996212 - time (sec): 37.17 - samples/sec: 3349.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:20:38,081 epoch 3 - iter 576/723 - loss 0.05992057 - time (sec): 42.07 - samples/sec: 3355.15 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:20:43,349 epoch 3 - iter 648/723 - loss 0.05949413 - time (sec): 47.34 - samples/sec: 3343.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:20:48,602 epoch 3 - iter 720/723 - loss 0.05955426 - time (sec): 52.59 - samples/sec: 3345.25 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:20:48,758 ----------------------------------------------------------------------------------------------------
2023-10-17 17:20:48,758 EPOCH 3 done: loss 0.0595 - lr: 0.000023
2023-10-17 17:20:52,178 DEV : loss 0.058894336223602295 - f1-score (micro avg) 0.8744
2023-10-17 17:20:52,208 saving best model
2023-10-17 17:20:52,731 ----------------------------------------------------------------------------------------------------
2023-10-17 17:20:57,764 epoch 4 - iter 72/723 - loss 0.04355367 - time (sec): 5.03 - samples/sec: 3320.09 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:21:03,407 epoch 4 - iter 144/723 - loss 0.04290886 - time (sec): 10.67 - samples/sec: 3246.22 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:21:08,387 epoch 4 - iter 216/723 - loss 0.04393652 - time (sec): 15.65 - samples/sec: 3306.63 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:21:13,715 epoch 4 - iter 288/723 - loss 0.04557269 - time (sec): 20.98 - samples/sec: 3300.10 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:21:18,664 epoch 4 - iter 360/723 - loss 0.04428411 - time (sec): 25.93 - samples/sec: 3328.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:21:23,819 epoch 4 - iter 432/723 - loss 0.04366172 - time (sec): 31.08 - samples/sec: 3342.24 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:21:29,071 epoch 4 - iter 504/723 - loss 0.04336719 - time (sec): 36.34 - samples/sec: 3350.79 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:21:34,728 epoch 4 - iter 576/723 - loss 0.04464133 - time (sec): 41.99 - samples/sec: 3357.04 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:21:39,750 epoch 4 - iter 648/723 - loss 0.04399677 - time (sec): 47.01 - samples/sec: 3359.57 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:21:44,913 epoch 4 - iter 720/723 - loss 0.04409540 - time (sec): 52.18 - samples/sec: 3368.25 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:21:45,083 ----------------------------------------------------------------------------------------------------
2023-10-17 17:21:45,084 EPOCH 4 done: loss 0.0441 - lr: 0.000020
2023-10-17 17:21:48,784 DEV : loss 0.05959029123187065 - f1-score (micro avg) 0.8725
2023-10-17 17:21:48,804 ----------------------------------------------------------------------------------------------------
2023-10-17 17:21:54,022 epoch 5 - iter 72/723 - loss 0.02878262 - time (sec): 5.22 - samples/sec: 3441.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:21:58,914 epoch 5 - iter 144/723 - loss 0.02578122 - time (sec): 10.11 - samples/sec: 3464.18 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:22:04,106 epoch 5 - iter 216/723 - loss 0.03035057 - time (sec): 15.30 - samples/sec: 3464.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:22:09,621 epoch 5 - iter 288/723 - loss 0.02847141 - time (sec): 20.82 - samples/sec: 3417.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:22:14,738 epoch 5 - iter 360/723 - loss 0.02912372 - time (sec): 25.93 - samples/sec: 3391.91 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:22:20,332 epoch 5 - iter 432/723 - loss 0.03092672 - time (sec): 31.53 - samples/sec: 3363.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:22:25,826 epoch 5 - iter 504/723 - loss 0.03207944 - time (sec): 37.02 - samples/sec: 3349.99 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:22:30,593 epoch 5 - iter 576/723 - loss 0.03267365 - time (sec): 41.79 - samples/sec: 3369.49 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:22:35,527 epoch 5 - iter 648/723 - loss 0.03353305 - time (sec): 46.72 - samples/sec: 3379.33 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:22:40,601 epoch 5 - iter 720/723 - loss 0.03284765 - time (sec): 51.80 - samples/sec: 3391.87 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:22:40,767 ----------------------------------------------------------------------------------------------------
2023-10-17 17:22:40,767 EPOCH 5 done: loss 0.0328 - lr: 0.000017
2023-10-17 17:22:44,271 DEV : loss 0.0892329216003418 - f1-score (micro avg) 0.8573
2023-10-17 17:22:44,302 ----------------------------------------------------------------------------------------------------
2023-10-17 17:22:49,647 epoch 6 - iter 72/723 - loss 0.02078891 - time (sec): 5.34 - samples/sec: 3508.17 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:22:55,042 epoch 6 - iter 144/723 - loss 0.01967175 - time (sec): 10.74 - samples/sec: 3329.42 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:23:00,622 epoch 6 - iter 216/723 - loss 0.02336652 - time (sec): 16.32 - samples/sec: 3328.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:23:06,281 epoch 6 - iter 288/723 - loss 0.02495221 - time (sec): 21.98 - samples/sec: 3302.47 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:23:11,871 epoch 6 - iter 360/723 - loss 0.02541744 - time (sec): 27.57 - samples/sec: 3251.44 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:23:17,063 epoch 6 - iter 432/723 - loss 0.02514231 - time (sec): 32.76 - samples/sec: 3287.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:23:22,077 epoch 6 - iter 504/723 - loss 0.02636371 - time (sec): 37.77 - samples/sec: 3294.29 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:23:27,176 epoch 6 - iter 576/723 - loss 0.02636581 - time (sec): 42.87 - samples/sec: 3312.52 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:23:32,197 epoch 6 - iter 648/723 - loss 0.02508128 - time (sec): 47.89 - samples/sec: 3312.20 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:23:37,126 epoch 6 - iter 720/723 - loss 0.02535939 - time (sec): 52.82 - samples/sec: 3327.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:23:37,304 ----------------------------------------------------------------------------------------------------
2023-10-17 17:23:37,305 EPOCH 6 done: loss 0.0254 - lr: 0.000013
2023-10-17 17:23:40,588 DEV : loss 0.10650834441184998 - f1-score (micro avg) 0.8453
2023-10-17 17:23:40,620 ----------------------------------------------------------------------------------------------------
2023-10-17 17:23:45,947 epoch 7 - iter 72/723 - loss 0.01512900 - time (sec): 5.33 - samples/sec: 3225.11 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:23:51,221 epoch 7 - iter 144/723 - loss 0.01994325 - time (sec): 10.60 - samples/sec: 3255.50 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:23:56,636 epoch 7 - iter 216/723 - loss 0.01985443 - time (sec): 16.01 - samples/sec: 3264.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:24:01,860 epoch 7 - iter 288/723 - loss 0.02211769 - time (sec): 21.24 - samples/sec: 3315.05 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:24:07,334 epoch 7 - iter 360/723 - loss 0.02057789 - time (sec): 26.71 - samples/sec: 3281.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:24:12,648 epoch 7 - iter 432/723 - loss 0.02150567 - time (sec): 32.03 - samples/sec: 3313.55 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:24:17,871 epoch 7 - iter 504/723 - loss 0.02078248 - time (sec): 37.25 - samples/sec: 3339.91 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:24:23,204 epoch 7 - iter 576/723 - loss 0.02009060 - time (sec): 42.58 - samples/sec: 3324.99 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:24:28,361 epoch 7 - iter 648/723 - loss 0.01899026 - time (sec): 47.74 - samples/sec: 3315.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:24:33,718 epoch 7 - iter 720/723 - loss 0.01860087 - time (sec): 53.10 - samples/sec: 3302.91 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:24:34,113 ----------------------------------------------------------------------------------------------------
2023-10-17 17:24:34,113 EPOCH 7 done: loss 0.0186 - lr: 0.000010
2023-10-17 17:24:38,254 DEV : loss 0.12094509601593018 - f1-score (micro avg) 0.8635
2023-10-17 17:24:38,277 ----------------------------------------------------------------------------------------------------
2023-10-17 17:24:43,290 epoch 8 - iter 72/723 - loss 0.01674001 - time (sec): 5.01 - samples/sec: 3332.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:24:48,609 epoch 8 - iter 144/723 - loss 0.01250439 - time (sec): 10.33 - samples/sec: 3286.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:24:53,757 epoch 8 - iter 216/723 - loss 0.01587244 - time (sec): 15.48 - samples/sec: 3308.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:24:59,031 epoch 8 - iter 288/723 - loss 0.01474739 - time (sec): 20.75 - samples/sec: 3322.17 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:25:04,515 epoch 8 - iter 360/723 - loss 0.01501380 - time (sec): 26.24 - samples/sec: 3299.67 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:25:09,679 epoch 8 - iter 432/723 - loss 0.01426960 - time (sec): 31.40 - samples/sec: 3309.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:25:15,004 epoch 8 - iter 504/723 - loss 0.01444987 - time (sec): 36.73 - samples/sec: 3318.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:25:20,149 epoch 8 - iter 576/723 - loss 0.01498026 - time (sec): 41.87 - samples/sec: 3332.63 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:25:25,819 epoch 8 - iter 648/723 - loss 0.01446914 - time (sec): 47.54 - samples/sec: 3344.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:25:30,904 epoch 8 - iter 720/723 - loss 0.01414308 - time (sec): 52.63 - samples/sec: 3337.05 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:25:31,090 ----------------------------------------------------------------------------------------------------
2023-10-17 17:25:31,090 EPOCH 8 done: loss 0.0142 - lr: 0.000007
2023-10-17 17:25:34,478 DEV : loss 0.11182001233100891 - f1-score (micro avg) 0.8732
2023-10-17 17:25:34,498 ----------------------------------------------------------------------------------------------------
2023-10-17 17:25:39,684 epoch 9 - iter 72/723 - loss 0.00674938 - time (sec): 5.18 - samples/sec: 3305.14 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:25:44,991 epoch 9 - iter 144/723 - loss 0.00755681 - time (sec): 10.49 - samples/sec: 3337.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:25:50,324 epoch 9 - iter 216/723 - loss 0.00853434 - time (sec): 15.82 - samples/sec: 3317.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:25:55,818 epoch 9 - iter 288/723 - loss 0.00885931 - time (sec): 21.32 - samples/sec: 3326.28 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:26:01,443 epoch 9 - iter 360/723 - loss 0.00947749 - time (sec): 26.94 - samples/sec: 3312.93 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:26:07,110 epoch 9 - iter 432/723 - loss 0.01029660 - time (sec): 32.61 - samples/sec: 3276.34 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:26:11,975 epoch 9 - iter 504/723 - loss 0.00971519 - time (sec): 37.48 - samples/sec: 3293.67 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:26:16,712 epoch 9 - iter 576/723 - loss 0.00913791 - time (sec): 42.21 - samples/sec: 3306.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:26:21,888 epoch 9 - iter 648/723 - loss 0.00920760 - time (sec): 47.39 - samples/sec: 3334.43 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:26:27,223 epoch 9 - iter 720/723 - loss 0.00937031 - time (sec): 52.72 - samples/sec: 3331.95 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:26:27,395 ----------------------------------------------------------------------------------------------------
2023-10-17 17:26:27,395 EPOCH 9 done: loss 0.0093 - lr: 0.000003
2023-10-17 17:26:30,608 DEV : loss 0.1297394335269928 - f1-score (micro avg) 0.8725
2023-10-17 17:26:30,624 ----------------------------------------------------------------------------------------------------
2023-10-17 17:26:35,888 epoch 10 - iter 72/723 - loss 0.01223344 - time (sec): 5.26 - samples/sec: 3424.14 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:26:40,837 epoch 10 - iter 144/723 - loss 0.00842767 - time (sec): 10.21 - samples/sec: 3401.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:26:46,403 epoch 10 - iter 216/723 - loss 0.00985753 - time (sec): 15.78 - samples/sec: 3366.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:26:51,833 epoch 10 - iter 288/723 - loss 0.00861439 - time (sec): 21.21 - samples/sec: 3356.16 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:26:56,978 epoch 10 - iter 360/723 - loss 0.00751018 - time (sec): 26.35 - samples/sec: 3368.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:27:02,349 epoch 10 - iter 432/723 - loss 0.00721843 - time (sec): 31.72 - samples/sec: 3352.43 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:27:07,627 epoch 10 - iter 504/723 - loss 0.00709729 - time (sec): 37.00 - samples/sec: 3336.75 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:27:13,126 epoch 10 - iter 576/723 - loss 0.00682846 - time (sec): 42.50 - samples/sec: 3317.73 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:27:18,630 epoch 10 - iter 648/723 - loss 0.00690953 - time (sec): 48.00 - samples/sec: 3309.77 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:27:23,834 epoch 10 - iter 720/723 - loss 0.00716323 - time (sec): 53.21 - samples/sec: 3304.82 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:27:23,986 ----------------------------------------------------------------------------------------------------
2023-10-17 17:27:23,986 EPOCH 10 done: loss 0.0071 - lr: 0.000000
2023-10-17 17:27:27,783 DEV : loss 0.12998518347740173 - f1-score (micro avg) 0.8688
2023-10-17 17:27:28,161 ----------------------------------------------------------------------------------------------------
2023-10-17 17:27:28,162 Loading model from best epoch ...
2023-10-17 17:27:29,632 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 17:27:32,734
Results:
- F-score (micro) 0.8457
- F-score (macro) 0.7463
- Accuracy 0.7446
By class:
              precision    recall  f1-score   support

         PER     0.7904    0.8921    0.8382       482
         LOC     0.9222    0.8799    0.9006       458
         ORG     0.5882    0.4348    0.5000        69

   micro avg     0.8362    0.8553    0.8457      1009
   macro avg     0.7670    0.7356    0.7463      1009
weighted avg     0.8364    0.8553    0.8434      1009
2023-10-17 17:27:32,735 ----------------------------------------------------------------------------------------------------
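[Note, not part of the original log] A short example of loading the exported checkpoint and tagging a sentence; the path is a placeholder, so point it at the best-model.pt saved under the base path logged above (or at the Hub repository this log belongs to):

from flair.data import Sentence
from flair.models import SequenceTagger

# placeholder path: adjust to where best-model.pt actually lives
tagger = SequenceTagger.load("hmbench-icdar/nl-hmteams-example/best-model.pt")

# the tagger predicts the 13 BIOES tags listed above (PER, LOC, ORG plus O)
sentence = Sentence("Willem Barentsz vertrok uit Amsterdam .")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity)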