2023-10-17 12:12:31,861 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,862 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 12:12:31,862 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,862 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Train: 7142 sentences
2023-10-17 12:12:31,863 (train_with_dev=False, train_with_test=False)
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Training Params:
2023-10-17 12:12:31,863 - learning_rate: "3e-05"
2023-10-17 12:12:31,863 - mini_batch_size: "8"
2023-10-17 12:12:31,863 - max_epochs: "10"
2023-10-17 12:12:31,863 - shuffle: "True"
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Plugins:
2023-10-17 12:12:31,863 - TensorboardLogger
2023-10-17 12:12:31,863 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:12:31,863 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Computation:
2023-10-17 12:12:31,863 - compute on device: cuda:0
2023-10-17 12:12:31,863 - embedding storage: none
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:12:39,868 epoch 1 - iter 89/893 - loss 2.85730488 - time (sec): 8.00 - samples/sec: 3061.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:12:46,421 epoch 1 - iter 178/893 - loss 1.85506791 - time (sec): 14.56 - samples/sec: 3412.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:12:53,180 epoch 1 - iter 267/893 - loss 1.38732665 - time (sec): 21.32 - samples/sec: 3522.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:12:59,540 epoch 1 - iter 356/893 - loss 1.14478631 - time (sec): 27.68 - samples/sec: 3551.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:13:06,415 epoch 1 - iter 445/893 - loss 0.97261226 - time (sec): 34.55 - samples/sec: 3563.18 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:13,098 epoch 1 - iter 534/893 - loss 0.84652087 - time (sec): 41.23 - samples/sec: 3596.87 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:13:19,594 epoch 1 - iter 623/893 - loss 0.75591951 - time (sec): 47.73 - samples/sec: 3616.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:13:26,521 epoch 1 - iter 712/893 - loss 0.67732184 - time (sec): 54.66 - samples/sec: 3626.52 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:13:33,467 epoch 1 - iter 801/893 - loss 0.62183579 - time (sec): 61.60 - samples/sec: 3611.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:13:40,480 epoch 1 - iter 890/893 - loss 0.57352952 - time (sec): 68.62 - samples/sec: 3614.04 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:13:40,670 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:40,670 EPOCH 1 done: loss 0.5723 - lr: 0.000030
2023-10-17 12:13:43,232 DEV : loss 0.10704871267080307 - f1-score (micro avg) 0.7293
2023-10-17 12:13:43,247 saving best model
2023-10-17 12:13:43,637 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:49,896 epoch 2 - iter 89/893 - loss 0.14146838 - time (sec): 6.26 - samples/sec: 3762.41 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:13:56,876 epoch 2 - iter 178/893 - loss 0.12893578 - time (sec): 13.24 - samples/sec: 3673.48 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:14:04,692 epoch 2 - iter 267/893 - loss 0.12429957 - time (sec): 21.05 - samples/sec: 3507.13 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:14:11,624 epoch 2 - iter 356/893 - loss 0.11841667 - time (sec): 27.99 - samples/sec: 3523.08 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:14:18,414 epoch 2 - iter 445/893 - loss 0.11401166 - time (sec): 34.78 - samples/sec: 3542.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:14:25,667 epoch 2 - iter 534/893 - loss 0.11386573 - time (sec): 42.03 - samples/sec: 3530.67 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:14:33,129 epoch 2 - iter 623/893 - loss 0.11214322 - time (sec): 49.49 - samples/sec: 3481.58 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:14:40,204 epoch 2 - iter 712/893 - loss 0.11040463 - time (sec): 56.57 - samples/sec: 3489.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:14:47,394 epoch 2 - iter 801/893 - loss 0.10936323 - time (sec): 63.76 - samples/sec: 3527.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:14:54,130 epoch 2 - iter 890/893 - loss 0.10834170 - time (sec): 70.49 - samples/sec: 3517.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:14:54,385 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:54,385 EPOCH 2 done: loss 0.1084 - lr: 0.000027
2023-10-17 12:14:58,657 DEV : loss 0.11247449368238449 - f1-score (micro avg) 0.7933
2023-10-17 12:14:58,676 saving best model
2023-10-17 12:14:59,276 ----------------------------------------------------------------------------------------------------
2023-10-17 12:15:06,504 epoch 3 - iter 89/893 - loss 0.06805108 - time (sec): 7.23 - samples/sec: 3598.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:15:13,326 epoch 3 - iter 178/893 - loss 0.06757782 - time (sec): 14.05 - samples/sec: 3596.86 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:15:20,879 epoch 3 - iter 267/893 - loss 0.06379743 - time (sec): 21.60 - samples/sec: 3528.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:15:27,864 epoch 3 - iter 356/893 - loss 0.06406541 - time (sec): 28.59 - samples/sec: 3600.32 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:15:35,036 epoch 3 - iter 445/893 - loss 0.06439386 - time (sec): 35.76 - samples/sec: 3585.04 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:15:41,504 epoch 3 - iter 534/893 - loss 0.06566818 - time (sec): 42.23 - samples/sec: 3602.45 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:15:48,277 epoch 3 - iter 623/893 - loss 0.06775661 - time (sec): 49.00 - samples/sec: 3606.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:15:55,000 epoch 3 - iter 712/893 - loss 0.06644627 - time (sec): 55.72 - samples/sec: 3600.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:16:02,106 epoch 3 - iter 801/893 - loss 0.06548290 - time (sec): 62.83 - samples/sec: 3580.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:16:08,494 epoch 3 - iter 890/893 - loss 0.06710688 - time (sec): 69.22 - samples/sec: 3582.96 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:16:08,721 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:08,721 EPOCH 3 done: loss 0.0670 - lr: 0.000023
2023-10-17 12:16:13,362 DEV : loss 0.11337490379810333 - f1-score (micro avg) 0.7952
2023-10-17 12:16:13,379 saving best model
2023-10-17 12:16:13,968 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:20,824 epoch 4 - iter 89/893 - loss 0.04583453 - time (sec): 6.85 - samples/sec: 3653.77 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:16:27,871 epoch 4 - iter 178/893 - loss 0.04916381 - time (sec): 13.90 - samples/sec: 3586.20 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:16:34,875 epoch 4 - iter 267/893 - loss 0.05184043 - time (sec): 20.90 - samples/sec: 3610.33 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:16:41,602 epoch 4 - iter 356/893 - loss 0.05199944 - time (sec): 27.63 - samples/sec: 3611.42 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:16:48,916 epoch 4 - iter 445/893 - loss 0.05051620 - time (sec): 34.95 - samples/sec: 3572.32 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:16:55,985 epoch 4 - iter 534/893 - loss 0.05017704 - time (sec): 42.02 - samples/sec: 3584.24 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:17:03,148 epoch 4 - iter 623/893 - loss 0.04874093 - time (sec): 49.18 - samples/sec: 3571.97 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:17:09,673 epoch 4 - iter 712/893 - loss 0.04824195 - time (sec): 55.70 - samples/sec: 3566.08 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:17:16,445 epoch 4 - iter 801/893 - loss 0.04854874 - time (sec): 62.48 - samples/sec: 3565.23 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:17:23,534 epoch 4 - iter 890/893 - loss 0.04877570 - time (sec): 69.56 - samples/sec: 3566.84 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:17:23,743 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:23,744 EPOCH 4 done: loss 0.0487 - lr: 0.000020
2023-10-17 12:17:28,138 DEV : loss 0.11932364106178284 - f1-score (micro avg) 0.8069
2023-10-17 12:17:28,161 saving best model
2023-10-17 12:17:28,790 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:35,379 epoch 5 - iter 89/893 - loss 0.03497413 - time (sec): 6.59 - samples/sec: 3716.09 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:17:41,734 epoch 5 - iter 178/893 - loss 0.03718769 - time (sec): 12.94 - samples/sec: 3685.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:17:48,871 epoch 5 - iter 267/893 - loss 0.04142280 - time (sec): 20.08 - samples/sec: 3620.04 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:17:55,949 epoch 5 - iter 356/893 - loss 0.04142570 - time (sec): 27.16 - samples/sec: 3600.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:18:03,278 epoch 5 - iter 445/893 - loss 0.04248257 - time (sec): 34.49 - samples/sec: 3609.75 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:18:10,953 epoch 5 - iter 534/893 - loss 0.03963812 - time (sec): 42.16 - samples/sec: 3538.91 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:18:18,355 epoch 5 - iter 623/893 - loss 0.03809248 - time (sec): 49.56 - samples/sec: 3529.55 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:18:25,313 epoch 5 - iter 712/893 - loss 0.03706784 - time (sec): 56.52 - samples/sec: 3533.29 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:18:32,665 epoch 5 - iter 801/893 - loss 0.03669771 - time (sec): 63.87 - samples/sec: 3520.70 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:18:39,249 epoch 5 - iter 890/893 - loss 0.03631606 - time (sec): 70.46 - samples/sec: 3521.70 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:18:39,413 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:39,414 EPOCH 5 done: loss 0.0364 - lr: 0.000017
2023-10-17 12:18:43,607 DEV : loss 0.16096670925617218 - f1-score (micro avg) 0.8136
2023-10-17 12:18:43,623 saving best model
2023-10-17 12:18:44,144 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:51,235 epoch 6 - iter 89/893 - loss 0.02863753 - time (sec): 7.09 - samples/sec: 3547.53 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:18:58,307 epoch 6 - iter 178/893 - loss 0.02897778 - time (sec): 14.16 - samples/sec: 3604.65 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:19:04,998 epoch 6 - iter 267/893 - loss 0.02773578 - time (sec): 20.85 - samples/sec: 3623.07 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:19:12,035 epoch 6 - iter 356/893 - loss 0.02766024 - time (sec): 27.89 - samples/sec: 3589.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:19:19,390 epoch 6 - iter 445/893 - loss 0.02785857 - time (sec): 35.24 - samples/sec: 3561.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:19:26,563 epoch 6 - iter 534/893 - loss 0.02883310 - time (sec): 42.42 - samples/sec: 3582.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:19:33,390 epoch 6 - iter 623/893 - loss 0.02939347 - time (sec): 49.24 - samples/sec: 3581.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:19:39,932 epoch 6 - iter 712/893 - loss 0.02855559 - time (sec): 55.79 - samples/sec: 3587.29 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:19:46,666 epoch 6 - iter 801/893 - loss 0.02814727 - time (sec): 62.52 - samples/sec: 3576.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:19:53,610 epoch 6 - iter 890/893 - loss 0.02837280 - time (sec): 69.46 - samples/sec: 3570.30 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:19:53,808 ----------------------------------------------------------------------------------------------------
2023-10-17 12:19:53,808 EPOCH 6 done: loss 0.0284 - lr: 0.000013
2023-10-17 12:19:58,395 DEV : loss 0.17645598948001862 - f1-score (micro avg) 0.825
2023-10-17 12:19:58,410 saving best model
2023-10-17 12:19:58,989 ----------------------------------------------------------------------------------------------------
2023-10-17 12:20:05,580 epoch 7 - iter 89/893 - loss 0.01795184 - time (sec): 6.59 - samples/sec: 3538.78 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:20:12,912 epoch 7 - iter 178/893 - loss 0.01923215 - time (sec): 13.92 - samples/sec: 3578.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:20:19,698 epoch 7 - iter 267/893 - loss 0.02233022 - time (sec): 20.71 - samples/sec: 3564.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:20:27,160 epoch 7 - iter 356/893 - loss 0.02153127 - time (sec): 28.17 - samples/sec: 3573.60 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:20:33,937 epoch 7 - iter 445/893 - loss 0.02333590 - time (sec): 34.95 - samples/sec: 3609.86 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:20:40,375 epoch 7 - iter 534/893 - loss 0.02295462 - time (sec): 41.38 - samples/sec: 3589.92 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:20:47,236 epoch 7 - iter 623/893 - loss 0.02344317 - time (sec): 48.25 - samples/sec: 3565.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:20:54,043 epoch 7 - iter 712/893 - loss 0.02321907 - time (sec): 55.05 - samples/sec: 3563.67 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:21:01,460 epoch 7 - iter 801/893 - loss 0.02240865 - time (sec): 62.47 - samples/sec: 3556.31 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:21:08,547 epoch 7 - iter 890/893 - loss 0.02194327 - time (sec): 69.56 - samples/sec: 3567.55 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:21:08,733 ----------------------------------------------------------------------------------------------------
2023-10-17 12:21:08,733 EPOCH 7 done: loss 0.0219 - lr: 0.000010
2023-10-17 12:21:12,882 DEV : loss 0.1844026744365692 - f1-score (micro avg) 0.8139
2023-10-17 12:21:12,898 ----------------------------------------------------------------------------------------------------
2023-10-17 12:21:20,744 epoch 8 - iter 89/893 - loss 0.01255689 - time (sec): 7.85 - samples/sec: 3339.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:21:28,038 epoch 8 - iter 178/893 - loss 0.01548700 - time (sec): 15.14 - samples/sec: 3389.82 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:34,919 epoch 8 - iter 267/893 - loss 0.01594896 - time (sec): 22.02 - samples/sec: 3458.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:41,668 epoch 8 - iter 356/893 - loss 0.01712178 - time (sec): 28.77 - samples/sec: 3468.64 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:48,918 epoch 8 - iter 445/893 - loss 0.01789972 - time (sec): 36.02 - samples/sec: 3464.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:21:56,104 epoch 8 - iter 534/893 - loss 0.01905710 - time (sec): 43.20 - samples/sec: 3475.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:22:03,342 epoch 8 - iter 623/893 - loss 0.01767420 - time (sec): 50.44 - samples/sec: 3483.39 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:22:10,713 epoch 8 - iter 712/893 - loss 0.01702791 - time (sec): 57.81 - samples/sec: 3501.71 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:17,326 epoch 8 - iter 801/893 - loss 0.01763268 - time (sec): 64.43 - samples/sec: 3507.31 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:23,553 epoch 8 - iter 890/893 - loss 0.01731707 - time (sec): 70.65 - samples/sec: 3511.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:23,766 ----------------------------------------------------------------------------------------------------
2023-10-17 12:22:23,766 EPOCH 8 done: loss 0.0173 - lr: 0.000007
2023-10-17 12:22:27,877 DEV : loss 0.19628190994262695 - f1-score (micro avg) 0.8153
2023-10-17 12:22:27,893 ----------------------------------------------------------------------------------------------------
2023-10-17 12:22:35,088 epoch 9 - iter 89/893 - loss 0.01038174 - time (sec): 7.19 - samples/sec: 3555.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:42,202 epoch 9 - iter 178/893 - loss 0.01419247 - time (sec): 14.31 - samples/sec: 3556.98 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:49,188 epoch 9 - iter 267/893 - loss 0.01418667 - time (sec): 21.29 - samples/sec: 3594.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:56,073 epoch 9 - iter 356/893 - loss 0.01361146 - time (sec): 28.18 - samples/sec: 3566.67 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:23:02,914 epoch 9 - iter 445/893 - loss 0.01423414 - time (sec): 35.02 - samples/sec: 3581.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:23:09,495 epoch 9 - iter 534/893 - loss 0.01463469 - time (sec): 41.60 - samples/sec: 3606.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:23:16,280 epoch 9 - iter 623/893 - loss 0.01461206 - time (sec): 48.39 - samples/sec: 3605.21 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:23:22,944 epoch 9 - iter 712/893 - loss 0.01414789 - time (sec): 55.05 - samples/sec: 3593.53 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:23:29,888 epoch 9 - iter 801/893 - loss 0.01360551 - time (sec): 61.99 - samples/sec: 3598.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:23:37,203 epoch 9 - iter 890/893 - loss 0.01349250 - time (sec): 69.31 - samples/sec: 3579.73 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:23:37,405 ----------------------------------------------------------------------------------------------------
2023-10-17 12:23:37,406 EPOCH 9 done: loss 0.0135 - lr: 0.000003
2023-10-17 12:23:42,085 DEV : loss 0.2013043463230133 - f1-score (micro avg) 0.8124
2023-10-17 12:23:42,102 ----------------------------------------------------------------------------------------------------
2023-10-17 12:23:49,278 epoch 10 - iter 89/893 - loss 0.01244930 - time (sec): 7.17 - samples/sec: 3525.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:23:56,454 epoch 10 - iter 178/893 - loss 0.01214030 - time (sec): 14.35 - samples/sec: 3488.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:24:03,245 epoch 10 - iter 267/893 - loss 0.01122784 - time (sec): 21.14 - samples/sec: 3515.99 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:24:10,345 epoch 10 - iter 356/893 - loss 0.01032258 - time (sec): 28.24 - samples/sec: 3545.28 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:24:17,003 epoch 10 - iter 445/893 - loss 0.01111884 - time (sec): 34.90 - samples/sec: 3575.90 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:24:24,085 epoch 10 - iter 534/893 - loss 0.01106441 - time (sec): 41.98 - samples/sec: 3536.02 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:24:30,739 epoch 10 - iter 623/893 - loss 0.01038407 - time (sec): 48.64 - samples/sec: 3544.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:24:37,750 epoch 10 - iter 712/893 - loss 0.01024039 - time (sec): 55.65 - samples/sec: 3529.60 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:24:44,628 epoch 10 - iter 801/893 - loss 0.00986859 - time (sec): 62.52 - samples/sec: 3541.25 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:24:51,860 epoch 10 - iter 890/893 - loss 0.00988030 - time (sec): 69.76 - samples/sec: 3556.93 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:24:52,068 ----------------------------------------------------------------------------------------------------
2023-10-17 12:24:52,068 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-17 12:24:56,699 DEV : loss 0.1994680017232895 - f1-score (micro avg) 0.821
2023-10-17 12:24:57,112 ----------------------------------------------------------------------------------------------------
2023-10-17 12:24:57,113 Loading model from best epoch ...
2023-10-17 12:24:59,004 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 12:25:08,749
Results:
- F-score (micro) 0.7033
- F-score (macro) 0.638
- Accuracy 0.5639
By class:
              precision    recall  f1-score   support

         LOC     0.7209    0.7242    0.7226      1095
         PER     0.7767    0.7836    0.7801      1012
         ORG     0.4123    0.6190    0.4950       357
   HumanProd     0.4600    0.6970    0.5542        33

   micro avg     0.6760    0.7329    0.7033      2497
   macro avg     0.5925    0.7060    0.6380      2497
weighted avg     0.6959    0.7329    0.7111      2497
2023-10-17 12:25:08,749 ----------------------------------------------------------------------------------------------------
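The summary scores in the final report can be cross-checked from the per-class rows. The sketch below is not part of the training run. It assumes the standard definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of class F1 scores, weighted F1 as the support-weighted mean) and uses the rounded values printed in the log, so agreement is only expected to about three decimal places. The names `per_class`, `f1`, and the result variables are illustrative.

```python
# Per-class (precision, recall, support), copied from the "By class" table in the log.
per_class = {
    "LOC": (0.7209, 0.7242, 1095),
    "PER": (0.7767, 0.7836, 1012),
    "ORG": (0.4123, 0.6190, 357),
    "HumanProd": (0.4600, 0.6970, 33),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# F1 per class, recomputed from the rounded precision/recall values.
class_f1 = {name: f1(p, r) for name, (p, r, _) in per_class.items()}

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted average: each class F1 is weighted by its support.
total_support = sum(s for _, _, s in per_class.values())
weighted_f1 = sum(class_f1[n] * s for n, (_, _, s) in per_class.items()) / total_support

# Micro F1, recomputed from the micro-averaged precision and recall in the log.
micro_f1 = f1(0.6760, 0.7329)

for name, score in class_f1.items():
    print(f"{name:>12} f1 = {score:.4f}")
print(f"macro f1 = {macro_f1:.4f}, weighted f1 = {weighted_f1:.4f}, micro f1 = {micro_f1:.4f}")
```

The gap between micro F1 (0.7033) and macro F1 (0.6380) comes mainly from the weaker ORG and HumanProd classes, which count as much as LOC and PER in the macro average despite their much smaller support.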