2023-10-17 12:12:31,861 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,862 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:12:31,862 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,862 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Train: 7142 sentences
2023-10-17 12:12:31,863 (train_with_dev=False, train_with_test=False)
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Training Params:
2023-10-17 12:12:31,863 - learning_rate: "3e-05"
2023-10-17 12:12:31,863 - mini_batch_size: "8"
2023-10-17 12:12:31,863 - max_epochs: "10"
2023-10-17 12:12:31,863 - shuffle: "True"
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Plugins:
2023-10-17 12:12:31,863 - TensorboardLogger
2023-10-17 12:12:31,863 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:12:31,863 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Computation:
2023-10-17 12:12:31,863 - compute on device: cuda:0
2023-10-17 12:12:31,863 - embedding storage: none
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 ----------------------------------------------------------------------------------------------------
2023-10-17 12:12:31,863 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:12:39,868 epoch 1 - iter 89/893 - loss 2.85730488 - time (sec): 8.00 - samples/sec: 3061.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:12:46,421 epoch 1 - iter 178/893 - loss 1.85506791 - time (sec): 14.56 - samples/sec: 3412.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:12:53,180 epoch 1 - iter 267/893 - loss 1.38732665 - time (sec): 21.32 - samples/sec: 3522.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:12:59,540 epoch 1 - iter 356/893 - loss 1.14478631 - time (sec): 27.68 - samples/sec: 3551.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:13:06,415 epoch 1 - iter 445/893 - loss 0.97261226 - time (sec): 34.55 - samples/sec: 3563.18 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:13:13,098 epoch 1 - iter 534/893 - loss 0.84652087 - time (sec): 41.23 - samples/sec: 3596.87 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:13:19,594 epoch 1 - iter 623/893 - loss 0.75591951 - time (sec): 47.73 - samples/sec: 3616.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:13:26,521 epoch 1 - iter 712/893 - loss 0.67732184 - time (sec): 54.66 - samples/sec: 3626.52 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:13:33,467 epoch 1 - iter 801/893 - loss 0.62183579 - time (sec): 61.60 - samples/sec: 3611.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:13:40,480 epoch 1 - iter 890/893 - loss 0.57352952 - time (sec): 68.62 - samples/sec: 3614.04 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:13:40,670 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:40,670 EPOCH 1 done: loss 0.5723 - lr: 0.000030
2023-10-17 12:13:43,232 DEV : loss 0.10704871267080307 - f1-score (micro avg) 0.7293
2023-10-17 12:13:43,247 saving best model
2023-10-17 12:13:43,637 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:49,896 epoch 2 - iter 89/893 - loss 0.14146838 - time (sec): 6.26 - samples/sec: 3762.41 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:13:56,876 epoch 2 - iter 178/893 - loss 0.12893578 - time (sec): 13.24 - samples/sec: 3673.48 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:14:04,692 epoch 2 - iter 267/893 - loss 0.12429957 - time (sec): 21.05 - samples/sec: 3507.13 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:14:11,624 epoch 2 - iter 356/893 - loss 0.11841667 - time (sec): 27.99 - samples/sec: 3523.08 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:14:18,414 epoch 2 - iter 445/893 - loss 0.11401166 - time (sec): 34.78 - samples/sec: 3542.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:14:25,667 epoch 2 - iter 534/893 - loss 0.11386573 - time (sec): 42.03 - samples/sec: 3530.67 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:14:33,129 epoch 2 - iter 623/893 - loss 0.11214322 - time (sec): 49.49 - samples/sec: 3481.58 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:14:40,204 epoch 2 - iter 712/893 - loss 0.11040463 - time (sec): 56.57 - samples/sec: 3489.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:14:47,394 epoch 2 - iter 801/893 - loss 0.10936323 - time (sec): 63.76 - samples/sec: 3527.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:14:54,130 epoch 2 - iter 890/893 - loss 0.10834170 - time (sec): 70.49 - samples/sec: 3517.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:14:54,385 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:54,385 EPOCH 2 done: loss 0.1084 - lr: 0.000027
2023-10-17 12:14:58,657 DEV : loss 0.11247449368238449 - f1-score (micro avg) 0.7933
2023-10-17 12:14:58,676 saving best model
2023-10-17 12:14:59,276 ----------------------------------------------------------------------------------------------------
2023-10-17 12:15:06,504 epoch 3 - iter 89/893 - loss 0.06805108 - time (sec): 7.23 - samples/sec: 3598.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:15:13,326 epoch 3 - iter 178/893 - loss 0.06757782 - time (sec): 14.05 - samples/sec: 3596.86 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:15:20,879 epoch 3 - iter 267/893 - loss 0.06379743 - time (sec): 21.60 - samples/sec: 3528.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:15:27,864 epoch 3 - iter 356/893 - loss 0.06406541 - time (sec): 28.59 - samples/sec: 3600.32 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:15:35,036 epoch 3 - iter 445/893 - loss 0.06439386 - time (sec): 35.76 - samples/sec: 3585.04 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:15:41,504 epoch 3 - iter 534/893 - loss 0.06566818 - time (sec): 42.23 - samples/sec: 3602.45 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:15:48,277 epoch 3 - iter 623/893 - loss 0.06775661 - time (sec): 49.00 - samples/sec: 3606.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:15:55,000 epoch 3 - iter 712/893 - loss 0.06644627 - time (sec): 55.72 - samples/sec: 3600.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:16:02,106 epoch 3 - iter 801/893 - loss 0.06548290 - time (sec): 62.83 - samples/sec: 3580.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:16:08,494 epoch 3 - iter 890/893 - loss 0.06710688 - time (sec): 69.22 - samples/sec: 3582.96 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:16:08,721 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:08,721 EPOCH 3 done: loss 0.0670 - lr: 0.000023
2023-10-17 12:16:13,362 DEV : loss 0.11337490379810333 - f1-score (micro avg) 0.7952
2023-10-17 12:16:13,379 saving best model
2023-10-17 12:16:13,968 ----------------------------------------------------------------------------------------------------
2023-10-17 12:16:20,824 epoch 4 - iter 89/893 - loss 0.04583453 - time (sec): 6.85 - samples/sec: 3653.77 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:16:27,871 epoch 4 - iter 178/893 - loss 0.04916381 - time (sec): 13.90 - samples/sec: 3586.20 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:16:34,875 epoch 4 - iter 267/893 - loss 0.05184043 - time (sec): 20.90 - samples/sec: 3610.33 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:16:41,602 epoch 4 - iter 356/893 - loss 0.05199944 - time (sec): 27.63 - samples/sec: 3611.42 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:16:48,916 epoch 4 - iter 445/893 - loss 0.05051620 - time (sec): 34.95 - samples/sec: 3572.32 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:16:55,985 epoch 4 - iter 534/893 - loss 0.05017704 - time (sec): 42.02 - samples/sec: 3584.24 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:17:03,148 epoch 4 - iter 623/893 - loss 0.04874093 - time (sec): 49.18 - samples/sec: 3571.97 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:17:09,673 epoch 4 - iter 712/893 - loss 0.04824195 - time (sec): 55.70 - samples/sec: 3566.08 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:17:16,445 epoch 4 - iter 801/893 - loss 0.04854874 - time (sec): 62.48 - samples/sec: 3565.23 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:17:23,534 epoch 4 - iter 890/893 - loss 0.04877570 - time (sec): 69.56 - samples/sec: 3566.84 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:17:23,743 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:23,744 EPOCH 4 done: loss 0.0487 - lr: 0.000020
2023-10-17 12:17:28,138 DEV : loss 0.11932364106178284 - f1-score (micro avg) 0.8069
2023-10-17 12:17:28,161 saving best model
2023-10-17 12:17:28,790 ----------------------------------------------------------------------------------------------------
2023-10-17 12:17:35,379 epoch 5 - iter 89/893 - loss 0.03497413 - time (sec): 6.59 - samples/sec: 3716.09 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:17:41,734 epoch 5 - iter 178/893 - loss 0.03718769 - time (sec): 12.94 - samples/sec: 3685.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:17:48,871 epoch 5 - iter 267/893 - loss 0.04142280 - time (sec): 20.08 - samples/sec: 3620.04 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:17:55,949 epoch 5 - iter 356/893 - loss 0.04142570 - time (sec): 27.16 - samples/sec: 3600.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:18:03,278 epoch 5 - iter 445/893 - loss 0.04248257 - time (sec): 34.49 - samples/sec: 3609.75 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:18:10,953 epoch 5 - iter 534/893 - loss 0.03963812 - time (sec): 42.16 - samples/sec: 3538.91 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:18:18,355 epoch 5 - iter 623/893 - loss 0.03809248 - time (sec): 49.56 - samples/sec: 3529.55 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:18:25,313 epoch 5 - iter 712/893 - loss 0.03706784 - time (sec): 56.52 - samples/sec: 3533.29 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:18:32,665 epoch 5 - iter 801/893 - loss 0.03669771 - time (sec): 63.87 - samples/sec: 3520.70 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:18:39,249 epoch 5 - iter 890/893 - loss 0.03631606 - time (sec): 70.46 - samples/sec: 3521.70 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:18:39,413 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:39,414 EPOCH 5 done: loss 0.0364 - lr: 0.000017
2023-10-17 12:18:43,607 DEV : loss 0.16096670925617218 - f1-score (micro avg) 0.8136
2023-10-17 12:18:43,623 saving best model
2023-10-17 12:18:44,144 ----------------------------------------------------------------------------------------------------
2023-10-17 12:18:51,235 epoch 6 - iter 89/893 - loss 0.02863753 - time (sec): 7.09 - samples/sec: 3547.53 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:18:58,307 epoch 6 - iter 178/893 - loss 0.02897778 - time (sec): 14.16 - samples/sec: 3604.65 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:19:04,998 epoch 6 - iter 267/893 - loss 0.02773578 - time (sec): 20.85 - samples/sec: 3623.07 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:19:12,035 epoch 6 - iter 356/893 - loss 0.02766024 - time (sec): 27.89 - samples/sec: 3589.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:19:19,390 epoch 6 - iter 445/893 - loss 0.02785857 - time (sec): 35.24 - samples/sec: 3561.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:19:26,563 epoch 6 - iter 534/893 - loss 0.02883310 - time (sec): 42.42 - samples/sec: 3582.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:19:33,390 epoch 6 - iter 623/893 - loss 0.02939347 - time (sec): 49.24 - samples/sec: 3581.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:19:39,932 epoch 6 - iter 712/893 - loss 0.02855559 - time (sec): 55.79 - samples/sec: 3587.29 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:19:46,666 epoch 6 - iter 801/893 - loss 0.02814727 - time (sec): 62.52 - samples/sec: 3576.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:19:53,610 epoch 6 - iter 890/893 - loss 0.02837280 - time (sec): 69.46 - samples/sec: 3570.30 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:19:53,808 ----------------------------------------------------------------------------------------------------
2023-10-17 12:19:53,808 EPOCH 6 done: loss 0.0284 - lr: 0.000013
2023-10-17 12:19:58,395 DEV : loss 0.17645598948001862 - f1-score (micro avg) 0.825
2023-10-17 12:19:58,410 saving best model
2023-10-17 12:19:58,989 ----------------------------------------------------------------------------------------------------
2023-10-17 12:20:05,580 epoch 7 - iter 89/893 - loss 0.01795184 - time (sec): 6.59 - samples/sec: 3538.78 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:20:12,912 epoch 7 - iter 178/893 - loss 0.01923215 - time (sec): 13.92 - samples/sec: 3578.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:20:19,698 epoch 7 - iter 267/893 - loss 0.02233022 - time (sec): 20.71 - samples/sec: 3564.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:20:27,160 epoch 7 - iter 356/893 - loss 0.02153127 - time (sec): 28.17 - samples/sec: 3573.60 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:20:33,937 epoch 7 - iter 445/893 - loss 0.02333590 - time (sec): 34.95 - samples/sec: 3609.86 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:20:40,375 epoch 7 - iter 534/893 - loss 0.02295462 - time (sec): 41.38 - samples/sec: 3589.92 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:20:47,236 epoch 7 - iter 623/893 - loss 0.02344317 - time (sec): 48.25 - samples/sec: 3565.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:20:54,043 epoch 7 - iter 712/893 - loss 0.02321907 - time (sec): 55.05 - samples/sec: 3563.67 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:21:01,460 epoch 7 - iter 801/893 - loss 0.02240865 - time (sec): 62.47 - samples/sec: 3556.31 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:21:08,547 epoch 7 - iter 890/893 - loss 0.02194327 - time (sec): 69.56 - samples/sec: 3567.55 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:21:08,733 ----------------------------------------------------------------------------------------------------
2023-10-17 12:21:08,733 EPOCH 7 done: loss 0.0219 - lr: 0.000010
2023-10-17 12:21:12,882 DEV : loss 0.1844026744365692 - f1-score (micro avg) 0.8139
2023-10-17 12:21:12,898 ----------------------------------------------------------------------------------------------------
2023-10-17 12:21:20,744 epoch 8 - iter 89/893 - loss 0.01255689 - time (sec): 7.85 - samples/sec: 3339.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:21:28,038 epoch 8 - iter 178/893 - loss 0.01548700 - time (sec): 15.14 - samples/sec: 3389.82 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:34,919 epoch 8 - iter 267/893 - loss 0.01594896 - time (sec): 22.02 - samples/sec: 3458.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:41,668 epoch 8 - iter 356/893 - loss 0.01712178 - time (sec): 28.77 - samples/sec: 3468.64 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:21:48,918 epoch 8 - iter 445/893 - loss 0.01789972 - time (sec): 36.02 - samples/sec: 3464.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:21:56,104 epoch 8 - iter 534/893 - loss 0.01905710 - time (sec): 43.20 - samples/sec: 3475.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:22:03,342 epoch 8 - iter 623/893 - loss 0.01767420 - time (sec): 50.44 - samples/sec: 3483.39 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:22:10,713 epoch 8 - iter 712/893 - loss 0.01702791 - time (sec): 57.81 - samples/sec: 3501.71 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:17,326 epoch 8 - iter 801/893 - loss 0.01763268 - time (sec): 64.43 - samples/sec: 3507.31 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:23,553 epoch 8 - iter 890/893 - loss 0.01731707 - time (sec): 70.65 - samples/sec: 3511.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:22:23,766 ----------------------------------------------------------------------------------------------------
2023-10-17 12:22:23,766 EPOCH 8 done: loss 0.0173 - lr: 0.000007
2023-10-17 12:22:27,877 DEV : loss 0.19628190994262695 - f1-score (micro avg) 0.8153
2023-10-17 12:22:27,893 ----------------------------------------------------------------------------------------------------
2023-10-17 12:22:35,088 epoch 9 - iter 89/893 - loss 0.01038174 - time (sec): 7.19 - samples/sec: 3555.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:42,202 epoch 9 - iter 178/893 - loss 0.01419247 - time (sec): 14.31 - samples/sec: 3556.98 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:49,188 epoch 9 - iter 267/893 - loss 0.01418667 - time (sec): 21.29 - samples/sec: 3594.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:22:56,073 epoch 9 - iter 356/893 - loss 0.01361146 - time (sec): 28.18 - samples/sec: 3566.67 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:23:02,914 epoch 9 - iter 445/893 - loss 0.01423414 - time (sec): 35.02 - samples/sec: 3581.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:23:09,495 epoch 9 - iter 534/893 - loss 0.01463469 - time (sec): 41.60 - samples/sec: 3606.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:23:16,280 epoch 9 - iter 623/893 - loss 0.01461206 - time (sec): 48.39 - samples/sec: 3605.21 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:23:22,944 epoch 9 - iter 712/893 - loss 0.01414789 - time (sec): 55.05 - samples/sec: 3593.53 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:23:29,888 epoch 9 - iter 801/893 - loss 0.01360551 - time (sec): 61.99 - samples/sec: 3598.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:23:37,203 epoch 9 - iter 890/893 - loss 0.01349250 - time (sec): 69.31 - samples/sec: 3579.73 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:23:37,405 ----------------------------------------------------------------------------------------------------
2023-10-17 12:23:37,406 EPOCH 9 done: loss 0.0135 - lr: 0.000003
2023-10-17 12:23:42,085 DEV : loss 0.2013043463230133 - f1-score (micro avg) 0.8124
2023-10-17 12:23:42,102 ----------------------------------------------------------------------------------------------------
2023-10-17 12:23:49,278 epoch 10 - iter 89/893 - loss 0.01244930 - time (sec): 7.17 - samples/sec: 3525.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:23:56,454 epoch 10 - iter 178/893 - loss 0.01214030 - time (sec): 14.35 - samples/sec: 3488.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:24:03,245 epoch 10 - iter 267/893 - loss 0.01122784 - time (sec): 21.14 - samples/sec: 3515.99 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:24:10,345 epoch 10 - iter 356/893 - loss 0.01032258 - time (sec): 28.24 - samples/sec: 3545.28 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:24:17,003 epoch 10 - iter 445/893 - loss 0.01111884 - time (sec): 34.90 - samples/sec: 3575.90 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:24:24,085 epoch 10 - iter 534/893 - loss 0.01106441 - time (sec): 41.98 - samples/sec: 3536.02 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:24:30,739 epoch 10 - iter 623/893 - loss 0.01038407 - time (sec): 48.64 - samples/sec: 3544.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:24:37,750 epoch 10 - iter 712/893 - loss 0.01024039 - time (sec): 55.65 - samples/sec: 3529.60 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:24:44,628 epoch 10 - iter 801/893 - loss 0.00986859 - time (sec): 62.52 - samples/sec: 3541.25 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:24:51,860 epoch 10 - iter 890/893 - loss 0.00988030 - time (sec): 69.76 - samples/sec: 3556.93 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:24:52,068 ----------------------------------------------------------------------------------------------------
2023-10-17 12:24:52,068 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-17 12:24:56,699 DEV : loss 0.1994680017232895 - f1-score (micro avg) 0.821
2023-10-17 12:24:57,112 ----------------------------------------------------------------------------------------------------
2023-10-17 12:24:57,113 Loading model from best epoch ...
2023-10-17 12:24:59,004 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 12:25:08,749 Results:
- F-score (micro) 0.7033
- F-score (macro) 0.638
- Accuracy 0.5639

By class:
              precision    recall  f1-score   support

         LOC     0.7209    0.7242    0.7226      1095
         PER     0.7767    0.7836    0.7801      1012
         ORG     0.4123    0.6190    0.4950       357
   HumanProd     0.4600    0.6970    0.5542        33

   micro avg     0.6760    0.7329    0.7033      2497
   macro avg     0.5925    0.7060    0.6380      2497
weighted avg     0.6959    0.7329    0.7111      2497

2023-10-17 12:25:08,749 ----------------------------------------------------------------------------------------------------
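The `lr` column above follows the LinearScheduler plugin (`warmup_fraction: '0.1'`): the rate ramps linearly from 0 to the peak 3e-05 over the first 10% of all optimizer steps (exactly epoch 1 here, 893 of 8930 steps), then decays linearly back to 0. A minimal sketch of that generic warmup-then-decay rule (not Flair's exact code; the function name and step accounting are illustrative):

```python
# Illustrative re-implementation of a linear warmup + linear decay schedule,
# with the constants taken from this log (peak lr 3e-05, 893 steps/epoch,
# 10 epochs, warmup_fraction 0.1).
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 893
TOTAL_STEPS = 10 * STEPS_PER_EPOCH       # 8930 optimizer steps overall
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # 893 steps = all of epoch 1

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    if step <= WARMUP_STEPS:
        # ramp: 0 -> PEAK_LR over the warmup phase
        return PEAK_LR * step / WARMUP_STEPS
    # decay: PEAK_LR -> 0 over the remaining steps
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

This reproduces the logged values: roughly 0.000003 at iter 89 of epoch 1, 0.000030 at the end of epoch 1, and 0.000000 at the end of epoch 10.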
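The "Dictionary with 17 tags" and the classifier head's `out_features=17` are consistent: the tagger uses a BIOES scheme over the four NewsEye entity types (PER, LOC, ORG, HumanProd), so each type contributes four tags (S/B/E/I) plus the single O tag. A quick check of that arithmetic:

```python
# 4 entity types x 4 BIOES prefixes + O = 17 output classes,
# matching Linear(in_features=768, out_features=17) in the model dump.
entity_types = ["PER", "LOC", "ORG", "HumanProd"]
bioes_prefixes = ["S", "B", "E", "I"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in bioes_prefixes]
# len(tags) == 17
```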
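The summary numbers in the final table can be cross-checked from its own rows: micro-avg F1 is the harmonic mean of the micro precision and recall, and the weighted avg is the support-weighted mean of the per-class F1 scores. Using the values printed above:

```python
# Sanity-check the reported aggregates from the per-class rows of the table.
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

micro_f1 = f1(0.6760, 0.7329)  # matches "F-score (micro) 0.7033"

per_class = {  # class: (f1-score, support)
    "LOC": (0.7226, 1095),
    "PER": (0.7801, 1012),
    "ORG": (0.4950, 357),
    "HumanProd": (0.5542, 33),
}
total_support = sum(n for _, n in per_class.values())  # 2497
weighted_f1 = sum(f * n for f, n in per_class.values()) / total_support
# weighted_f1 rounds to 0.7111, the reported "weighted avg" f1-score
```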