2023-10-17 10:50:20,328 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 Train: 966 sentences
2023-10-17 10:50:20,329         (train_with_dev=False, train_with_test=False)
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 Training Params:
2023-10-17 10:50:20,329  - learning_rate: "5e-05"
2023-10-17 10:50:20,329  - mini_batch_size: "8"
2023-10-17 10:50:20,329  - max_epochs: "10"
2023-10-17 10:50:20,329  - shuffle: "True"
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 Plugins:
2023-10-17 10:50:20,329  - TensorboardLogger
2023-10-17 10:50:20,329  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:50:20,329  - metric: "('micro avg', 'f1-score')"
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 Computation:
2023-10-17 10:50:20,329  - compute on device: cuda:0
2023-10-17 10:50:20,329  - embedding storage: none
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,329 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:20,330 Logging anything other than scalars to TensorBoard is
currently not supported.
2023-10-17 10:50:21,015 epoch 1 - iter 12/121 - loss 4.32717646 - time (sec): 0.68 - samples/sec: 3534.81 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:50:21,794 epoch 1 - iter 24/121 - loss 3.79490891 - time (sec): 1.46 - samples/sec: 3486.90 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:50:22,476 epoch 1 - iter 36/121 - loss 3.02438972 - time (sec): 2.15 - samples/sec: 3542.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:50:23,191 epoch 1 - iter 48/121 - loss 2.44713095 - time (sec): 2.86 - samples/sec: 3522.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:50:23,961 epoch 1 - iter 60/121 - loss 2.02221030 - time (sec): 3.63 - samples/sec: 3481.44 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:50:24,723 epoch 1 - iter 72/121 - loss 1.76776424 - time (sec): 4.39 - samples/sec: 3425.36 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:50:25,436 epoch 1 - iter 84/121 - loss 1.58304931 - time (sec): 5.11 - samples/sec: 3404.68 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:50:26,157 epoch 1 - iter 96/121 - loss 1.43030745 - time (sec): 5.83 - samples/sec: 3397.39 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:50:26,900 epoch 1 - iter 108/121 - loss 1.31326734 - time (sec): 6.57 - samples/sec: 3364.32 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:50:27,691 epoch 1 - iter 120/121 - loss 1.20599299 - time (sec): 7.36 - samples/sec: 3340.92 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:50:27,742 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:27,743 EPOCH 1 done: loss 1.1989 - lr: 0.000049
2023-10-17 10:50:28,619 DEV : loss 0.2106715589761734 - f1-score (micro avg) 0.6094
2023-10-17 10:50:28,624 saving best model
2023-10-17 10:50:29,074 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:29,893 epoch 2 - iter 12/121 - loss 0.23738903 - time (sec): 0.82 - samples/sec: 3194.92 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:50:30,612 epoch 2 - iter 24/121 - loss 0.24865588 - time (sec): 1.54 - samples/sec: 3223.62 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:50:31,317 epoch 2 - iter 36/121 - loss 0.23597534 - time (sec): 2.24 - samples/sec: 3328.38 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:50:32,044 epoch 2 - iter 48/121 - loss 0.22613230 - time (sec): 2.97 - samples/sec: 3296.33 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:50:32,802 epoch 2 - iter 60/121 - loss 0.22136440 - time (sec): 3.73 - samples/sec: 3223.84 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:50:33,545 epoch 2 - iter 72/121 - loss 0.21509557 - time (sec): 4.47 - samples/sec: 3255.40 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:50:34,256 epoch 2 - iter 84/121 - loss 0.20744860 - time (sec): 5.18 - samples/sec: 3245.71 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:50:35,035 epoch 2 - iter 96/121 - loss 0.19170653 - time (sec): 5.96 - samples/sec: 3304.39 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:50:35,743 epoch 2 - iter 108/121 - loss 0.18452437 - time (sec): 6.67 - samples/sec: 3309.48 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:50:36,560 epoch 2 - iter 120/121 - loss 0.18185624 - time (sec): 7.48 - samples/sec: 3283.39 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:50:36,613 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:36,614 EPOCH 2 done: loss 0.1816 - lr: 0.000045
2023-10-17 10:50:37,356 DEV : loss 0.12012884765863419 - f1-score (micro avg) 0.8149
2023-10-17 10:50:37,361 saving best model
2023-10-17 10:50:37,919 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:38,615 epoch 3 - iter 12/121 - loss 0.15633326 - time (sec): 0.69 - samples/sec: 3413.70 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:50:39,379 epoch 3 - iter 24/121 - loss 0.11693087 - time (sec): 1.46 - samples/sec: 3358.81 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:50:40,100 epoch 3 - iter 36/121 - loss 0.09791334 - time (sec): 2.18 - samples/sec: 3307.00 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:50:40,777 epoch 3 - iter 48/121 - loss 0.09570602 - time (sec): 2.85 - samples/sec: 3390.21 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:50:41,463 epoch 3 - iter 60/121 - loss 0.10341246 - time (sec): 3.54 - samples/sec: 3382.12 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:50:42,164 epoch 3 - iter 72/121 - loss 0.10258523 - time (sec): 4.24 - samples/sec: 3444.02 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:50:42,911 epoch 3 - iter 84/121 - loss 0.10332344 - time (sec): 4.99 - samples/sec: 3448.10 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:50:43,640 epoch 3 - iter 96/121 - loss 0.10408491 - time (sec): 5.72 - samples/sec: 3429.02 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:50:44,345 epoch 3 - iter 108/121 - loss 0.09963223 - time (sec): 6.42 - samples/sec: 3420.37 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:50:45,131 epoch 3 - iter 120/121 - loss 0.10104994 - time (sec): 7.21 - samples/sec: 3413.52 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:50:45,199 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:45,200 EPOCH 3 done: loss 0.1006 - lr: 0.000039
2023-10-17 10:50:45,951 DEV : loss 0.12703241407871246 - f1-score (micro avg) 0.8174
2023-10-17 10:50:45,956 saving best model
2023-10-17 10:50:46,475 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:47,241 epoch 4 - iter 12/121 - loss 0.07640209 - time (sec): 0.76 - samples/sec: 3140.72 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:50:48,011 epoch 4 - iter 24/121 - loss 0.06068045 - time (sec): 1.53 - samples/sec: 3274.30 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:50:48,722 epoch 4 - iter 36/121 - loss 0.06492967 - time (sec): 2.24 - samples/sec: 3320.97 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:50:49,455 epoch 4 - iter 48/121 - loss 0.07473805 - time (sec): 2.98 - samples/sec: 3346.46 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:50:50,188 epoch 4 - iter 60/121 - loss 0.07679395 - time (sec): 3.71 - samples/sec: 3309.00 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:50:50,964 epoch 4 - iter 72/121 - loss 0.07496194 - time (sec): 4.49 - samples/sec: 3326.15 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:50:51,721 epoch 4 - iter 84/121 - loss 0.07700276 - time (sec): 5.24 - samples/sec: 3289.90 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:50:52,558 epoch 4 - iter 96/121 - loss 0.07436746 - time (sec): 6.08 - samples/sec: 3265.35 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:50:53,232 epoch 4 - iter 108/121 - loss 0.07539758 - time (sec): 6.75 - samples/sec: 3275.69 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:50:53,982 epoch 4 - iter 120/121 - loss 0.07183802 - time (sec): 7.50 - samples/sec: 3261.73 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:50:54,065 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:54,065 EPOCH 4 done: loss 0.0712 - lr: 0.000034
2023-10-17 10:50:54,810 DEV : loss 0.15281036496162415 - f1-score (micro avg) 0.7975
2023-10-17 10:50:54,815 ----------------------------------------------------------------------------------------------------
2023-10-17 10:50:55,545 epoch 5 - iter 12/121 - loss 0.08270934 - time (sec): 0.73 - samples/sec: 3631.70 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:50:56,326 epoch 5 - iter 24/121 - loss 0.06659483 - time (sec): 1.51 - samples/sec: 3368.31 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:50:57,088 epoch 5 - iter 36/121 - loss 0.06233471 - time (sec): 2.27 - samples/sec: 3299.63 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:50:57,863 epoch 5 - iter 48/121 - loss 0.06132773 - time (sec): 3.05 - samples/sec: 3308.14 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:50:58,652 epoch 5 - iter 60/121 - loss 0.05814848 - time (sec): 3.84 - samples/sec: 3243.44 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:50:59,394 epoch 5 - iter 72/121 - loss 0.05297482 - time (sec): 4.58 - samples/sec: 3267.35 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:51:00,138 epoch 5 - iter 84/121 - loss 0.05054123 - time (sec): 5.32 - samples/sec: 3291.35 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:51:00,891 epoch 5 - iter 96/121 - loss 0.04886654 - time (sec): 6.08 - samples/sec: 3257.38 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:51:01,606 epoch 5 - iter 108/121 - loss 0.04837317 - time (sec): 6.79 - samples/sec: 3269.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:51:02,298 epoch 5 - iter 120/121 - loss 0.04816303 - time (sec): 7.48 - samples/sec: 3277.29 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:51:02,357 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:02,357 EPOCH 5 done: loss 0.0481 - lr: 0.000028
2023-10-17 10:51:03,111 DEV : loss 0.14528368413448334 - f1-score (micro avg) 0.8388
2023-10-17 10:51:03,116 saving best model
2023-10-17 10:51:03,632 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:04,397 epoch 6 - iter 12/121 - loss 0.05046710 - time (sec): 0.76 - samples/sec: 3238.54 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:51:05,155 epoch 6 - iter 24/121 - loss 0.03305271 - time (sec): 1.52 - samples/sec: 3151.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:51:05,855 epoch 6 - iter 36/121 - loss 0.03931931 - time (sec): 2.22 - samples/sec: 3237.12 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:51:06,625 epoch 6 - iter 48/121 - loss 0.03731659 - time (sec): 2.99 - samples/sec: 3269.12 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:51:07,391 epoch 6 - iter 60/121 - loss 0.03486466 - time (sec): 3.75 - samples/sec: 3267.13 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:51:08,190 epoch 6 - iter 72/121 - loss 0.03218689 - time (sec): 4.55 - samples/sec: 3276.40 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:51:08,940 epoch 6 - iter 84/121 - loss 0.03699222 - time (sec): 5.30 - samples/sec: 3288.75 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:51:09,668 epoch 6 - iter 96/121 - loss 0.03583866 - time (sec): 6.03 - samples/sec: 3272.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:51:10,413 epoch 6 - iter 108/121 - loss 0.03569511 - time (sec): 6.78 - samples/sec: 3268.97 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:51:11,175 epoch 6 - iter 120/121 - loss 0.03576859 - time (sec): 7.54 - samples/sec: 3269.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:11,220 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:11,220 EPOCH 6 done: loss 0.0356 - lr: 0.000022
2023-10-17 10:51:11,970 DEV : loss 0.1540733426809311 - f1-score (micro avg) 0.8438
2023-10-17 10:51:11,975 saving best model
2023-10-17 10:51:12,529 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:13,352 epoch 7 - iter 12/121 - loss 0.03229565 - time (sec): 0.82 - samples/sec: 3322.96 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:51:14,033 epoch 7 - iter 24/121 - loss 0.02874215 - time (sec): 1.50 - samples/sec: 3195.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:14,762 epoch 7 - iter 36/121 - loss 0.03499203 - time (sec): 2.23 - samples/sec: 3250.18 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:51:15,468 epoch 7 - iter 48/121 - loss 0.03181783 - time (sec): 2.93 - samples/sec: 3277.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:51:16,232 epoch 7 - iter 60/121 - loss 0.02848033 - time (sec): 3.70 - samples/sec: 3299.28 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:51:17,088 epoch 7 - iter 72/121 - loss 0.02842693 - time (sec): 4.55 - samples/sec: 3298.42 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:51:17,859 epoch 7 - iter 84/121 - loss 0.02616365 - time (sec): 5.33 - samples/sec: 3237.01 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:51:18,666 epoch 7 - iter 96/121 - loss 0.02440421 - time (sec): 6.13 - samples/sec: 3249.58 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:51:19,407 epoch 7 - iter 108/121 - loss 0.02378147 - time (sec): 6.87 - samples/sec: 3222.47 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:51:20,149 epoch 7 - iter 120/121 - loss 0.02375173 - time (sec): 7.61 - samples/sec: 3225.51 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:51:20,199 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:20,199 EPOCH 7 done: loss 0.0236 - lr: 0.000017
2023-10-17 10:51:20,946 DEV : loss 0.18394897878170013 - f1-score (micro avg) 0.8486
2023-10-17 10:51:20,951 saving best model
2023-10-17 10:51:21,446 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:22,256 epoch 8 - iter 12/121 - loss 0.01483429 - time (sec): 0.81 - samples/sec: 2850.77 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:51:23,054 epoch 8 - iter 24/121 - loss 0.01352030 - time (sec): 1.60 - samples/sec: 3133.34 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:51:23,784 epoch 8 - iter 36/121 - loss 0.01328131 - time (sec): 2.33 - samples/sec: 3155.77 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:51:24,708 epoch 8 - iter 48/121 - loss 0.02044563 - time (sec): 3.26 - samples/sec: 3066.02 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:51:25,493 epoch 8 - iter 60/121 - loss 0.01788009 - time (sec): 4.04 - samples/sec: 3057.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:51:26,287 epoch 8 - iter 72/121 - loss 0.01649561 - time (sec): 4.84 - samples/sec: 3080.65 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:51:27,048 epoch 8 - iter 84/121 - loss 0.01657942 - time (sec): 5.60 - samples/sec: 3117.59 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:51:27,791 epoch 8 - iter 96/121 - loss 0.01844025 - time (sec): 6.34 - samples/sec: 3112.92 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:51:28,510 epoch 8 - iter 108/121 - loss 0.01744183 - time (sec): 7.06 - samples/sec: 3138.61 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:51:29,269 epoch 8 - iter 120/121 - loss 0.01642666 - time (sec): 7.82 - samples/sec: 3150.18 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:51:29,319 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:29,319 EPOCH 8 done: loss 0.0165 - lr: 0.000011
2023-10-17 10:51:30,065 DEV : loss 0.2010391652584076 - f1-score (micro avg) 0.8458
2023-10-17 10:51:30,070 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:30,833 epoch 9 - iter 12/121 - loss 0.00202037 - time (sec): 0.76 - samples/sec: 3322.91 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:51:31,620 epoch 9 - iter 24/121 - loss 0.00557648 - time (sec): 1.55 - samples/sec: 3299.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:51:32,334 epoch 9 - iter 36/121 - loss 0.00835979 - time (sec): 2.26 - samples/sec: 3355.49 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:51:33,072 epoch 9 - iter 48/121 - loss 0.00997265 - time (sec): 3.00 - samples/sec: 3296.50 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:51:33,794 epoch 9 - iter 60/121 - loss 0.01256146 - time (sec): 3.72 - samples/sec: 3290.16 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:51:34,472 epoch 9 - iter 72/121 - loss 0.01172354 - time (sec): 4.40 - samples/sec: 3276.50 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:51:35,286 epoch 9 - iter 84/121 - loss 0.01341421 - time (sec): 5.22 - samples/sec: 3266.20 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:51:36,037 epoch 9 - iter 96/121 - loss 0.01221592 - time (sec): 5.97 - samples/sec: 3278.11 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:51:36,870 epoch 9 - iter 108/121 - loss 0.01168407 - time (sec): 6.80 - samples/sec: 3266.46 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:51:37,591 epoch 9 - iter 120/121 - loss 0.01135339 - time (sec): 7.52 - samples/sec: 3264.48 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:51:37,643 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:37,644 EPOCH 9 done: loss 0.0113 - lr: 0.000006
2023-10-17 10:51:38,396 DEV : loss 0.21391461789608002 - f1-score (micro avg) 0.8365
2023-10-17 10:51:38,401 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:39,098 epoch 10 - iter 12/121 - loss 0.00259867 - time (sec): 0.70 - samples/sec: 3364.48 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:51:39,884 epoch 10 - iter 24/121 - loss 0.01367263 - time (sec): 1.48 - samples/sec: 3276.75 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:51:40,587 epoch 10 - iter 36/121 - loss 0.01004612 - time (sec): 2.19 - samples/sec: 3290.10 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:51:41,349 epoch 10 - iter 48/121 - loss 0.00780401 - time (sec): 2.95 - samples/sec: 3354.48 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:51:42,214 epoch 10 - iter 60/121 - loss 0.00683455 - time (sec): 3.81 - samples/sec: 3306.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:51:42,924 epoch 10 - iter 72/121 - loss 0.00900062 - time (sec): 4.52 - samples/sec: 3303.98 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:51:43,632 epoch 10 - iter 84/121 - loss 0.00996906 - time (sec): 5.23 - samples/sec: 3332.64 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:51:44,407 epoch 10 - iter 96/121 - loss 0.00903771 - time (sec): 6.01 - samples/sec: 3329.89 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:51:45,110 epoch 10 - iter 108/121 - loss 0.00926567 - time (sec): 6.71 - samples/sec: 3323.21 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:51:45,842
epoch 10 - iter 120/121 - loss 0.00996103 - time (sec): 7.44 - samples/sec: 3313.35 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:51:45,888 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:45,889 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-17 10:51:46,646 DEV : loss 0.21232573688030243 - f1-score (micro avg) 0.85
2023-10-17 10:51:46,651 saving best model
2023-10-17 10:51:47,559 ----------------------------------------------------------------------------------------------------
2023-10-17 10:51:47,560 Loading model from best epoch ...
2023-10-17 10:51:48,927 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 10:51:49,597
Results:
- F-score (micro) 0.832
- F-score (macro) 0.5646
- Accuracy 0.731

By class:
              precision    recall  f1-score   support

        pers     0.8732    0.8921    0.8826       139
       scope     0.8540    0.9070    0.8797       129
        work     0.6667    0.8000    0.7273        80
         loc     0.6667    0.2222    0.3333         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.8122    0.8528    0.8320       360
   macro avg     0.6121    0.5643    0.5646       360
weighted avg     0.8080    0.8528    0.8259       360

2023-10-17 10:51:49,597 ----------------------------------------------------------------------------------------------------