2023-10-17 23:09:42,760 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,761 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 23:09:42,761 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,761 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 23:09:42,761 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,761 Train: 5901 sentences
2023-10-17 23:09:42,761 (train_with_dev=False, train_with_test=False)
2023-10-17 23:09:42,761 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,761 Training Params:
2023-10-17 23:09:42,761 - learning_rate: "3e-05"
2023-10-17 23:09:42,761 - mini_batch_size: "8"
2023-10-17 23:09:42,761 - max_epochs: "10"
2023-10-17 23:09:42,761 - shuffle: "True"
2023-10-17 23:09:42,761 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,761 Plugins:
2023-10-17 23:09:42,761 - TensorboardLogger
2023-10-17 23:09:42,761 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 23:09:42,761 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,761 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 23:09:42,761 - metric: "('micro avg', 'f1-score')"
2023-10-17 23:09:42,761 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,761 Computation:
2023-10-17 23:09:42,761 - compute on device: cuda:0
2023-10-17 23:09:42,762 - embedding storage: none
2023-10-17 23:09:42,762 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,762 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 23:09:42,762 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,762 ----------------------------------------------------------------------------------------------------
2023-10-17 23:09:42,762 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 23:09:48,039 epoch 1 - iter 73/738 - loss 3.19073122 - time (sec): 5.28 - samples/sec: 3191.87 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:09:53,271 epoch 1 - iter 146/738 - loss 2.15194656 - time (sec): 10.51 - samples/sec: 3233.82 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:09:58,812 epoch 1 - iter 219/738 - loss 1.58213457 - time (sec): 16.05 - samples/sec: 3238.82 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:10:04,082 epoch 1 - iter 292/738 - loss 1.29002871 - time (sec): 21.32 - samples/sec: 3228.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:10:09,235 epoch 1 - iter 365/738 - loss 1.10948564 - time (sec): 26.47 - samples/sec: 3217.15 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:10:13,873 epoch 1 - iter 438/738 - loss 0.98852831 - time (sec): 31.11 - samples/sec: 3214.60 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:10:18,506 epoch 1 - iter 511/738 - loss 0.89161312 - time (sec): 35.74 - samples/sec: 3224.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:10:23,989 epoch 1 - iter 584/738 - loss 0.80257303 - time (sec): 41.23 - samples/sec: 3249.04 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:10:28,747 epoch 1 - iter 657/738 - loss 0.74356117 - time (sec): 45.98 - samples/sec: 3234.03 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:10:33,574 epoch 1 - iter 730/738 - loss 0.68669956 - time (sec): 50.81 - samples/sec: 3244.15 - lr: 0.000030 - momentum: 0.000000
2023-10-17 23:10:34,058 ----------------------------------------------------------------------------------------------------
2023-10-17 23:10:34,058 EPOCH 1 done: loss 0.6819 - lr: 0.000030
2023-10-17 23:10:40,005 DEV : loss 0.1276639848947525 - f1-score (micro avg) 0.7373
2023-10-17 23:10:40,039 saving best model
2023-10-17 23:10:40,487 ----------------------------------------------------------------------------------------------------
2023-10-17 23:10:45,940 epoch 2 - iter 73/738 - loss 0.14984187 - time (sec): 5.45 - samples/sec: 3104.96 - lr: 0.000030 - momentum: 0.000000
2023-10-17 23:10:51,172 epoch 2 - iter 146/738 - loss 0.13233162 - time (sec): 10.68 - samples/sec: 3089.34 - lr: 0.000029 - momentum: 0.000000
2023-10-17 23:10:55,945 epoch 2 - iter 219/738 - loss 0.12669524 - time (sec): 15.46 - samples/sec: 3192.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 23:11:00,853 epoch 2 - iter 292/738 - loss 0.12635740 - time (sec): 20.36 - samples/sec: 3201.48 - lr: 0.000029 - momentum: 0.000000
2023-10-17 23:11:05,967 epoch 2 - iter 365/738 - loss 0.12718419 - time (sec): 25.48 - samples/sec: 3154.73 - lr: 0.000028 - momentum: 0.000000
2023-10-17 23:11:10,919 epoch 2 - iter 438/738 - loss 0.12366621 - time (sec): 30.43 - samples/sec: 3195.23 - lr: 0.000028 - momentum: 0.000000
2023-10-17 23:11:15,809 epoch 2 - iter 511/738 - loss 0.12210520 - time (sec): 35.32 - samples/sec: 3223.55 - lr: 0.000028 - momentum: 0.000000
2023-10-17 23:11:21,889 epoch 2 - iter 584/738 - loss 0.11882532 - time (sec): 41.40 - samples/sec: 3223.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:11:26,949 epoch 2 - iter 657/738 - loss 0.11894765 - time (sec): 46.46 - samples/sec: 3221.37 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:11:31,595 epoch 2 - iter 730/738 - loss 0.11967070 - time (sec): 51.11 - samples/sec: 3226.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:11:32,023 ----------------------------------------------------------------------------------------------------
2023-10-17 23:11:32,023 EPOCH 2 done: loss 0.1194 - lr: 0.000027
2023-10-17 23:11:43,766 DEV : loss 0.09906981885433197 - f1-score (micro avg) 0.8284
2023-10-17 23:11:43,814 saving best model
2023-10-17 23:11:44,354 ----------------------------------------------------------------------------------------------------
2023-10-17 23:11:49,669 epoch 3 - iter 73/738 - loss 0.06177229 - time (sec): 5.31 - samples/sec: 3036.84 - lr: 0.000026 - momentum: 0.000000
2023-10-17 23:11:54,599 epoch 3 - iter 146/738 - loss 0.06208086 - time (sec): 10.24 - samples/sec: 3115.58 - lr: 0.000026 - momentum: 0.000000
2023-10-17 23:11:59,668 epoch 3 - iter 219/738 - loss 0.06901740 - time (sec): 15.31 - samples/sec: 3123.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 23:12:04,924 epoch 3 - iter 292/738 - loss 0.07155310 - time (sec): 20.57 - samples/sec: 3126.20 - lr: 0.000025 - momentum: 0.000000
2023-10-17 23:12:10,335 epoch 3 - iter 365/738 - loss 0.07449535 - time (sec): 25.98 - samples/sec: 3154.16 - lr: 0.000025 - momentum: 0.000000
2023-10-17 23:12:15,067 epoch 3 - iter 438/738 - loss 0.07444339 - time (sec): 30.71 - samples/sec: 3174.87 - lr: 0.000025 - momentum: 0.000000
2023-10-17 23:12:20,629 epoch 3 - iter 511/738 - loss 0.07543798 - time (sec): 36.27 - samples/sec: 3196.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:12:25,578 epoch 3 - iter 584/738 - loss 0.07404811 - time (sec): 41.22 - samples/sec: 3192.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:12:30,450 epoch 3 - iter 657/738 - loss 0.07268849 - time (sec): 46.09 - samples/sec: 3206.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:12:35,719 epoch 3 - iter 730/738 - loss 0.07290388 - time (sec): 51.36 - samples/sec: 3211.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 23:12:36,157 ----------------------------------------------------------------------------------------------------
2023-10-17 23:12:36,158 EPOCH 3 done: loss 0.0740 - lr: 0.000023
2023-10-17 23:12:47,879 DEV : loss 0.10388551652431488 - f1-score (micro avg) 0.8357
2023-10-17 23:12:47,918 saving best model
2023-10-17 23:12:48,437 ----------------------------------------------------------------------------------------------------
2023-10-17 23:12:53,475 epoch 4 - iter 73/738 - loss 0.03830018 - time (sec): 5.04 - samples/sec: 3392.37 - lr: 0.000023 - momentum: 0.000000
2023-10-17 23:12:58,170 epoch 4 - iter 146/738 - loss 0.04844825 - time (sec): 9.73 - samples/sec: 3325.83 - lr: 0.000023 - momentum: 0.000000
2023-10-17 23:13:03,973 epoch 4 - iter 219/738 - loss 0.05141719 - time (sec): 15.53 - samples/sec: 3268.78 - lr: 0.000022 - momentum: 0.000000
2023-10-17 23:13:09,017 epoch 4 - iter 292/738 - loss 0.05536399 - time (sec): 20.58 - samples/sec: 3302.66 - lr: 0.000022 - momentum: 0.000000
2023-10-17 23:13:13,618 epoch 4 - iter 365/738 - loss 0.05397456 - time (sec): 25.18 - samples/sec: 3307.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 23:13:18,267 epoch 4 - iter 438/738 - loss 0.05338880 - time (sec): 29.83 - samples/sec: 3282.20 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:13:23,460 epoch 4 - iter 511/738 - loss 0.05233451 - time (sec): 35.02 - samples/sec: 3277.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:13:28,930 epoch 4 - iter 584/738 - loss 0.05264342 - time (sec): 40.49 - samples/sec: 3252.63 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:13:33,763 epoch 4 - iter 657/738 - loss 0.05249207 - time (sec): 45.32 - samples/sec: 3248.08 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:13:38,970 epoch 4 - iter 730/738 - loss 0.05189176 - time (sec): 50.53 - samples/sec: 3249.70 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:13:39,666 ----------------------------------------------------------------------------------------------------
2023-10-17 23:13:39,667 EPOCH 4 done: loss 0.0518 - lr: 0.000020
2023-10-17 23:13:51,275 DEV : loss 0.12748253345489502 - f1-score (micro avg) 0.8524
2023-10-17 23:13:51,312 saving best model
2023-10-17 23:13:51,854 ----------------------------------------------------------------------------------------------------
2023-10-17 23:13:56,810 epoch 5 - iter 73/738 - loss 0.03736179 - time (sec): 4.95 - samples/sec: 3201.82 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:14:01,605 epoch 5 - iter 146/738 - loss 0.03929666 - time (sec): 9.75 - samples/sec: 3245.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:14:06,236 epoch 5 - iter 219/738 - loss 0.03541404 - time (sec): 14.38 - samples/sec: 3334.61 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:14:11,236 epoch 5 - iter 292/738 - loss 0.03601041 - time (sec): 19.38 - samples/sec: 3341.30 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:14:15,907 epoch 5 - iter 365/738 - loss 0.03509454 - time (sec): 24.05 - samples/sec: 3338.83 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:14:21,821 epoch 5 - iter 438/738 - loss 0.03662298 - time (sec): 29.97 - samples/sec: 3348.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:14:27,303 epoch 5 - iter 511/738 - loss 0.03680615 - time (sec): 35.45 - samples/sec: 3328.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:14:32,171 epoch 5 - iter 584/738 - loss 0.03686372 - time (sec): 40.31 - samples/sec: 3298.46 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:14:36,629 epoch 5 - iter 657/738 - loss 0.03620067 - time (sec): 44.77 - samples/sec: 3285.23 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:14:41,982 epoch 5 - iter 730/738 - loss 0.03656225 - time (sec): 50.13 - samples/sec: 3286.04 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:14:42,562 ----------------------------------------------------------------------------------------------------
2023-10-17 23:14:42,563 EPOCH 5 done: loss 0.0369 - lr: 0.000017
2023-10-17 23:14:54,151 DEV : loss 0.15179269015789032 - f1-score (micro avg) 0.8499
2023-10-17 23:14:54,184 ----------------------------------------------------------------------------------------------------
2023-10-17 23:14:59,556 epoch 6 - iter 73/738 - loss 0.02916893 - time (sec): 5.37 - samples/sec: 3378.62 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:15:04,598 epoch 6 - iter 146/738 - loss 0.02243981 - time (sec): 10.41 - samples/sec: 3287.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:15:09,232 epoch 6 - iter 219/738 - loss 0.02328755 - time (sec): 15.05 - samples/sec: 3321.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:15:15,781 epoch 6 - iter 292/738 - loss 0.02427567 - time (sec): 21.59 - samples/sec: 3235.07 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:15:20,572 epoch 6 - iter 365/738 - loss 0.02457259 - time (sec): 26.39 - samples/sec: 3231.48 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:15:25,509 epoch 6 - iter 438/738 - loss 0.02583212 - time (sec): 31.32 - samples/sec: 3216.36 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:15:30,690 epoch 6 - iter 511/738 - loss 0.02625489 - time (sec): 36.50 - samples/sec: 3224.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:15:35,595 epoch 6 - iter 584/738 - loss 0.02561750 - time (sec): 41.41 - samples/sec: 3214.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:15:40,205 epoch 6 - iter 657/738 - loss 0.02564556 - time (sec): 46.02 - samples/sec: 3230.41 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:15:45,030 epoch 6 - iter 730/738 - loss 0.02550200 - time (sec): 50.84 - samples/sec: 3239.57 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:15:45,530 ----------------------------------------------------------------------------------------------------
2023-10-17 23:15:45,530 EPOCH 6 done: loss 0.0253 - lr: 0.000013
2023-10-17 23:15:57,118 DEV : loss 0.17110277712345123 - f1-score (micro avg) 0.8594
2023-10-17 23:15:57,152 saving best model
2023-10-17 23:15:57,727 ----------------------------------------------------------------------------------------------------
2023-10-17 23:16:02,253 epoch 7 - iter 73/738 - loss 0.02540609 - time (sec): 4.52 - samples/sec: 3392.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:16:07,182 epoch 7 - iter 146/738 - loss 0.02062297 - time (sec): 9.45 - samples/sec: 3343.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:16:11,938 epoch 7 - iter 219/738 - loss 0.02088188 - time (sec): 14.21 - samples/sec: 3308.10 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:16:16,710 epoch 7 - iter 292/738 - loss 0.02062599 - time (sec): 18.98 - samples/sec: 3286.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:16:22,336 epoch 7 - iter 365/738 - loss 0.01888039 - time (sec): 24.60 - samples/sec: 3287.90 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:16:26,959 epoch 7 - iter 438/738 - loss 0.01920684 - time (sec): 29.23 - samples/sec: 3282.43 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:16:32,361 epoch 7 - iter 511/738 - loss 0.02131205 - time (sec): 34.63 - samples/sec: 3247.63 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:16:37,805 epoch 7 - iter 584/738 - loss 0.02072057 - time (sec): 40.07 - samples/sec: 3244.46 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:16:42,983 epoch 7 - iter 657/738 - loss 0.02077910 - time (sec): 45.25 - samples/sec: 3245.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:16:48,295 epoch 7 - iter 730/738 - loss 0.02022206 - time (sec): 50.56 - samples/sec: 3253.26 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:16:48,888 ----------------------------------------------------------------------------------------------------
2023-10-17 23:16:48,889 EPOCH 7 done: loss 0.0202 - lr: 0.000010
2023-10-17 23:17:00,601 DEV : loss 0.17666654288768768 - f1-score (micro avg) 0.8551
2023-10-17 23:17:00,635 ----------------------------------------------------------------------------------------------------
2023-10-17 23:17:05,647 epoch 8 - iter 73/738 - loss 0.01531019 - time (sec): 5.01 - samples/sec: 3234.30 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:17:10,312 epoch 8 - iter 146/738 - loss 0.01455679 - time (sec): 9.68 - samples/sec: 3163.59 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:17:15,983 epoch 8 - iter 219/738 - loss 0.01387033 - time (sec): 15.35 - samples/sec: 3175.09 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:17:20,948 epoch 8 - iter 292/738 - loss 0.01356124 - time (sec): 20.31 - samples/sec: 3135.26 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:17:26,656 epoch 8 - iter 365/738 - loss 0.01664918 - time (sec): 26.02 - samples/sec: 3151.06 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:17:31,979 epoch 8 - iter 438/738 - loss 0.01621963 - time (sec): 31.34 - samples/sec: 3137.15 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:17:36,582 epoch 8 - iter 511/738 - loss 0.01654134 - time (sec): 35.95 - samples/sec: 3158.08 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:17:41,113 epoch 8 - iter 584/738 - loss 0.01556838 - time (sec): 40.48 - samples/sec: 3184.98 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:17:45,561 epoch 8 - iter 657/738 - loss 0.01548565 - time (sec): 44.93 - samples/sec: 3205.46 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:17:51,126 epoch 8 - iter 730/738 - loss 0.01493462 - time (sec): 50.49 - samples/sec: 3219.65 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:17:52,129 ----------------------------------------------------------------------------------------------------
2023-10-17 23:17:52,130 EPOCH 8 done: loss 0.0151 - lr: 0.000007
2023-10-17 23:18:03,991 DEV : loss 0.17948344349861145 - f1-score (micro avg) 0.8615
2023-10-17 23:18:04,027 saving best model
2023-10-17 23:18:04,556 ----------------------------------------------------------------------------------------------------
2023-10-17 23:18:09,493 epoch 9 - iter 73/738 - loss 0.00850263 - time (sec): 4.93 - samples/sec: 3262.72 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:18:14,417 epoch 9 - iter 146/738 - loss 0.01512494 - time (sec): 9.86 - samples/sec: 3241.41 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:18:19,807 epoch 9 - iter 219/738 - loss 0.01323011 - time (sec): 15.25 - samples/sec: 3261.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:18:24,677 epoch 9 - iter 292/738 - loss 0.01076360 - time (sec): 20.12 - samples/sec: 3256.57 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:18:29,821 epoch 9 - iter 365/738 - loss 0.01053439 - time (sec): 25.26 - samples/sec: 3287.97 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:18:34,680 epoch 9 - iter 438/738 - loss 0.01257739 - time (sec): 30.12 - samples/sec: 3276.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:18:39,356 epoch 9 - iter 511/738 - loss 0.01192892 - time (sec): 34.80 - samples/sec: 3271.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:18:44,914 epoch 9 - iter 584/738 - loss 0.01131486 - time (sec): 40.35 - samples/sec: 3252.99 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:18:49,881 epoch 9 - iter 657/738 - loss 0.01101467 - time (sec): 45.32 - samples/sec: 3255.87 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:18:55,034 epoch 9 - iter 730/738 - loss 0.01067850 - time (sec): 50.47 - samples/sec: 3252.46 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:18:55,753 ----------------------------------------------------------------------------------------------------
2023-10-17 23:18:55,753 EPOCH 9 done: loss 0.0107 - lr: 0.000003
2023-10-17 23:19:07,442 DEV : loss 0.18703721463680267 - f1-score (micro avg) 0.8547
2023-10-17 23:19:07,477 ----------------------------------------------------------------------------------------------------
2023-10-17 23:19:13,177 epoch 10 - iter 73/738 - loss 0.00499895 - time (sec): 5.70 - samples/sec: 3084.23 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:19:18,035 epoch 10 - iter 146/738 - loss 0.00950824 - time (sec): 10.56 - samples/sec: 3242.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:19:23,134 epoch 10 - iter 219/738 - loss 0.00884607 - time (sec): 15.66 - samples/sec: 3211.21 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:19:28,840 epoch 10 - iter 292/738 - loss 0.00992232 - time (sec): 21.36 - samples/sec: 3204.50 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:19:33,908 epoch 10 - iter 365/738 - loss 0.00903071 - time (sec): 26.43 - samples/sec: 3193.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:19:38,750 epoch 10 - iter 438/738 - loss 0.00856681 - time (sec): 31.27 - samples/sec: 3227.00 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:19:43,220 epoch 10 - iter 511/738 - loss 0.00799097 - time (sec): 35.74 - samples/sec: 3247.84 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:19:48,193 epoch 10 - iter 584/738 - loss 0.00745098 - time (sec): 40.71 - samples/sec: 3242.78 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:19:53,198 epoch 10 - iter 657/738 - loss 0.00772499 - time (sec): 45.72 - samples/sec: 3246.86 - lr: 0.000000 - momentum: 0.000000
2023-10-17 23:19:58,146 epoch 10 - iter 730/738 - loss 0.00809159 - time (sec): 50.67 - samples/sec: 3246.74 - lr: 0.000000 - momentum: 0.000000
2023-10-17 23:19:58,702 ----------------------------------------------------------------------------------------------------
2023-10-17 23:19:58,703 EPOCH 10 done: loss 0.0081 - lr: 0.000000
2023-10-17 23:20:10,391 DEV : loss 0.19134128093719482 - f1-score (micro avg) 0.8576
2023-10-17 23:20:10,835 ----------------------------------------------------------------------------------------------------
2023-10-17 23:20:10,836 Loading model from best epoch ...
2023-10-17 23:20:12,318 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 23:20:19,334 Results:
- F-score (micro) 0.8144
- F-score (macro) 0.7223
- Accuracy 0.7034

By class:
              precision    recall  f1-score   support

         loc     0.8638    0.8869    0.8752       858
        pers     0.7860    0.8138    0.7996       537
         org     0.6250    0.5682    0.5952       132
        prod     0.7419    0.7541    0.7480        61
        time     0.5469    0.6481    0.5932        54

   micro avg     0.8045    0.8246    0.8144      1642
   macro avg     0.7127    0.7342    0.7223      1642
weighted avg     0.8042    0.8246    0.8140      1642

2023-10-17 23:20:19,335 ----------------------------------------------------------------------------------------------------