2023-10-17 10:30:46,480 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:30:46,481 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:30:46,481 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 Train: 966 sentences
2023-10-17 10:30:46,481 (train_with_dev=False, train_with_test=False)
2023-10-17 10:30:46,481 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 Training Params:
2023-10-17 10:30:46,481  - learning_rate: "3e-05"
2023-10-17 10:30:46,481  - mini_batch_size: "8"
2023-10-17 10:30:46,482  - max_epochs: "10"
2023-10-17 10:30:46,482  - shuffle: "True"
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Plugins:
2023-10-17 10:30:46,482  - TensorboardLogger
2023-10-17 10:30:46,482  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:30:46,482  - metric: "('micro avg', 'f1-score')"
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Computation:
2023-10-17 10:30:46,482  - compute on device: cuda:0
2023-10-17 10:30:46,482  - embedding storage: none
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:30:47,180 epoch 1 - iter 12/121 - loss 3.32014436 - time (sec): 0.70 - samples/sec: 3341.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:30:47,895 epoch 1 - iter 24/121 - loss 3.03698276 - time (sec): 1.41 - samples/sec: 3164.77 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:30:48,614 epoch 1 - iter 36/121 - loss 2.60289546 - time (sec): 2.13 - samples/sec: 3210.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:30:49,403 epoch 1 - iter 48/121 - loss 2.12575385 - time (sec): 2.92 - samples/sec: 3218.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:30:50,203 epoch 1 - iter 60/121 - loss 1.79218853 - time (sec): 3.72 - samples/sec: 3229.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:30:50,952 epoch 1 - iter 72/121 - loss 1.58367041 - time (sec): 4.47 - samples/sec: 3227.29 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:30:51,711 epoch 1 - iter 84/121 - loss 1.41113578 - time (sec): 5.23 - samples/sec: 3230.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:30:52,492 epoch 1 - iter 96/121 - loss 1.26321860 - time (sec): 6.01 - samples/sec: 3242.76 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:30:53,219 epoch 1 - iter 108/121 - loss 1.16515662 - time (sec): 6.74 - samples/sec: 3258.23 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:30:54,001 epoch 1 - iter 120/121 - loss 1.07050963 - time (sec): 7.52 - samples/sec: 3269.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:30:54,065 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:54,065 EPOCH 1 done: loss 1.0649 - lr: 0.000030
2023-10-17 10:30:54,663 DEV : loss 0.2681349813938141 - f1-score (micro avg) 0.4935
2023-10-17 10:30:54,684 saving best model
2023-10-17 10:30:55,080 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:55,820 epoch 2 - iter 12/121 - loss 0.25245481 - time (sec): 0.74 - samples/sec: 3337.64 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:30:56,535 epoch 2 - iter 24/121 - loss 0.26119176 - time (sec): 1.45 - samples/sec: 3133.98 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:30:57,237 epoch 2 - iter 36/121 - loss 0.24934766 - time (sec): 2.16 - samples/sec: 3221.62 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:30:58,019 epoch 2 - iter 48/121 - loss 0.23822482 - time (sec): 2.94 - samples/sec: 3309.63 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:30:58,819 epoch 2 - iter 60/121 - loss 0.22963477 - time (sec): 3.74 - samples/sec: 3281.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:30:59,537 epoch 2 - iter 72/121 - loss 0.22640817 - time (sec): 4.46 - samples/sec: 3325.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:31:00,284 epoch 2 - iter 84/121 - loss 0.21803280 - time (sec): 5.20 - samples/sec: 3297.21 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:31:01,073 epoch 2 - iter 96/121 - loss 0.21012624 - time (sec): 5.99 - samples/sec: 3292.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:31:01,889 epoch 2 - iter 108/121 - loss 0.20287535 - time (sec): 6.81 - samples/sec: 3265.45 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:31:02,631 epoch 2 - iter 120/121 - loss 0.19816081 - time (sec): 7.55 - samples/sec: 3253.85 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:31:02,683 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:02,684 EPOCH 2 done: loss 0.1970 - lr: 0.000027
2023-10-17 10:31:03,620 DEV : loss 0.14390070736408234 - f1-score (micro avg) 0.7865
2023-10-17 10:31:03,626 saving best model
2023-10-17 10:31:04,142 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:04,908 epoch 3 - iter 12/121 - loss 0.12851785 - time (sec): 0.76 - samples/sec: 3266.94 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:31:05,676 epoch 3 - iter 24/121 - loss 0.11801002 - time (sec): 1.53 - samples/sec: 3172.63 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:31:06,544 epoch 3 - iter 36/121 - loss 0.11883823 - time (sec): 2.40 - samples/sec: 3152.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:31:07,247 epoch 3 - iter 48/121 - loss 0.11584504 - time (sec): 3.10 - samples/sec: 3236.72 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:31:07,977 epoch 3 - iter 60/121 - loss 0.11598349 - time (sec): 3.83 - samples/sec: 3231.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:31:08,729 epoch 3 - iter 72/121 - loss 0.11195592 - time (sec): 4.59 - samples/sec: 3307.42 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:31:09,437 epoch 3 - iter 84/121 - loss 0.11244840 - time (sec): 5.29 - samples/sec: 3263.56 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:31:10,139 epoch 3 - iter 96/121 - loss 0.11154599 - time (sec): 6.00 - samples/sec: 3284.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:31:10,902 epoch 3 - iter 108/121 - loss 0.11047003 - time (sec): 6.76 - samples/sec: 3316.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:31:11,650 epoch 3 - iter 120/121 - loss 0.11039208 - time (sec): 7.51 - samples/sec: 3286.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:31:11,702 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:11,702 EPOCH 3 done: loss 0.1103 - lr: 0.000023
2023-10-17 10:31:12,489 DEV : loss 0.12160782516002655 - f1-score (micro avg) 0.8291
2023-10-17 10:31:12,496 saving best model
2023-10-17 10:31:13,042 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:13,802 epoch 4 - iter 12/121 - loss 0.11078327 - time (sec): 0.76 - samples/sec: 3345.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:31:14,541 epoch 4 - iter 24/121 - loss 0.08753515 - time (sec): 1.49 - samples/sec: 3287.89 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:31:15,310 epoch 4 - iter 36/121 - loss 0.07917905 - time (sec): 2.26 - samples/sec: 3283.48 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:31:16,076 epoch 4 - iter 48/121 - loss 0.08826692 - time (sec): 3.03 - samples/sec: 3202.60 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:31:16,818 epoch 4 - iter 60/121 - loss 0.08527749 - time (sec): 3.77 - samples/sec: 3296.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:31:17,630 epoch 4 - iter 72/121 - loss 0.07703033 - time (sec): 4.58 - samples/sec: 3254.92 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:31:18,439 epoch 4 - iter 84/121 - loss 0.07353993 - time (sec): 5.39 - samples/sec: 3231.75 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:31:19,176 epoch 4 - iter 96/121 - loss 0.08026535 - time (sec): 6.13 - samples/sec: 3235.43 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:31:19,945 epoch 4 - iter 108/121 - loss 0.07860544 - time (sec): 6.90 - samples/sec: 3210.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:31:20,747 epoch 4 - iter 120/121 - loss 0.07637764 - time (sec): 7.70 - samples/sec: 3188.34 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:31:20,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:20,801 EPOCH 4 done: loss 0.0759 - lr: 0.000020
2023-10-17 10:31:21,561 DEV : loss 0.15159755945205688 - f1-score (micro avg) 0.8256
2023-10-17 10:31:21,567 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:22,406 epoch 5 - iter 12/121 - loss 0.04424314 - time (sec): 0.84 - samples/sec: 3332.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:31:23,111 epoch 5 - iter 24/121 - loss 0.04608236 - time (sec): 1.54 - samples/sec: 3209.53 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:31:23,879 epoch 5 - iter 36/121 - loss 0.05514976 - time (sec): 2.31 - samples/sec: 3254.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:31:24,663 epoch 5 - iter 48/121 - loss 0.05155823 - time (sec): 3.10 - samples/sec: 3216.27 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:31:25,390 epoch 5 - iter 60/121 - loss 0.04998764 - time (sec): 3.82 - samples/sec: 3231.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:31:26,176 epoch 5 - iter 72/121 - loss 0.04908976 - time (sec): 4.61 - samples/sec: 3215.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:31:26,888 epoch 5 - iter 84/121 - loss 0.05123180 - time (sec): 5.32 - samples/sec: 3284.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:31:27,640 epoch 5 - iter 96/121 - loss 0.05213783 - time (sec): 6.07 - samples/sec: 3239.37 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:31:28,342 epoch 5 - iter 108/121 - loss 0.05143224 - time (sec): 6.77 - samples/sec: 3261.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:31:29,078 epoch 5 - iter 120/121 - loss 0.05380168 - time (sec): 7.51 - samples/sec: 3269.60 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:31:29,148 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:29,148 EPOCH 5 done: loss 0.0535 - lr: 0.000017
2023-10-17 10:31:29,934 DEV : loss 0.1807931661605835 - f1-score (micro avg) 0.8184
2023-10-17 10:31:29,939 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:30,664 epoch 6 - iter 12/121 - loss 0.02907978 - time (sec): 0.72 - samples/sec: 3666.62 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:31:31,448 epoch 6 - iter 24/121 - loss 0.03751196 - time (sec): 1.51 - samples/sec: 3403.32 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:31:32,212 epoch 6 - iter 36/121 - loss 0.03357465 - time (sec): 2.27 - samples/sec: 3379.66 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:31:32,943 epoch 6 - iter 48/121 - loss 0.03308335 - time (sec): 3.00 - samples/sec: 3306.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:31:33,743 epoch 6 - iter 60/121 - loss 0.03111144 - time (sec): 3.80 - samples/sec: 3262.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:31:34,478 epoch 6 - iter 72/121 - loss 0.03577265 - time (sec): 4.54 - samples/sec: 3274.54 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:31:35,208 epoch 6 - iter 84/121 - loss 0.03581138 - time (sec): 5.27 - samples/sec: 3276.39 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:31:35,984 epoch 6 - iter 96/121 - loss 0.03882524 - time (sec): 6.04 - samples/sec: 3271.90 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:31:36,683 epoch 6 - iter 108/121 - loss 0.04161598 - time (sec): 6.74 - samples/sec: 3285.34 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:31:37,436 epoch 6 - iter 120/121 - loss 0.04249901 - time (sec): 7.50 - samples/sec: 3284.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:31:37,483 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:37,483 EPOCH 6 done: loss 0.0428 - lr: 0.000013
2023-10-17 10:31:38,252 DEV : loss 0.1748889535665512 - f1-score (micro avg) 0.8253
2023-10-17 10:31:38,257 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:39,002 epoch 7 - iter 12/121 - loss 0.03329241 - time (sec): 0.74 - samples/sec: 3548.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:31:39,762 epoch 7 - iter 24/121 - loss 0.03701914 - time (sec): 1.50 - samples/sec: 3354.87 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:31:40,548 epoch 7 - iter 36/121 - loss 0.03906978 - time (sec): 2.29 - samples/sec: 3286.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:31:41,283 epoch 7 - iter 48/121 - loss 0.03915569 - time (sec): 3.02 - samples/sec: 3288.91 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:31:42,096 epoch 7 - iter 60/121 - loss 0.03641657 - time (sec): 3.84 - samples/sec: 3299.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:31:42,798 epoch 7 - iter 72/121 - loss 0.03452311 - time (sec): 4.54 - samples/sec: 3329.70 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:31:43,580 epoch 7 - iter 84/121 - loss 0.03223815 - time (sec): 5.32 - samples/sec: 3301.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:31:44,284 epoch 7 - iter 96/121 - loss 0.03201468 - time (sec): 6.03 - samples/sec: 3266.72 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:31:45,041 epoch 7 - iter 108/121 - loss 0.03162600 - time (sec): 6.78 - samples/sec: 3269.40 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:31:45,765 epoch 7 - iter 120/121 - loss 0.03052726 - time (sec): 7.51 - samples/sec: 3270.34 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:31:45,821 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:45,821 EPOCH 7 done: loss 0.0307 - lr: 0.000010
2023-10-17 10:31:46,595 DEV : loss 0.19148583710193634 - f1-score (micro avg) 0.8285
2023-10-17 10:31:46,602 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:47,350 epoch 8 - iter 12/121 - loss 0.02759122 - time (sec): 0.75 - samples/sec: 3163.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:31:48,073 epoch 8 - iter 24/121 - loss 0.01779458 - time (sec): 1.47 - samples/sec: 3283.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:31:48,882 epoch 8 - iter 36/121 - loss 0.01975104 - time (sec): 2.28 - samples/sec: 3250.84 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:31:49,674 epoch 8 - iter 48/121 - loss 0.02004782 - time (sec): 3.07 - samples/sec: 3265.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:31:50,435 epoch 8 - iter 60/121 - loss 0.02196078 - time (sec): 3.83 - samples/sec: 3274.88 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:51,149 epoch 8 - iter 72/121 - loss 0.02076302 - time (sec): 4.55 - samples/sec: 3326.58 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:51,896 epoch 8 - iter 84/121 - loss 0.02046546 - time (sec): 5.29 - samples/sec: 3305.81 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:52,592 epoch 8 - iter 96/121 - loss 0.02295288 - time (sec): 5.99 - samples/sec: 3273.69 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:53,434 epoch 8 - iter 108/121 - loss 0.02253970 - time (sec): 6.83 - samples/sec: 3288.26 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:31:54,157 epoch 8 - iter 120/121 - loss 0.02232507 - time (sec): 7.55 - samples/sec: 3256.21 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:31:54,209 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:54,209 EPOCH 8 done: loss 0.0222 - lr: 0.000007
2023-10-17 10:31:54,991 DEV : loss 0.20282834768295288 - f1-score (micro avg) 0.8335
2023-10-17 10:31:54,998 saving best model
2023-10-17 10:31:55,515 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:56,245 epoch 9 - iter 12/121 - loss 0.02835444 - time (sec): 0.73 - samples/sec: 3199.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:57,028 epoch 9 - iter 24/121 - loss 0.02810493 - time (sec): 1.51 - samples/sec: 3014.12 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:57,834 epoch 9 - iter 36/121 - loss 0.02511085 - time (sec): 2.32 - samples/sec: 3074.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:58,596 epoch 9 - iter 48/121 - loss 0.01989365 - time (sec): 3.08 - samples/sec: 3162.59 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:59,361 epoch 9 - iter 60/121 - loss 0.02370575 - time (sec): 3.84 - samples/sec: 3136.55 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:32:00,170 epoch 9 - iter 72/121 - loss 0.02151435 - time (sec): 4.65 - samples/sec: 3130.56 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:32:00,899 epoch 9 - iter 84/121 - loss 0.02058558 - time (sec): 5.38 - samples/sec: 3137.75 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:32:01,638 epoch 9 - iter 96/121 - loss 0.01989155 - time (sec): 6.12 - samples/sec: 3183.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:32:02,392 epoch 9 - iter 108/121 - loss 0.01893313 - time (sec): 6.88 - samples/sec: 3207.02 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:32:03,161 epoch 9 - iter 120/121 - loss 0.01778422 - time (sec): 7.64 - samples/sec: 3210.96 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:32:03,225 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:03,226 EPOCH 9 done: loss 0.0176 - lr: 0.000004
2023-10-17 10:32:04,012 DEV : loss 0.21387448906898499 - f1-score (micro avg) 0.8369
2023-10-17 10:32:04,018 saving best model
2023-10-17 10:32:04,521 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:05,226 epoch 10 - iter 12/121 - loss 0.00693211 - time (sec): 0.70 - samples/sec: 3366.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:32:05,952 epoch 10 - iter 24/121 - loss 0.00979826 - time (sec): 1.43 - samples/sec: 3436.56 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:32:06,680 epoch 10 - iter 36/121 - loss 0.00987826 - time (sec): 2.16 - samples/sec: 3403.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:32:07,502 epoch 10 - iter 48/121 - loss 0.00849950 - time (sec): 2.98 - samples/sec: 3324.42 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:32:08,354 epoch 10 - iter 60/121 - loss 0.00902092 - time (sec): 3.83 - samples/sec: 3253.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:32:09,127 epoch 10 - iter 72/121 - loss 0.00916563 - time (sec): 4.60 - samples/sec: 3237.50 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:32:09,874 epoch 10 - iter 84/121 - loss 0.00985326 - time (sec): 5.35 - samples/sec: 3243.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:32:10,679 epoch 10 - iter 96/121 - loss 0.01078985 - time (sec): 6.16 - samples/sec: 3250.04 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:32:11,422 epoch 10 - iter 108/121 - loss 0.01288405 - time (sec): 6.90 - samples/sec: 3244.72 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:32:12,146 epoch 10 - iter 120/121 - loss 0.01283307 - time (sec): 7.62 - samples/sec: 3234.37 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:32:12,195 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:12,195 EPOCH 10 done: loss 0.0128 - lr: 0.000000
2023-10-17 10:32:12,981 DEV : loss 0.21676737070083618 - f1-score (micro avg) 0.8396
2023-10-17 10:32:12,986 saving best model
2023-10-17 10:32:13,945 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:13,947 Loading model from best epoch ...
2023-10-17 10:32:15,322 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 10:32:16,199 
Results:
- F-score (micro) 0.8093
- F-score (macro) 0.5585
- Accuracy 0.7005

By class:
              precision    recall  f1-score   support

        pers     0.8207    0.8561    0.8380       139
       scope     0.8248    0.8760    0.8496       129
        work     0.7000    0.7875    0.7412        80
         loc     1.0000    0.2222    0.3636         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7941    0.8250    0.8093       360
   macro avg     0.6691    0.5484    0.5585       360
weighted avg     0.7930    0.8250    0.8018       360

2023-10-17 10:32:16,199 ----------------------------------------------------------------------------------------------------
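
The summary rows of the final results table can be cross-checked from the per-class rows. The sketch below is not part of the training run: it copies the rounded values from the log by hand, and the names `per_class` and `f1` are illustrative only. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean over the listed classes, weighted as the support-weighted mean), so results can differ from the logged values in the last decimal because of rounding.

```python
# Per-class rows copied by hand from the "By class" table in the log:
# label -> (precision, recall, f1, support)
per_class = {
    "pers":  (0.8207, 0.8561, 0.8380, 139),
    "scope": (0.8248, 0.8760, 0.8496, 129),
    "work":  (0.7000, 0.7875, 0.7412, 80),
    "loc":   (1.0000, 0.2222, 0.3636, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 from the logged micro-average precision and recall.
micro_f1 = f1(0.7941, 0.8250)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro F1    ~ {micro_f1:.4f}")
print(f"macro F1    ~ {macro_f1:.4f}")
print(f"weighted F1 ~ {weighted_f1:.4f}")
print(f"support     = {total_support}")
```

The gap between micro and macro F1 comes from the two low-support classes (`loc` with 9 and `date` with 3 test entities), which count equally in the macro average but contribute little to the micro average.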