2023-10-17 16:47:16,055 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,056 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Train: 14465 sentences
2023-10-17 16:47:16,057 (train_with_dev=False, train_with_test=False)
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Training Params:
2023-10-17 16:47:16,057  - learning_rate: "3e-05"
2023-10-17 16:47:16,057  - mini_batch_size: "8"
2023-10-17 16:47:16,057  - max_epochs: "10"
2023-10-17 16:47:16,057  - shuffle: "True"
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Plugins:
2023-10-17 16:47:16,057  - TensorboardLogger
2023-10-17 16:47:16,057  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:47:16,057  - metric: "('micro avg', 'f1-score')"
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Computation:
2023-10-17 16:47:16,057  - compute on device: cuda:0
2023-10-17 16:47:16,058  - embedding storage: none
2023-10-17 16:47:16,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,058 Model training base path: "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
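[Editor's note] The configuration dump above corresponds to a Flair fine-tuning run. For readers who want to reproduce a comparable setup, the following is a minimal sketch, not the original hmbench training script: the hmTEAMS backbone name is taken from the logged base path, while the NER_HIPE_2022 loader arguments and label-dictionary call are assumptions about the Flair API and may need adjusting for your Flair version.

# Minimal sketch of a comparable Flair fine-tuning run (assumptions flagged in comments).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 "letemps" French corpus; argument names assumed from the Flair dataset loader.
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Backbone from the logged base path; settings mirror "poolingfirst-layers-1".
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear tag head, no CRF and no RNN, matching the architecture dump above.
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear schedule and 0.1 warmup fraction by default,
# matching the logged Training Params and the LinearScheduler plugin.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
)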
"hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4" 2023-10-17 16:47:16,058 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:47:16,058 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:47:16,058 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 16:47:28,647 epoch 1 - iter 180/1809 - loss 2.37829275 - time (sec): 12.59 - samples/sec: 2902.52 - lr: 0.000003 - momentum: 0.000000 2023-10-17 16:47:41,603 epoch 1 - iter 360/1809 - loss 1.29142331 - time (sec): 25.54 - samples/sec: 2959.16 - lr: 0.000006 - momentum: 0.000000 2023-10-17 16:47:54,374 epoch 1 - iter 540/1809 - loss 0.91673625 - time (sec): 38.31 - samples/sec: 2962.10 - lr: 0.000009 - momentum: 0.000000 2023-10-17 16:48:07,489 epoch 1 - iter 720/1809 - loss 0.72239703 - time (sec): 51.43 - samples/sec: 2963.13 - lr: 0.000012 - momentum: 0.000000 2023-10-17 16:48:20,496 epoch 1 - iter 900/1809 - loss 0.60634230 - time (sec): 64.44 - samples/sec: 2936.89 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:48:33,381 epoch 1 - iter 1080/1809 - loss 0.52686295 - time (sec): 77.32 - samples/sec: 2940.72 - lr: 0.000018 - momentum: 0.000000 2023-10-17 16:48:46,313 epoch 1 - iter 1260/1809 - loss 0.46693740 - time (sec): 90.25 - samples/sec: 2944.10 - lr: 0.000021 - momentum: 0.000000 2023-10-17 16:48:59,460 epoch 1 - iter 1440/1809 - loss 0.42178664 - time (sec): 103.40 - samples/sec: 2949.31 - lr: 0.000024 - momentum: 0.000000 2023-10-17 16:49:12,569 epoch 1 - iter 1620/1809 - loss 0.38783448 - time (sec): 116.51 - samples/sec: 2935.62 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:49:25,738 epoch 1 - iter 1800/1809 - loss 0.36040896 - time (sec): 129.68 - samples/sec: 2918.82 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:49:26,324 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:49:26,324 EPOCH 1 done: loss 0.3594 - lr: 0.000030 2023-10-17 16:49:31,785 DEV : loss 0.10473097860813141 - f1-score (micro avg) 0.6133 2023-10-17 16:49:31,826 saving best model 2023-10-17 16:49:32,320 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:49:45,341 epoch 2 - iter 180/1809 - loss 0.09588522 - time (sec): 13.02 - samples/sec: 2975.94 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:49:58,243 epoch 2 - iter 360/1809 - loss 0.08882249 - time (sec): 25.92 - samples/sec: 2950.13 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:50:11,235 epoch 2 - iter 540/1809 - loss 0.08363106 - time (sec): 38.91 - samples/sec: 2948.32 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:50:23,803 epoch 2 - iter 720/1809 - loss 0.08455393 - time (sec): 51.48 - samples/sec: 2937.41 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:50:37,115 epoch 2 - iter 900/1809 - loss 0.08508426 - time (sec): 64.79 - samples/sec: 2906.26 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:50:49,757 epoch 2 - iter 1080/1809 - loss 0.08610959 - time (sec): 77.44 - samples/sec: 2898.77 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:51:03,116 epoch 2 - iter 1260/1809 - loss 0.08766651 - time (sec): 90.80 - samples/sec: 2888.85 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:51:16,240 epoch 2 - iter 1440/1809 - loss 0.08705399 - time (sec): 103.92 - samples/sec: 2899.26 - lr: 0.000027 - momentum: 0.000000 
2023-10-17 16:51:29,593 epoch 2 - iter 1620/1809 - loss 0.08720869 - time (sec): 117.27 - samples/sec: 2886.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:51:42,832 epoch 2 - iter 1800/1809 - loss 0.08686349 - time (sec): 130.51 - samples/sec: 2895.75 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:51:43,549 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:43,550 EPOCH 2 done: loss 0.0868 - lr: 0.000027
2023-10-17 16:51:50,679 DEV : loss 0.09997577220201492 - f1-score (micro avg) 0.6206
2023-10-17 16:51:50,720 saving best model
2023-10-17 16:51:51,301 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:04,562 epoch 3 - iter 180/1809 - loss 0.06057686 - time (sec): 13.26 - samples/sec: 2882.78 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:52:17,739 epoch 3 - iter 360/1809 - loss 0.05898876 - time (sec): 26.44 - samples/sec: 2894.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:52:30,634 epoch 3 - iter 540/1809 - loss 0.05667154 - time (sec): 39.33 - samples/sec: 2893.24 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:52:43,874 epoch 3 - iter 720/1809 - loss 0.05829348 - time (sec): 52.57 - samples/sec: 2883.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:52:56,734 epoch 3 - iter 900/1809 - loss 0.05931267 - time (sec): 65.43 - samples/sec: 2887.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:53:09,638 epoch 3 - iter 1080/1809 - loss 0.05999951 - time (sec): 78.34 - samples/sec: 2893.92 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:53:23,322 epoch 3 - iter 1260/1809 - loss 0.05998945 - time (sec): 92.02 - samples/sec: 2880.58 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:53:37,421 epoch 3 - iter 1440/1809 - loss 0.06121733 - time (sec): 106.12 - samples/sec: 2846.19 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:53:51,481 epoch 3 - iter 1620/1809 - loss 0.06250987 - time (sec): 120.18 - samples/sec: 2822.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:54:05,576 epoch 3 - iter 1800/1809 - loss 0.06260649 - time (sec): 134.27 - samples/sec: 2818.01 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:54:06,170 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:06,171 EPOCH 3 done: loss 0.0627 - lr: 0.000023
2023-10-17 16:54:12,478 DEV : loss 0.13520988821983337 - f1-score (micro avg) 0.6323
2023-10-17 16:54:12,520 saving best model
2023-10-17 16:54:13,120 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:26,854 epoch 4 - iter 180/1809 - loss 0.03589391 - time (sec): 13.73 - samples/sec: 2725.59 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:54:41,062 epoch 4 - iter 360/1809 - loss 0.04243949 - time (sec): 27.94 - samples/sec: 2704.29 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:54:55,361 epoch 4 - iter 540/1809 - loss 0.04312213 - time (sec): 42.24 - samples/sec: 2711.73 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:55:08,899 epoch 4 - iter 720/1809 - loss 0.04407922 - time (sec): 55.78 - samples/sec: 2715.57 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:55:21,317 epoch 4 - iter 900/1809 - loss 0.04309184 - time (sec): 68.20 - samples/sec: 2751.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:55:35,293 epoch 4 - iter 1080/1809 - loss 0.04461675 - time (sec): 82.17 - samples/sec: 2765.77 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:55:47,893 epoch 4 - iter 1260/1809 - loss 0.04482022 - time (sec): 94.77 - samples/sec: 2791.18 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:00,644 epoch 4 - iter 1440/1809 - loss 0.04402527 - time (sec): 107.52 - samples/sec: 2805.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:13,843 epoch 4 - iter 1620/1809 - loss 0.04508043 - time (sec): 120.72 - samples/sec: 2821.11 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:26,940 epoch 4 - iter 1800/1809 - loss 0.04577307 - time (sec): 133.82 - samples/sec: 2826.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:27,553 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:27,554 EPOCH 4 done: loss 0.0458 - lr: 0.000020
2023-10-17 16:56:33,892 DEV : loss 0.19658434391021729 - f1-score (micro avg) 0.65
2023-10-17 16:56:33,932 saving best model
2023-10-17 16:56:34,506 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:47,272 epoch 5 - iter 180/1809 - loss 0.02885208 - time (sec): 12.76 - samples/sec: 2976.47 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:57:00,161 epoch 5 - iter 360/1809 - loss 0.03100421 - time (sec): 25.65 - samples/sec: 2946.51 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:13,078 epoch 5 - iter 540/1809 - loss 0.03368269 - time (sec): 38.57 - samples/sec: 2921.39 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:25,741 epoch 5 - iter 720/1809 - loss 0.03096940 - time (sec): 51.23 - samples/sec: 2936.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:38,316 epoch 5 - iter 900/1809 - loss 0.03114871 - time (sec): 63.81 - samples/sec: 2937.46 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:50,872 epoch 5 - iter 1080/1809 - loss 0.03098647 - time (sec): 76.36 - samples/sec: 2935.27 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:58:04,501 epoch 5 - iter 1260/1809 - loss 0.03174442 - time (sec): 89.99 - samples/sec: 2923.36 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:58:18,452 epoch 5 - iter 1440/1809 - loss 0.03137972 - time (sec): 103.94 - samples/sec: 2896.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:32,933 epoch 5 - iter 1620/1809 - loss 0.03319045 - time (sec): 118.42 - samples/sec: 2866.30 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:47,625 epoch 5 - iter 1800/1809 - loss 0.03251336 - time (sec): 133.12 - samples/sec: 2841.97 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:48,292 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:48,292 EPOCH 5 done: loss 0.0324 - lr: 0.000017
2023-10-17 16:58:55,306 DEV : loss 0.2977985441684723 - f1-score (micro avg) 0.6377
2023-10-17 16:58:55,351 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:09,632 epoch 6 - iter 180/1809 - loss 0.02528166 - time (sec): 14.28 - samples/sec: 2659.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:59:22,461 epoch 6 - iter 360/1809 - loss 0.02336089 - time (sec): 27.11 - samples/sec: 2746.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:59:36,882 epoch 6 - iter 540/1809 - loss 0.02596319 - time (sec): 41.53 - samples/sec: 2699.54 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:59:51,458 epoch 6 - iter 720/1809 - loss 0.02376158 - time (sec): 56.11 - samples/sec: 2706.80 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:00:04,585 epoch 6 - iter 900/1809 - loss 0.02282026 - time (sec): 69.23 - samples/sec: 2735.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:00:18,532 epoch 6 - iter 1080/1809 - loss 0.02378981 - time (sec): 83.18 - samples/sec: 2736.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:00:32,532 epoch 6 - iter 1260/1809 - loss 0.02488103 - time (sec): 97.18 - samples/sec: 2705.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:00:47,158 epoch 6 - iter 1440/1809 - loss 0.02406285 - time (sec): 111.81 - samples/sec: 2693.01 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:01:00,454 epoch 6 - iter 1620/1809 - loss 0.02458081 - time (sec): 125.10 - samples/sec: 2712.95 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:01:14,994 epoch 6 - iter 1800/1809 - loss 0.02416930 - time (sec): 139.64 - samples/sec: 2709.21 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:01:15,660 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:15,661 EPOCH 6 done: loss 0.0241 - lr: 0.000013
2023-10-17 17:01:22,099 DEV : loss 0.2875325679779053 - f1-score (micro avg) 0.6463
2023-10-17 17:01:22,141 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:36,174 epoch 7 - iter 180/1809 - loss 0.01469512 - time (sec): 14.03 - samples/sec: 2577.49 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:01:50,551 epoch 7 - iter 360/1809 - loss 0.01466126 - time (sec): 28.41 - samples/sec: 2562.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:02:04,459 epoch 7 - iter 540/1809 - loss 0.01507780 - time (sec): 42.32 - samples/sec: 2582.29 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:02:18,493 epoch 7 - iter 720/1809 - loss 0.01484288 - time (sec): 56.35 - samples/sec: 2631.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:02:32,995 epoch 7 - iter 900/1809 - loss 0.01551643 - time (sec): 70.85 - samples/sec: 2653.49 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:02:47,523 epoch 7 - iter 1080/1809 - loss 0.01579238 - time (sec): 85.38 - samples/sec: 2661.61 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:01,558 epoch 7 - iter 1260/1809 - loss 0.01541974 - time (sec): 99.42 - samples/sec: 2659.01 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:15,932 epoch 7 - iter 1440/1809 - loss 0.01565492 - time (sec): 113.79 - samples/sec: 2650.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:30,407 epoch 7 - iter 1620/1809 - loss 0.01578131 - time (sec): 128.26 - samples/sec: 2646.94 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:03:45,544 epoch 7 - iter 1800/1809 - loss 0.01577763 - time (sec): 143.40 - samples/sec: 2635.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:03:46,233 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:46,234 EPOCH 7 done: loss 0.0157 - lr: 0.000010
2023-10-17 17:03:52,481 DEV : loss 0.3317669928073883 - f1-score (micro avg) 0.6604
2023-10-17 17:03:52,524 saving best model
2023-10-17 17:03:53,119 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:06,717 epoch 8 - iter 180/1809 - loss 0.01083201 - time (sec): 13.60 - samples/sec: 2733.81 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:04:20,695 epoch 8 - iter 360/1809 - loss 0.01085551 - time (sec): 27.57 - samples/sec: 2672.29 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:04:34,811 epoch 8 - iter 540/1809 - loss 0.01160109 - time (sec): 41.69 - samples/sec: 2662.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:04:48,963 epoch 8 - iter 720/1809 - loss 0.01150271 - time (sec): 55.84 - samples/sec: 2688.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:05:02,029 epoch 8 - iter 900/1809 - loss 0.01158055 - time (sec): 68.91 - samples/sec: 2729.09 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:16,268 epoch 8 - iter 1080/1809 - loss 0.01114901 - time (sec): 83.15 - samples/sec: 2709.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:29,595 epoch 8 - iter 1260/1809 - loss 0.01121147 - time (sec): 96.47 - samples/sec: 2733.10 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:43,752 epoch 8 - iter 1440/1809 - loss 0.01133313 - time (sec): 110.63 - samples/sec: 2731.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:05:56,559 epoch 8 - iter 1620/1809 - loss 0.01079204 - time (sec): 123.44 - samples/sec: 2744.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:09,770 epoch 8 - iter 1800/1809 - loss 0.01055259 - time (sec): 136.65 - samples/sec: 2764.47 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:10,429 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:10,429 EPOCH 8 done: loss 0.0106 - lr: 0.000007
2023-10-17 17:06:17,407 DEV : loss 0.36872246861457825 - f1-score (micro avg) 0.6586
2023-10-17 17:06:17,455 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:30,029 epoch 9 - iter 180/1809 - loss 0.00589952 - time (sec): 12.57 - samples/sec: 2889.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:42,937 epoch 9 - iter 360/1809 - loss 0.00549166 - time (sec): 25.48 - samples/sec: 2899.28 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:55,861 epoch 9 - iter 540/1809 - loss 0.00605800 - time (sec): 38.40 - samples/sec: 2916.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:07:09,026 epoch 9 - iter 720/1809 - loss 0.00638723 - time (sec): 51.57 - samples/sec: 2919.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:07:21,620 epoch 9 - iter 900/1809 - loss 0.00618261 - time (sec): 64.16 - samples/sec: 2935.84 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:07:34,600 epoch 9 - iter 1080/1809 - loss 0.00676500 - time (sec): 77.14 - samples/sec: 2934.03 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:07:48,046 epoch 9 - iter 1260/1809 - loss 0.00688842 - time (sec): 90.59 - samples/sec: 2939.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:08:00,908 epoch 9 - iter 1440/1809 - loss 0.00706836 - time (sec): 103.45 - samples/sec: 2943.34 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:08:13,738 epoch 9 - iter 1620/1809 - loss 0.00696957 - time (sec): 116.28 - samples/sec: 2937.59 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:08:26,475 epoch 9 - iter 1800/1809 - loss 0.00698359 - time (sec): 129.02 - samples/sec: 2933.27 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:08:27,089 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:27,090 EPOCH 9 done: loss 0.0070 - lr: 0.000003
2023-10-17 17:08:33,338 DEV : loss 0.38837048411369324 - f1-score (micro avg) 0.6705
2023-10-17 17:08:33,381 saving best model
2023-10-17 17:08:34,006 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:47,073 epoch 10 - iter 180/1809 - loss 0.00224797 - time (sec): 13.07 - samples/sec: 2881.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:09:01,239 epoch 10 - iter 360/1809 - loss 0.00362755 - time (sec): 27.23 - samples/sec: 2854.41 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:09:14,204 epoch 10 - iter 540/1809 - loss 0.00370153 - time (sec): 40.20 - samples/sec: 2863.08 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:09:26,807 epoch 10 - iter 720/1809 - loss 0.00450308 - time (sec): 52.80 - samples/sec: 2898.14 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:09:39,229 epoch 10 - iter 900/1809 - loss 0.00476154 - time (sec): 65.22 - samples/sec: 2896.81 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:09:52,036 epoch 10 - iter 1080/1809 - loss 0.00513412 - time (sec): 78.03 - samples/sec: 2917.16 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:10:04,882 epoch 10 - iter 1260/1809 - loss 0.00519172 - time (sec): 90.87 - samples/sec: 2926.77 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:10:17,583 epoch 10 - iter 1440/1809 - loss 0.00512814 - time (sec): 103.58 - samples/sec: 2929.95 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:10:30,625 epoch 10 - iter 1620/1809 - loss 0.00520500 - time (sec): 116.62 - samples/sec: 2936.59 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:10:43,601 epoch 10 - iter 1800/1809 - loss 0.00503373 - time (sec): 129.59 - samples/sec: 2921.20 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:10:44,249 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:44,249 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-17 17:10:51,342 DEV : loss 0.397739440202713 - f1-score (micro avg) 0.6676
2023-10-17 17:10:51,878 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:51,880 Loading model from best epoch ...
2023-10-17 17:10:53,630 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-17 17:11:01,735 Results:
- F-score (micro) 0.6596
- F-score (macro) 0.5192
- Accuracy 0.5023

By class:
              precision    recall  f1-score   support

         loc     0.6533    0.7970    0.7180       591
        pers     0.5792    0.7479    0.6528       357
         org     0.1972    0.1772    0.1867        79

   micro avg     0.6002    0.7322    0.6596      1027
   macro avg     0.4765    0.5740    0.5192      1027
weighted avg     0.5924    0.7322    0.6545      1027

2023-10-17 17:11:01,735 ----------------------------------------------------------------------------------------------------
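[Editor's note] As a quick sanity check on the final test numbers, micro F1 is the harmonic mean of the reported micro-average precision and recall. The short sketch below, assuming only the figures logged above, verifies that relation and shows how the saved best-model.pt under the logged base path could be loaded for inference with Flair; the example sentence is invented for illustration.

from flair.data import Sentence
from flair.models import SequenceTagger

# Harmonic mean of the logged micro-average precision and recall.
p, r = 0.6002, 0.7322
print(f"micro F1 = {2 * p * r / (p + r):.4f}")  # -> 0.6597, matching the reported 0.6596 up to rounding

# Load the checkpoint the trainer saved as best-model.pt (path from the logged base path).
tagger = SequenceTagger.load(
    "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)
sentence = Sentence("Le Temps paraît à Genève.")  # invented example sentence
tagger.predict(sentence)
print(sentence.get_spans("ner"))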