2023-10-17 08:53:34,961 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,962 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:53:34,962 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 Train: 1100 sentences
2023-10-17 08:53:34,963 (train_with_dev=False, train_with_test=False)
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 Training Params:
2023-10-17 08:53:34,963 - learning_rate: "5e-05"
2023-10-17 08:53:34,963 - mini_batch_size: "8"
2023-10-17 08:53:34,963 - max_epochs: "10"
2023-10-17 08:53:34,963 - shuffle: "True"
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 Plugins:
2023-10-17 08:53:34,963 - TensorboardLogger
2023-10-17 08:53:34,963 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:53:34,963 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 Computation:
2023-10-17 08:53:34,963 - compute on device: cuda:0
2023-10-17 08:53:34,963 - embedding storage: none
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:34,963 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:53:35,685 epoch 1 - iter 13/138 - loss 4.06947742 - time (sec): 0.72 - samples/sec: 3084.80 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:53:36,373 epoch 1 - iter 26/138 - loss 3.53460833 - time (sec): 1.41 - samples/sec: 2884.84 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:53:37,113 epoch 1 - iter 39/138 - loss 2.81906985 - time (sec): 2.15 - samples/sec: 2934.39 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:53:37,859 epoch 1 - iter 52/138 - loss 2.32739470 - time (sec): 2.89 - samples/sec: 2912.07 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:53:38,624 epoch 1 - iter 65/138 - loss 1.97469116 - time (sec): 3.66 - samples/sec: 2879.38 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:53:39,364 epoch 1 - iter 78/138 - loss 1.72761590 - time (sec): 4.40 - samples/sec: 2869.36 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:53:40,121 epoch 1 - iter 91/138 - loss 1.51466269 - time (sec): 5.16 - samples/sec: 2916.91 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:53:40,865 epoch 1 - iter 104/138 - loss 1.35593559 - time (sec): 5.90 - samples/sec: 2974.88 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:53:41,568 epoch 1 - iter 117/138 - loss 1.24744656 - time (sec): 6.60 - samples/sec: 2961.80 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:53:42,325 epoch 1 - iter 130/138 - loss 1.15825875 - time (sec): 7.36 - samples/sec: 2932.06 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:53:42,773 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:42,773 EPOCH 1 done: loss 1.1129 - lr: 0.000047
2023-10-17 08:53:43,555 DEV : loss 0.1926119476556778 - f1-score (micro avg) 0.7766
2023-10-17 08:53:43,560 saving best model
2023-10-17 08:53:43,920 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:44,655 epoch 2 - iter 13/138 - loss 0.17238555 - time (sec): 0.73 - samples/sec: 2968.02 - lr: 0.000050 - momentum: 0.000000
2023-10-17 08:53:45,359 epoch 2 - iter 26/138 - loss 0.17242125 - time (sec): 1.44 - samples/sec: 2980.23 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:53:46,110 epoch 2 - iter 39/138 - loss 0.17201537 - time (sec): 2.19 - samples/sec: 2920.07 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:53:46,864 epoch 2 - iter 52/138 - loss 0.17632595 - time (sec): 2.94 - samples/sec: 2963.47 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:53:47,592 epoch 2 - iter 65/138 - loss 0.16585204 - time (sec): 3.67 - samples/sec: 2930.80 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:53:48,358 epoch 2 - iter 78/138 - loss 0.17233931 - time (sec): 4.44 - samples/sec: 2950.18 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:53:49,132 epoch 2 - iter 91/138 - loss 0.17024609 - time (sec): 5.21 - samples/sec: 2926.88 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:53:49,959 epoch 2 - iter 104/138 - loss 0.17589124 - time (sec): 6.04 - samples/sec: 2925.32 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:53:50,690 epoch 2 - iter 117/138 - loss 0.16910689 - time (sec): 6.77 - samples/sec: 2913.93 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:53:51,426 epoch 2 - iter 130/138 - loss 0.16263917 - time (sec): 7.50 - samples/sec: 2887.16 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:53:51,877 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:51,877 EPOCH 2 done: loss 0.1644 - lr: 0.000045
2023-10-17 08:53:52,509 DEV : loss 0.1401386559009552 - f1-score (micro avg) 0.8
2023-10-17 08:53:52,514 saving best model
2023-10-17 08:53:52,950 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:53,657 epoch 3 - iter 13/138 - loss 0.10839392 - time (sec): 0.71 - samples/sec: 2862.79 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:53:54,345 epoch 3 - iter 26/138 - loss 0.09821987 - time (sec): 1.39 - samples/sec: 2726.84 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:53:55,042 epoch 3 - iter 39/138 - loss 0.09920961 - time (sec): 2.09 - samples/sec: 2832.75 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:53:55,868 epoch 3 - iter 52/138 - loss 0.10149162 - time (sec): 2.92 - samples/sec: 2879.44 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:53:56,566 epoch 3 - iter 65/138 - loss 0.09871974 - time (sec): 3.61 - samples/sec: 2855.39 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:53:57,338 epoch 3 - iter 78/138 - loss 0.09599052 - time (sec): 4.39 - samples/sec: 2870.37 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:53:58,058 epoch 3 - iter 91/138 - loss 0.09023016 - time (sec): 5.11 - samples/sec: 2878.12 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:53:58,800 epoch 3 - iter 104/138 - loss 0.08954002 - time (sec): 5.85 - samples/sec: 2892.06 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:53:59,591 epoch 3 - iter 117/138 - loss 0.08970388 - time (sec): 6.64 - samples/sec: 2886.08 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:54:00,344 epoch 3 - iter 130/138 - loss 0.09656639 - time (sec): 7.39 - samples/sec: 2903.71 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:54:00,809 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:00,809 EPOCH 3 done: loss 0.0961 - lr: 0.000039
2023-10-17 08:54:01,436 DEV : loss 0.12580935657024384 - f1-score (micro avg) 0.8492
2023-10-17 08:54:01,440 saving best model
2023-10-17 08:54:01,874 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:02,587 epoch 4 - iter 13/138 - loss 0.07047690 - time (sec): 0.71 - samples/sec: 2860.23 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:54:03,311 epoch 4 - iter 26/138 - loss 0.05110513 - time (sec): 1.44 - samples/sec: 2869.51 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:54:04,036 epoch 4 - iter 39/138 - loss 0.08593546 - time (sec): 2.16 - samples/sec: 2894.35 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:54:04,775 epoch 4 - iter 52/138 - loss 0.09311636 - time (sec): 2.90 - samples/sec: 2907.89 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:54:05,531 epoch 4 - iter 65/138 - loss 0.08520031 - time (sec): 3.66 - samples/sec: 2866.50 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:54:06,289 epoch 4 - iter 78/138 - loss 0.07696350 - time (sec): 4.41 - samples/sec: 2883.48 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:54:07,056 epoch 4 - iter 91/138 - loss 0.07443197 - time (sec): 5.18 - samples/sec: 2868.28 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:54:07,813 epoch 4 - iter 104/138 - loss 0.07494235 - time (sec): 5.94 - samples/sec: 2888.26 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:54:08,578 epoch 4 - iter 117/138 - loss 0.07524586 - time (sec): 6.70 - samples/sec: 2887.19 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:54:09,360 epoch 4 - iter 130/138 - loss 0.07303091 - time (sec): 7.49 - samples/sec: 2870.51 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:54:09,846 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:09,846 EPOCH 4 done: loss 0.0738 - lr: 0.000034
2023-10-17 08:54:10,479 DEV : loss 0.1469777375459671 - f1-score (micro avg) 0.8681
2023-10-17 08:54:10,484 saving best model
2023-10-17 08:54:10,928 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:11,633 epoch 5 - iter 13/138 - loss 0.09518052 - time (sec): 0.70 - samples/sec: 3280.21 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:54:12,326 epoch 5 - iter 26/138 - loss 0.06688716 - time (sec): 1.40 - samples/sec: 3186.06 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:54:13,021 epoch 5 - iter 39/138 - loss 0.05861502 - time (sec): 2.09 - samples/sec: 3059.78 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:54:13,701 epoch 5 - iter 52/138 - loss 0.06250234 - time (sec): 2.77 - samples/sec: 3000.26 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:54:14,423 epoch 5 - iter 65/138 - loss 0.06755541 - time (sec): 3.49 - samples/sec: 2995.35 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:54:15,175 epoch 5 - iter 78/138 - loss 0.06415957 - time (sec): 4.25 - samples/sec: 2977.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:54:15,945 epoch 5 - iter 91/138 - loss 0.06402975 - time (sec): 5.02 - samples/sec: 2950.01 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:54:16,713 epoch 5 - iter 104/138 - loss 0.05989598 - time (sec): 5.78 - samples/sec: 2965.03 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:54:17,459 epoch 5 - iter 117/138 - loss 0.05868487 - time (sec): 6.53 - samples/sec: 2961.32 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:54:18,229 epoch 5 - iter 130/138 - loss 0.05774891 - time (sec): 7.30 - samples/sec: 2950.84 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:54:18,677 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:18,678 EPOCH 5 done: loss 0.0555 - lr: 0.000028
2023-10-17 08:54:19,310 DEV : loss 0.16393792629241943 - f1-score (micro avg) 0.8619
2023-10-17 08:54:19,314 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:20,016 epoch 6 - iter 13/138 - loss 0.03744158 - time (sec): 0.70 - samples/sec: 2748.59 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:54:20,736 epoch 6 - iter 26/138 - loss 0.04542780 - time (sec): 1.42 - samples/sec: 2836.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:54:21,486 epoch 6 - iter 39/138 - loss 0.04843577 - time (sec): 2.17 - samples/sec: 2839.99 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:54:22,237 epoch 6 - iter 52/138 - loss 0.03793948 - time (sec): 2.92 - samples/sec: 2830.12 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:54:22,975 epoch 6 - iter 65/138 - loss 0.04494788 - time (sec): 3.66 - samples/sec: 2886.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:54:23,723 epoch 6 - iter 78/138 - loss 0.03905436 - time (sec): 4.41 - samples/sec: 2887.40 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:54:24,485 epoch 6 - iter 91/138 - loss 0.03803730 - time (sec): 5.17 - samples/sec: 2861.38 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:54:25,272 epoch 6 - iter 104/138 - loss 0.04047424 - time (sec): 5.96 - samples/sec: 2851.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:54:26,105 epoch 6 - iter 117/138 - loss 0.04769041 - time (sec): 6.79 - samples/sec: 2850.07 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:54:26,875 epoch 6 - iter 130/138 - loss 0.04603896 - time (sec): 7.56 - samples/sec: 2860.08 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:54:27,299 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:27,300 EPOCH 6 done: loss 0.0444 - lr: 0.000023
2023-10-17 08:54:27,937 DEV : loss 0.1778355985879898 - f1-score (micro avg) 0.867
2023-10-17 08:54:27,941 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:28,667 epoch 7 - iter 13/138 - loss 0.08203944 - time (sec): 0.72 - samples/sec: 2779.90 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:54:29,442 epoch 7 - iter 26/138 - loss 0.06638147 - time (sec): 1.50 - samples/sec: 2795.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:54:30,208 epoch 7 - iter 39/138 - loss 0.07442183 - time (sec): 2.27 - samples/sec: 2872.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:54:30,956 epoch 7 - iter 52/138 - loss 0.06170432 - time (sec): 3.01 - samples/sec: 2872.64 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:54:31,711 epoch 7 - iter 65/138 - loss 0.05092242 - time (sec): 3.77 - samples/sec: 2837.54 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:54:32,455 epoch 7 - iter 78/138 - loss 0.04529981 - time (sec): 4.51 - samples/sec: 2858.77 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:54:33,249 epoch 7 - iter 91/138 - loss 0.04041344 - time (sec): 5.31 - samples/sec: 2855.92 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:54:33,992 epoch 7 - iter 104/138 - loss 0.04050339 - time (sec): 6.05 - samples/sec: 2871.54 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:54:34,755 epoch 7 - iter 117/138 - loss 0.03924770 - time (sec): 6.81 - samples/sec: 2842.01 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:54:35,464 epoch 7 - iter 130/138 - loss 0.03944895 - time (sec): 7.52 - samples/sec: 2858.12 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:54:35,904 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:35,905 EPOCH 7 done: loss 0.0376 - lr: 0.000017
2023-10-17 08:54:36,571 DEV : loss 0.19011272490024567 - f1-score (micro avg) 0.8633
2023-10-17 08:54:36,575 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:37,347 epoch 8 - iter 13/138 - loss 0.01406125 - time (sec): 0.77 - samples/sec: 3038.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:54:38,068 epoch 8 - iter 26/138 - loss 0.02028170 - time (sec): 1.49 - samples/sec: 3041.21 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:54:38,782 epoch 8 - iter 39/138 - loss 0.02305527 - time (sec): 2.21 - samples/sec: 2941.13 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:54:39,559 epoch 8 - iter 52/138 - loss 0.01949837 - time (sec): 2.98 - samples/sec: 2905.93 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:54:40,353 epoch 8 - iter 65/138 - loss 0.01753884 - time (sec): 3.78 - samples/sec: 2938.16 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:54:41,072 epoch 8 - iter 78/138 - loss 0.01859406 - time (sec): 4.50 - samples/sec: 2930.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:54:41,860 epoch 8 - iter 91/138 - loss 0.02017696 - time (sec): 5.28 - samples/sec: 2914.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:54:42,631 epoch 8 - iter 104/138 - loss 0.02936815 - time (sec): 6.06 - samples/sec: 2908.50 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:54:43,370 epoch 8 - iter 117/138 - loss 0.03072266 - time (sec): 6.79 - samples/sec: 2888.90 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:54:44,137 epoch 8 - iter 130/138 - loss 0.03009208 - time (sec): 7.56 - samples/sec: 2869.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:54:44,550 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:44,550 EPOCH 8 done: loss 0.0298 - lr: 0.000012
2023-10-17 08:54:45,191 DEV : loss 0.17324329912662506 - f1-score (micro avg) 0.8726
2023-10-17 08:54:45,197 saving best model
2023-10-17 08:54:45,655 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:46,459 epoch 9 - iter 13/138 - loss 0.03966138 - time (sec): 0.80 - samples/sec: 2842.12 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:54:47,277 epoch 9 - iter 26/138 - loss 0.02599233 - time (sec): 1.62 - samples/sec: 2846.41 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:54:48,009 epoch 9 - iter 39/138 - loss 0.02395130 - time (sec): 2.35 - samples/sec: 2821.06 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:54:48,740 epoch 9 - iter 52/138 - loss 0.02886960 - time (sec): 3.08 - samples/sec: 2828.78 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:54:49,446 epoch 9 - iter 65/138 - loss 0.02820047 - time (sec): 3.79 - samples/sec: 2779.66 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:54:50,180 epoch 9 - iter 78/138 - loss 0.03240256 - time (sec): 4.52 - samples/sec: 2843.95 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:54:50,889 epoch 9 - iter 91/138 - loss 0.03208418 - time (sec): 5.23 - samples/sec: 2861.50 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:54:51,651 epoch 9 - iter 104/138 - loss 0.02859865 - time (sec): 5.99 - samples/sec: 2838.39 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:54:52,406 epoch 9 - iter 117/138 - loss 0.02625805 - time (sec): 6.75 - samples/sec: 2840.41 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:54:53,128 epoch 9 - iter 130/138 - loss 0.02679964 - time (sec): 7.47 - samples/sec: 2863.51 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:54:53,616 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:53,616 EPOCH 9 done: loss 0.0259 - lr: 0.000006
2023-10-17 08:54:54,251 DEV : loss 0.1794566512107849 - f1-score (micro avg) 0.8719
2023-10-17 08:54:54,256 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:55,006 epoch 10 - iter 13/138 - loss 0.01099146 - time (sec): 0.75 - samples/sec: 2743.28 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:54:55,776 epoch 10 - iter 26/138 - loss 0.01326709 - time (sec): 1.52 - samples/sec: 2925.20 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:54:56,488 epoch 10 - iter 39/138 - loss 0.01241961 - time (sec): 2.23 - samples/sec: 2940.35 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:54:57,181 epoch 10 - iter 52/138 - loss 0.01622066 - time (sec): 2.92 - samples/sec: 2851.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:54:57,917 epoch 10 - iter 65/138 - loss 0.01665366 - time (sec): 3.66 - samples/sec: 2875.42 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:54:58,666 epoch 10 - iter 78/138 - loss 0.01562260 - time (sec): 4.41 - samples/sec: 2849.59 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:54:59,384 epoch 10 - iter 91/138 - loss 0.01461411 - time (sec): 5.13 - samples/sec: 2877.77 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:55:00,182 epoch 10 - iter 104/138 - loss 0.01947795 - time (sec): 5.92 - samples/sec: 2890.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:55:00,978 epoch 10 - iter 117/138 - loss 0.01852363 - time (sec): 6.72 - samples/sec: 2879.18 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:55:01,755 epoch 10 - iter 130/138 - loss 0.01707996 - time (sec): 7.50 - samples/sec: 2872.11 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:55:02,225 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:02,226 EPOCH 10 done: loss 0.0168 - lr: 0.000000
2023-10-17 08:55:02,944 DEV : loss 0.18495064973831177 - f1-score (micro avg) 0.8705
2023-10-17 08:55:03,287 ----------------------------------------------------------------------------------------------------
2023-10-17 08:55:03,289 Loading model from best epoch ...
2023-10-17 08:55:04,669 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:55:05,488 
Results:
- F-score (micro) 0.9029
- F-score (macro) 0.7414
- Accuracy 0.8309

By class:
              precision    recall  f1-score   support

       scope     0.9029    0.8977    0.9003       176
        pers     0.9672    0.9219    0.9440       128
        work     0.8354    0.8919    0.8627        74
      object     0.0000    0.0000    0.0000         2
         loc     1.0000    1.0000    1.0000         2

   micro avg     0.9053    0.9005    0.9029       382
   macro avg     0.7411    0.7423    0.7414       382
weighted avg     0.9071    0.9005    0.9035       382

2023-10-17 08:55:05,488 ----------------------------------------------------------------------------------------------------