2023-10-17 08:35:02,192 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,193 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:35:02,193 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,193 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:35:02,193 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,193 Train: 1100 sentences
2023-10-17 08:35:02,193 (train_with_dev=False, train_with_test=False)
2023-10-17 08:35:02,193 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,193 Training Params:
2023-10-17 08:35:02,193 - learning_rate: "5e-05"
2023-10-17 08:35:02,193 - mini_batch_size: "8"
2023-10-17 08:35:02,193 - max_epochs: "10"
2023-10-17 08:35:02,193 - shuffle: "True"
2023-10-17 08:35:02,193 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,193 Plugins:
2023-10-17 08:35:02,194 - TensorboardLogger
2023-10-17 08:35:02,194 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:35:02,194 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,194 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:35:02,194 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:35:02,194 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,194 Computation:
2023-10-17 08:35:02,194 - compute on device: cuda:0
2023-10-17 08:35:02,194 - embedding storage: none
2023-10-17 08:35:02,194 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,194 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 08:35:02,194 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,194 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:02,194 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:35:02,926 epoch 1 - iter 13/138 - loss 4.22040289 - time (sec): 0.73 - samples/sec: 2911.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:35:03,651 epoch 1 - iter 26/138 - loss 3.70465772 - time (sec): 1.46 - samples/sec: 2775.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:35:04,368 epoch 1 - iter 39/138 - loss 3.00403495 - time (sec): 2.17 - samples/sec: 2774.98 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:35:05,127 epoch 1 - iter 52/138 - loss 2.50089604 - time (sec): 2.93 - samples/sec: 2741.49 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:35:05,943 epoch 1 - iter 65/138 - loss 2.07745839 - time (sec): 3.75 - samples/sec: 2764.64 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:35:06,711 epoch 1 - iter 78/138 - loss 1.81431920 - time (sec): 4.52 - samples/sec: 2781.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:35:07,462 epoch 1 - iter 91/138 - loss 1.59917697 - time (sec): 5.27 - samples/sec: 2803.24 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:35:08,228 epoch 1 - iter 104/138 - loss 1.43328790 - time (sec): 6.03 - samples/sec: 2814.17 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:35:08,990 epoch 1 - iter 117/138 - loss 1.30451502 - time (sec): 6.80 - samples/sec: 2833.88 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:35:09,780 epoch 1 - iter 130/138 - loss 1.20666017 - time (sec): 7.59 - samples/sec: 2828.47 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:35:10,283 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:10,283 EPOCH 1 done: loss 1.1565 - lr: 0.000047
2023-10-17 08:35:11,092 DEV : loss 0.20371423661708832 - f1-score (micro avg) 0.6853
2023-10-17 08:35:11,097 saving best model
2023-10-17 08:35:11,441 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:12,205 epoch 2 - iter 13/138 - loss 0.18510989 - time (sec): 0.76 - samples/sec: 2908.46 - lr: 0.000050 - momentum: 0.000000
2023-10-17 08:35:12,987 epoch 2 - iter 26/138 - loss 0.22286473 - time (sec): 1.54 - samples/sec: 3015.76 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:35:13,732 epoch 2 - iter 39/138 - loss 0.21354263 - time (sec): 2.29 - samples/sec: 2992.66 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:35:14,511 epoch 2 - iter 52/138 - loss 0.20370386 - time (sec): 3.07 - samples/sec: 2972.89 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:35:15,209 epoch 2 - iter 65/138 - loss 0.19557993 - time (sec): 3.77 - samples/sec: 2942.96 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:35:15,959 epoch 2 - iter 78/138 - loss 0.18670587 - time (sec): 4.52 - samples/sec: 2887.25 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:35:16,651 epoch 2 - iter 91/138 - loss 0.18106996 - time (sec): 5.21 - samples/sec: 2857.52 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:35:17,408 epoch 2 - iter 104/138 - loss 0.17656932 - time (sec): 5.97 - samples/sec: 2896.66 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:35:18,168 epoch 2 - iter 117/138 - loss 0.17142517 - time (sec): 6.73 - samples/sec: 2880.42 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:35:18,911 epoch 2 - iter 130/138 - loss 0.17318261 - time (sec): 7.47 - samples/sec: 2888.38 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:35:19,343 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:19,344 EPOCH 2 done: loss 0.1682 - lr: 0.000045
2023-10-17 08:35:19,974 DEV : loss 0.15141892433166504 - f1-score (micro avg) 0.8127
2023-10-17 08:35:19,979 saving best model
2023-10-17 08:35:20,422 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:21,203 epoch 3 - iter 13/138 - loss 0.10718037 - time (sec): 0.78 - samples/sec: 2872.78 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:35:21,944 epoch 3 - iter 26/138 - loss 0.09491367 - time (sec): 1.52 - samples/sec: 2859.72 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:35:22,667 epoch 3 - iter 39/138 - loss 0.10218904 - time (sec): 2.24 - samples/sec: 2925.36 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:35:23,382 epoch 3 - iter 52/138 - loss 0.11929519 - time (sec): 2.95 - samples/sec: 2918.24 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:35:24,120 epoch 3 - iter 65/138 - loss 0.10689042 - time (sec): 3.69 - samples/sec: 2929.70 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:35:24,820 epoch 3 - iter 78/138 - loss 0.10328288 - time (sec): 4.39 - samples/sec: 2895.41 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:35:25,584 epoch 3 - iter 91/138 - loss 0.10884179 - time (sec): 5.16 - samples/sec: 2943.16 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:35:26,296 epoch 3 - iter 104/138 - loss 0.10748961 - time (sec): 5.87 - samples/sec: 2897.24 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:35:27,061 epoch 3 - iter 117/138 - loss 0.10155459 - time (sec): 6.63 - samples/sec: 2914.05 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:35:27,789 epoch 3 - iter 130/138 - loss 0.10365295 - time (sec): 7.36 - samples/sec: 2909.58 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:35:28,255 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:28,255 EPOCH 3 done: loss 0.1017 - lr: 0.000039
2023-10-17 08:35:28,989 DEV : loss 0.1499420553445816 - f1-score (micro avg) 0.8242
2023-10-17 08:35:28,994 saving best model
2023-10-17 08:35:29,427 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:30,170 epoch 4 - iter 13/138 - loss 0.08215706 - time (sec): 0.74 - samples/sec: 2889.38 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:35:30,893 epoch 4 - iter 26/138 - loss 0.09423860 - time (sec): 1.46 - samples/sec: 2903.96 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:35:31,658 epoch 4 - iter 39/138 - loss 0.07897657 - time (sec): 2.23 - samples/sec: 3006.49 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:35:32,376 epoch 4 - iter 52/138 - loss 0.07584959 - time (sec): 2.95 - samples/sec: 2960.47 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:35:33,139 epoch 4 - iter 65/138 - loss 0.07486259 - time (sec): 3.71 - samples/sec: 2962.15 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:35:33,888 epoch 4 - iter 78/138 - loss 0.07396912 - time (sec): 4.46 - samples/sec: 2963.46 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:35:34,650 epoch 4 - iter 91/138 - loss 0.07816740 - time (sec): 5.22 - samples/sec: 2944.53 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:35:35,356 epoch 4 - iter 104/138 - loss 0.07634711 - time (sec): 5.93 - samples/sec: 2908.55 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:35:36,161 epoch 4 - iter 117/138 - loss 0.07445357 - time (sec): 6.73 - samples/sec: 2928.58 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:35:36,888 epoch 4 - iter 130/138 - loss 0.07280530 - time (sec): 7.46 - samples/sec: 2896.65 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:35:37,316 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:37,316 EPOCH 4 done: loss 0.0765 - lr: 0.000034
2023-10-17 08:35:37,955 DEV : loss 0.142630472779274 - f1-score (micro avg) 0.8541
2023-10-17 08:35:37,960 saving best model
2023-10-17 08:35:38,389 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:39,240 epoch 5 - iter 13/138 - loss 0.03121562 - time (sec): 0.85 - samples/sec: 2885.84 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:35:39,959 epoch 5 - iter 26/138 - loss 0.04200227 - time (sec): 1.57 - samples/sec: 2777.97 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:35:40,670 epoch 5 - iter 39/138 - loss 0.05399254 - time (sec): 2.28 - samples/sec: 2865.47 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:35:41,438 epoch 5 - iter 52/138 - loss 0.04714869 - time (sec): 3.05 - samples/sec: 2812.70 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:35:42,181 epoch 5 - iter 65/138 - loss 0.04556329 - time (sec): 3.79 - samples/sec: 2818.02 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:35:42,936 epoch 5 - iter 78/138 - loss 0.04860871 - time (sec): 4.54 - samples/sec: 2832.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:35:43,652 epoch 5 - iter 91/138 - loss 0.05073703 - time (sec): 5.26 - samples/sec: 2875.50 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:35:44,408 epoch 5 - iter 104/138 - loss 0.05067376 - time (sec): 6.02 - samples/sec: 2880.98 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:35:45,141 epoch 5 - iter 117/138 - loss 0.04980472 - time (sec): 6.75 - samples/sec: 2888.64 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:35:45,858 epoch 5 - iter 130/138 - loss 0.05156968 - time (sec): 7.47 - samples/sec: 2882.46 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:35:46,290 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:46,290 EPOCH 5 done: loss 0.0544 - lr: 0.000028
2023-10-17 08:35:46,941 DEV : loss 0.13915039598941803 - f1-score (micro avg) 0.8685
2023-10-17 08:35:46,946 saving best model
2023-10-17 08:35:47,394 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:48,156 epoch 6 - iter 13/138 - loss 0.05894751 - time (sec): 0.76 - samples/sec: 2914.36 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:35:48,874 epoch 6 - iter 26/138 - loss 0.04072450 - time (sec): 1.48 - samples/sec: 2788.36 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:35:49,620 epoch 6 - iter 39/138 - loss 0.03586809 - time (sec): 2.22 - samples/sec: 2841.81 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:35:50,366 epoch 6 - iter 52/138 - loss 0.03465352 - time (sec): 2.97 - samples/sec: 2859.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:35:51,104 epoch 6 - iter 65/138 - loss 0.03757808 - time (sec): 3.71 - samples/sec: 2857.92 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:35:51,890 epoch 6 - iter 78/138 - loss 0.04046940 - time (sec): 4.49 - samples/sec: 2856.53 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:35:52,701 epoch 6 - iter 91/138 - loss 0.04562090 - time (sec): 5.30 - samples/sec: 2894.15 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:35:53,438 epoch 6 - iter 104/138 - loss 0.04308195 - time (sec): 6.04 - samples/sec: 2893.39 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:35:54,134 epoch 6 - iter 117/138 - loss 0.04275599 - time (sec): 6.74 - samples/sec: 2889.34 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:35:54,891 epoch 6 - iter 130/138 - loss 0.04058690 - time (sec): 7.49 - samples/sec: 2882.52 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:35:55,346 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:55,347 EPOCH 6 done: loss 0.0403 - lr: 0.000023
2023-10-17 08:35:56,007 DEV : loss 0.17933738231658936 - f1-score (micro avg) 0.8634
2023-10-17 08:35:56,011 ----------------------------------------------------------------------------------------------------
2023-10-17 08:35:56,740 epoch 7 - iter 13/138 - loss 0.05558773 - time (sec): 0.73 - samples/sec: 3158.89 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:35:57,500 epoch 7 - iter 26/138 - loss 0.03701149 - time (sec): 1.49 - samples/sec: 3057.72 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:35:58,203 epoch 7 - iter 39/138 - loss 0.03583069 - time (sec): 2.19 - samples/sec: 3014.45 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:35:58,941 epoch 7 - iter 52/138 - loss 0.04594578 - time (sec): 2.93 - samples/sec: 2981.36 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:35:59,694 epoch 7 - iter 65/138 - loss 0.04030116 - time (sec): 3.68 - samples/sec: 2988.98 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:36:00,421 epoch 7 - iter 78/138 - loss 0.04106748 - time (sec): 4.41 - samples/sec: 2950.01 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:36:01,180 epoch 7 - iter 91/138 - loss 0.03909668 - time (sec): 5.17 - samples/sec: 2921.79 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:36:02,029 epoch 7 - iter 104/138 - loss 0.03806689 - time (sec): 6.02 - samples/sec: 2870.22 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:36:02,737 epoch 7 - iter 117/138 - loss 0.03591500 - time (sec): 6.72 - samples/sec: 2857.58 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:36:03,492 epoch 7 - iter 130/138 - loss 0.03447142 - time (sec): 7.48 - samples/sec: 2876.46 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:36:03,968 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:03,968 EPOCH 7 done: loss 0.0336 - lr: 0.000017
2023-10-17 08:36:04,614 DEV : loss 0.1858513355255127 - f1-score (micro avg) 0.8633
2023-10-17 08:36:04,619 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:05,347 epoch 8 - iter 13/138 - loss 0.01159785 - time (sec): 0.73 - samples/sec: 2930.99 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:36:06,107 epoch 8 - iter 26/138 - loss 0.00748875 - time (sec): 1.49 - samples/sec: 2819.78 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:36:06,828 epoch 8 - iter 39/138 - loss 0.00977630 - time (sec): 2.21 - samples/sec: 2846.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:36:07,619 epoch 8 - iter 52/138 - loss 0.01032406 - time (sec): 3.00 - samples/sec: 2884.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:36:08,355 epoch 8 - iter 65/138 - loss 0.01045180 - time (sec): 3.74 - samples/sec: 2905.72 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:36:09,149 epoch 8 - iter 78/138 - loss 0.01407818 - time (sec): 4.53 - samples/sec: 2886.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:36:09,892 epoch 8 - iter 91/138 - loss 0.01448040 - time (sec): 5.27 - samples/sec: 2862.11 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:36:10,630 epoch 8 - iter 104/138 - loss 0.01703694 - time (sec): 6.01 - samples/sec: 2872.26 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:36:11,389 epoch 8 - iter 117/138 - loss 0.02100446 - time (sec): 6.77 - samples/sec: 2862.47 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:36:12,097 epoch 8 - iter 130/138 - loss 0.02527052 - time (sec): 7.48 - samples/sec: 2866.84 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:36:12,579 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:12,579 EPOCH 8 done: loss 0.0251 - lr: 0.000012
2023-10-17 08:36:13,258 DEV : loss 0.18673621118068695 - f1-score (micro avg) 0.8619
2023-10-17 08:36:13,263 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:14,055 epoch 9 - iter 13/138 - loss 0.00286938 - time (sec): 0.79 - samples/sec: 2623.46 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:36:14,786 epoch 9 - iter 26/138 - loss 0.00516397 - time (sec): 1.52 - samples/sec: 2730.28 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:36:15,547 epoch 9 - iter 39/138 - loss 0.00459760 - time (sec): 2.28 - samples/sec: 2703.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:36:16,357 epoch 9 - iter 52/138 - loss 0.01948120 - time (sec): 3.09 - samples/sec: 2770.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:36:17,118 epoch 9 - iter 65/138 - loss 0.01938758 - time (sec): 3.85 - samples/sec: 2749.47 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:36:17,943 epoch 9 - iter 78/138 - loss 0.02191297 - time (sec): 4.68 - samples/sec: 2751.11 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:36:18,732 epoch 9 - iter 91/138 - loss 0.01910025 - time (sec): 5.47 - samples/sec: 2751.04 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:36:19,491 epoch 9 - iter 104/138 - loss 0.01905430 - time (sec): 6.23 - samples/sec: 2741.24 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:36:20,292 epoch 9 - iter 117/138 - loss 0.01906412 - time (sec): 7.03 - samples/sec: 2747.88 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:36:21,034 epoch 9 - iter 130/138 - loss 0.02012627 - time (sec): 7.77 - samples/sec: 2772.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:36:21,458 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:21,458 EPOCH 9 done: loss 0.0191 - lr: 0.000006
2023-10-17 08:36:22,181 DEV : loss 0.20274551212787628 - f1-score (micro avg) 0.8558
2023-10-17 08:36:22,186 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:22,923 epoch 10 - iter 13/138 - loss 0.00037874 - time (sec): 0.74 - samples/sec: 2816.38 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:36:23,701 epoch 10 - iter 26/138 - loss 0.00164556 - time (sec): 1.51 - samples/sec: 2860.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:36:24,464 epoch 10 - iter 39/138 - loss 0.00401456 - time (sec): 2.28 - samples/sec: 2919.05 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:36:25,258 epoch 10 - iter 52/138 - loss 0.01412709 - time (sec): 3.07 - samples/sec: 2880.68 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:36:25,995 epoch 10 - iter 65/138 - loss 0.01577883 - time (sec): 3.81 - samples/sec: 2837.15 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:36:26,765 epoch 10 - iter 78/138 - loss 0.01358062 - time (sec): 4.58 - samples/sec: 2842.94 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:36:27,536 epoch 10 - iter 91/138 - loss 0.01388950 - time (sec): 5.35 - samples/sec: 2848.01 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:36:28,287 epoch 10 - iter 104/138 - loss 0.01366988 - time (sec): 6.10 - samples/sec: 2867.03 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:36:29,019 epoch 10 - iter 117/138 - loss 0.01421039 - time (sec): 6.83 - samples/sec: 2841.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:36:29,736 epoch 10 - iter 130/138 - loss 0.01533784 - time (sec): 7.55 - samples/sec: 2839.22 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:36:30,198 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:30,198 EPOCH 10 done: loss 0.0146 - lr: 0.000000
2023-10-17 08:36:30,901 DEV : loss 0.20630405843257904 - f1-score (micro avg) 0.8599
2023-10-17 08:36:31,252 ----------------------------------------------------------------------------------------------------
2023-10-17 08:36:31,253 Loading model from best epoch ...
2023-10-17 08:36:32,566 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:36:33,363
Results:
- F-score (micro) 0.886
- F-score (macro) 0.6312
- Accuracy 0.8067

By class:
              precision    recall  f1-score   support

       scope     0.8851    0.8750    0.8800       176
        pers     0.9677    0.9375    0.9524       128
        work     0.7975    0.8514    0.8235        74
      object     0.0000    0.0000    0.0000         2
         loc     0.5000    0.5000    0.5000         2

   micro avg     0.8871    0.8848    0.8860       382
   macro avg     0.6301    0.6328    0.6312       382
weighted avg     0.8891    0.8848    0.8867       382

2023-10-17 08:36:33,363 ----------------------------------------------------------------------------------------------------
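The gap between the micro F-score (0.886) and the macro F-score (0.6312) in the results above is easier to read once the averages are recomputed from the per-class rows. The following is a minimal sketch, not part of the logged run: the dictionary and the helper names `macro_f1` and `weighted_f1` are assumptions introduced for illustration, and the numbers are copied by hand from the "By class" table.

```python
# Per-class (f1-score, support) pairs copied from the "By class" table in the log.
per_class = {
    "scope": (0.8800, 176),
    "pers": (0.9524, 128),
    "work": (0.8235, 74),
    "object": (0.0000, 2),
    "loc": (0.5000, 2),
}


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values."""
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(scores):
    """Mean of the per-class F1 values, weighted by class support."""
    total_support = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total_support


macro = macro_f1(per_class)
weighted = weighted_f1(per_class)
print(f"macro avg f1:    {macro:.4f}")
print(f"weighted avg f1: {weighted:.4f}")
```

Under these assumptions the recomputed values agree with the logged macro and weighted averages to four decimals. The macro average is pulled down by `object` and `loc`, which have only 2 test entities each, while the support-weighted figure is dominated by `scope`, `pers`, and `work`.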