2023-10-17 19:15:43,844 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 Train:  20847 sentences
2023-10-17 19:15:43,846         (train_with_dev=False, train_with_test=False)
2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 Training Params:
2023-10-17 19:15:43,847  - learning_rate: "3e-05"
2023-10-17 19:15:43,847  - mini_batch_size: "4"
2023-10-17 19:15:43,847  - max_epochs: "10"
2023-10-17 19:15:43,847  - shuffle: "True"
2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,847 Plugins:
2023-10-17 19:15:43,847  - TensorboardLogger
2023-10-17 19:15:43,847  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,847 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:15:43,847  - metric: "('micro avg', 'f1-score')"
2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,847 Computation:
2023-10-17 19:15:43,847  - compute on device: cuda:0
2023-10-17 19:15:43,847  - embedding storage: none
2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,848 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,848 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:16:25,838 epoch 1 - iter 521/5212 - loss 1.79081099 - time (sec): 41.99 - samples/sec: 897.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:17:08,595 epoch 1 - iter 1042/5212 - loss 1.11359267 - time (sec): 84.75 - samples/sec: 881.25 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:17:49,971 epoch 1 - iter 1563/5212 - loss 0.86063766 - time (sec): 126.12 - samples/sec: 874.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:18:32,756 epoch 1 - iter 2084/5212 - loss 0.72254731 - time (sec): 168.91 - samples/sec: 862.33 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:19:14,275 epoch 1 - iter 2605/5212 - loss 0.63113368 - time (sec): 210.42 - samples/sec: 854.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:19:55,829 epoch 1 - iter 3126/5212 - loss 0.56274254 - time (sec): 251.98 - samples/sec: 861.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:20:37,475 epoch 1 - iter 3647/5212 - loss 0.51180163 - time (sec): 293.63 - samples/sec: 863.11 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:21:20,549 epoch 1 - iter 4168/5212 - loss 0.47801506 - time (sec): 336.70 - samples/sec: 861.50 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:22:02,701 epoch 1 - iter 4689/5212 - loss 0.44903165 - time (sec): 378.85 - samples/sec: 858.92 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:22:45,402 epoch 1 - iter 5210/5212 - loss 0.42328665 - time (sec): 421.55 - samples/sec: 871.56 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:22:45,568 ----------------------------------------------------------------------------------------------------
2023-10-17 19:22:45,568 EPOCH 1 done: loss 0.4233 - lr: 0.000030
2023-10-17 19:22:53,561 DEV : loss 0.16118471324443817 - f1-score (micro avg)  0.3044
2023-10-17 19:22:53,624 saving best model
2023-10-17 19:22:54,240 ----------------------------------------------------------------------------------------------------
2023-10-17 19:23:37,950 epoch 2 - iter 521/5212 - loss 0.19745124 - time (sec): 43.71 - samples/sec: 841.63 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:24:20,587 epoch 2 - iter 1042/5212 - loss 0.18718357 - time (sec): 86.34 - samples/sec: 886.43 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:25:01,266 epoch 2 - iter 1563/5212 - loss 0.18426743 - time (sec): 127.02 - samples/sec: 883.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:25:43,672 epoch 2 - iter 2084/5212 - loss 0.18503011 - time (sec): 169.43 - samples/sec: 873.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:26:26,956 epoch 2 - iter 2605/5212 - loss 0.18746087 - time (sec): 212.71 - samples/sec: 860.20 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:27:10,029 epoch 2 - iter 3126/5212 - loss 0.18636111 - time (sec): 255.79 - samples/sec: 859.99 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:27:50,405 epoch 2 - iter 3647/5212 - loss 0.18553488 - time (sec): 296.16 - samples/sec: 859.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:28:31,258 epoch 2 - iter 4168/5212 - loss 0.18074757 - time (sec): 337.02 - samples/sec: 866.96 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:29:12,346 epoch 2 - iter 4689/5212 - loss 0.17930858 - time (sec): 378.10 - samples/sec: 871.71 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:29:53,662 epoch 2 - iter 5210/5212 - loss 0.17753408 - time (sec): 419.42 - samples/sec: 875.95 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:29:53,812 ----------------------------------------------------------------------------------------------------
2023-10-17 19:29:53,812 EPOCH 2 done: loss 0.1775 - lr: 0.000027
2023-10-17 19:30:05,947 DEV : loss 0.1658860296010971 - f1-score (micro avg)  0.328
2023-10-17 19:30:06,006 saving best model
2023-10-17 19:30:07,423 ----------------------------------------------------------------------------------------------------
2023-10-17 19:30:48,361 epoch 3 - iter 521/5212 - loss 0.12724045 - time (sec): 40.93 - samples/sec: 873.85 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:31:30,342 epoch 3 - iter 1042/5212 - loss 0.12659364 - time (sec): 82.91 - samples/sec: 885.08 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:32:11,449 epoch 3 - iter 1563/5212 - loss 0.13086892 - time (sec): 124.02 - samples/sec: 885.97 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:32:51,804 epoch 3 - iter 2084/5212 - loss 0.13601022 - time (sec): 164.38 - samples/sec: 885.41 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:33:33,570 epoch 3 - iter 2605/5212 - loss 0.13335118 - time (sec): 206.14 - samples/sec: 884.09 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:34:14,705 epoch 3 - iter 3126/5212 - loss 0.13049815 - time (sec): 247.28 - samples/sec: 895.14 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:34:54,896 epoch 3 - iter 3647/5212 - loss 0.13086755 - time (sec): 287.47 - samples/sec: 896.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:35:36,596 epoch 3 - iter 4168/5212 - loss 0.12984620 - time (sec): 329.17 - samples/sec: 897.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:36:17,637 epoch 3 - iter 4689/5212 - loss 0.12848690 - time (sec): 370.21 - samples/sec: 887.38 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:37:00,276 epoch 3 - iter 5210/5212 - loss 0.12780510 - time (sec): 412.85 - samples/sec: 889.84 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:37:00,434 ----------------------------------------------------------------------------------------------------
2023-10-17 19:37:00,434 EPOCH 3 done: loss 0.1278 - lr: 0.000023
2023-10-17 19:37:11,317 DEV : loss 0.2367718517780304 - f1-score (micro avg)  0.3325
2023-10-17 19:37:11,372 saving best model
2023-10-17 19:37:13,638 ----------------------------------------------------------------------------------------------------
2023-10-17 19:37:55,294 epoch 4 - iter 521/5212 - loss 0.08561438 - time (sec): 41.65 - samples/sec: 889.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:38:37,303 epoch 4 - iter 1042/5212 - loss 0.08487661 - time (sec): 83.66 - samples/sec: 902.31 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:39:17,623 epoch 4 - iter 1563/5212 - loss 0.08733029 - time (sec): 123.98 - samples/sec: 892.46 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:39:57,086 epoch 4 - iter 2084/5212 - loss 0.09068160 - time (sec): 163.44 - samples/sec: 900.17 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:40:38,544 epoch 4 - iter 2605/5212 - loss 0.08826575 - time (sec): 204.90 - samples/sec: 895.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:41:20,337 epoch 4 - iter 3126/5212 - loss 0.08927590 - time (sec): 246.70 - samples/sec: 890.13 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:42:02,639 epoch 4 - iter 3647/5212 - loss 0.09103052 - time (sec): 289.00 - samples/sec: 885.89 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:42:44,152 epoch 4 - iter 4168/5212 - loss 0.09120563 - time (sec): 330.51 - samples/sec: 889.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:43:24,805 epoch 4 - iter 4689/5212 - loss 0.09067149 - time (sec): 371.16 - samples/sec: 887.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:44:06,153 epoch 4 - iter 5210/5212 - loss 0.08963326 - time (sec): 412.51 - samples/sec: 890.27 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:44:06,304 ----------------------------------------------------------------------------------------------------
2023-10-17 19:44:06,304 EPOCH 4 done: loss 0.0896 - lr: 0.000020
2023-10-17 19:44:17,266 DEV : loss 0.37559980154037476 - f1-score (micro avg)  0.3333
2023-10-17 19:44:17,320 saving best model
2023-10-17 19:44:18,755 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:00,565 epoch 5 - iter 521/5212 - loss 0.07091184 - time (sec): 41.81 - samples/sec: 870.41 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:45:41,701 epoch 5 - iter 1042/5212 - loss 0.06709683 - time (sec): 82.94 - samples/sec: 876.92 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:25,201 epoch 5 - iter 1563/5212 - loss 0.06271210 - time (sec): 126.44 - samples/sec: 878.30 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:47:08,265 epoch 5 - iter 2084/5212 - loss 0.06183598 - time (sec): 169.51 - samples/sec: 876.76 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:47:51,557 epoch 5 - iter 2605/5212 - loss 0.06620122 - time (sec): 212.80 - samples/sec: 873.44 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:32,575 epoch 5 - iter 3126/5212 - loss 0.06541861 - time (sec): 253.82 - samples/sec: 879.21 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:49:13,264 epoch 5 - iter 3647/5212 - loss 0.06535621 - time (sec): 294.51 - samples/sec: 881.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:49:55,190 epoch 5 - iter 4168/5212 - loss 0.06634610 - time (sec): 336.43 - samples/sec: 874.02 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:50:36,970 epoch 5 - iter 4689/5212 - loss 0.06558366 - time (sec): 378.21 - samples/sec: 878.27 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:51:19,264 epoch 5 - iter 5210/5212 - loss 0.06576385 - time (sec): 420.50 - samples/sec: 873.25 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:51:19,446 ----------------------------------------------------------------------------------------------------
2023-10-17 19:51:19,447 EPOCH 5 done: loss 0.0657 - lr: 0.000017
2023-10-17 19:51:30,543 DEV : loss 0.37306851148605347 - f1-score (micro avg)  0.3721
2023-10-17 19:51:30,599 saving best model
2023-10-17 19:51:32,021 ----------------------------------------------------------------------------------------------------
2023-10-17 19:52:13,905 epoch 6 - iter 521/5212 - loss 0.04071800 - time (sec): 41.88 - samples/sec: 899.21 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:52:56,216 epoch 6 - iter 1042/5212 - loss 0.04067970 - time (sec): 84.19 - samples/sec: 851.43 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:53:38,996 epoch 6 - iter 1563/5212 - loss 0.04392675 - time (sec): 126.97 - samples/sec: 843.09 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:54:20,689 epoch 6 - iter 2084/5212 - loss 0.04624917 - time (sec): 168.66 - samples/sec: 841.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:02,737 epoch 6 - iter 2605/5212 - loss 0.04664802 - time (sec): 210.71 - samples/sec: 842.84 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:43,236 epoch 6 - iter 3126/5212 - loss 0.04611913 - time (sec): 251.21 - samples/sec: 856.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:56:25,765 epoch 6 - iter 3647/5212 - loss 0.04513070 - time (sec): 293.74 - samples/sec: 857.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:08,519 epoch 6 - iter 4168/5212 - loss 0.04430926 - time (sec): 336.49 - samples/sec: 867.81 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:51,149 epoch 6 - iter 4689/5212 - loss 0.04422683 - time (sec): 379.12 - samples/sec: 876.18 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:58:32,733 epoch 6 - iter 5210/5212 - loss 0.04435344 - time (sec): 420.71 - samples/sec: 873.18 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:58:32,886 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:32,887 EPOCH 6 done: loss 0.0444 - lr: 0.000013
2023-10-17 19:58:44,404 DEV : loss 0.38573741912841797 - f1-score (micro avg)  0.3998
2023-10-17 19:58:44,465 saving best model
2023-10-17 19:58:45,880 ----------------------------------------------------------------------------------------------------
2023-10-17 19:59:27,983 epoch 7 - iter 521/5212 - loss 0.03311573 - time (sec): 42.10 - samples/sec: 920.02 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:00:08,876 epoch 7 - iter 1042/5212 - loss 0.02677280 - time (sec): 82.99 - samples/sec: 903.20 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:00:50,108 epoch 7 - iter 1563/5212 - loss 0.03149182 - time (sec): 124.22 - samples/sec: 906.02 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:01:30,385 epoch 7 - iter 2084/5212 - loss 0.03097008 - time (sec): 164.50 - samples/sec: 895.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:02:12,354 epoch 7 - iter 2605/5212 - loss 0.03138727 - time (sec): 206.47 - samples/sec: 889.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:02:50,517 epoch 7 - iter 3126/5212 - loss 0.03238816 - time (sec): 244.63 - samples/sec: 907.86 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:03:28,571 epoch 7 - iter 3647/5212 - loss 0.03240198 - time (sec): 282.69 - samples/sec: 910.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:04:09,623 epoch 7 - iter 4168/5212 - loss 0.03231063 - time (sec): 323.74 - samples/sec: 903.97 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:04:50,358 epoch 7 - iter 4689/5212 - loss 0.03217252 - time (sec): 364.47 - samples/sec: 902.06 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:05:30,790 epoch 7 - iter 5210/5212 - loss 0.03119367 - time (sec): 404.90 - samples/sec: 907.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:05:30,938 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:30,939 EPOCH 7 done: loss 0.0312 - lr: 0.000010
2023-10-17 20:05:42,794 DEV : loss 0.4771367907524109 - f1-score (micro avg)  0.3705
2023-10-17 20:05:42,853 ----------------------------------------------------------------------------------------------------
2023-10-17 20:06:23,460 epoch 8 - iter 521/5212 - loss 0.02998807 - time (sec): 40.61 - samples/sec: 888.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:07:05,326 epoch 8 - iter 1042/5212 - loss 0.02453114 - time (sec): 82.47 - samples/sec: 879.19 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:07:46,564 epoch 8 - iter 1563/5212 - loss 0.02200622 - time (sec): 123.71 - samples/sec: 889.00 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:08:28,073 epoch 8 - iter 2084/5212 - loss 0.02307900 - time (sec): 165.22 - samples/sec: 888.53 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:09:09,714 epoch 8 - iter 2605/5212 - loss 0.02297405 - time (sec): 206.86 - samples/sec: 888.20 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:09:51,048 epoch 8 - iter 3126/5212 - loss 0.02283079 - time (sec): 248.19 - samples/sec: 883.30 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:10:33,437 epoch 8 - iter 3647/5212 - loss 0.02212019 - time (sec): 290.58 - samples/sec: 882.42 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:11:13,823 epoch 8 - iter 4168/5212 - loss 0.02167924 - time (sec): 330.97 - samples/sec: 883.31 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:11:54,496 epoch 8 - iter 4689/5212 - loss 0.02165052 - time (sec): 371.64 - samples/sec: 884.04 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:12:36,344 epoch 8 - iter 5210/5212 - loss 0.02175733 - time (sec): 413.49 - samples/sec: 888.16 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:12:36,512 ----------------------------------------------------------------------------------------------------
2023-10-17 20:12:36,512 EPOCH 8 done: loss 0.0217 - lr: 0.000007
2023-10-17 20:12:48,699 DEV : loss 0.4998532235622406 - f1-score (micro avg)  0.3543
2023-10-17 20:12:48,755 ----------------------------------------------------------------------------------------------------
2023-10-17 20:13:31,960 epoch 9 - iter 521/5212 - loss 0.00863824 - time (sec): 43.20 - samples/sec: 948.05 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:14:14,947 epoch 9 - iter 1042/5212 - loss 0.01171617 - time (sec): 86.19 - samples/sec: 908.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:14:56,079 epoch 9 - iter 1563/5212 - loss 0.01668874 - time (sec): 127.32 - samples/sec: 891.92 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:15:37,604 epoch 9 - iter 2084/5212 - loss 0.01665217 - time (sec): 168.85 - samples/sec: 894.47 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:16:21,106 epoch 9 - iter 2605/5212 - loss 0.01635385 - time (sec): 212.35 - samples/sec: 882.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:04,978 epoch 9 - iter 3126/5212 - loss 0.01550677 - time (sec): 256.22 - samples/sec: 872.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:47,321 epoch 9 - iter 3647/5212 - loss 0.01517082 - time (sec): 298.56 - samples/sec: 868.72 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:29,546 epoch 9 - iter 4168/5212 - loss 0.01578016 - time (sec): 340.79 - samples/sec: 872.86 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:19:12,396 epoch 9 - iter 4689/5212 - loss 0.01560147 - time (sec): 383.64 - samples/sec: 872.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:19:53,494 epoch 9 - iter 5210/5212 - loss 0.01545797 - time (sec): 424.74 - samples/sec: 864.75 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:19:53,648 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:53,648 EPOCH 9 done: loss 0.0155 - lr: 0.000003
2023-10-17 20:20:05,733 DEV : loss 0.5256008505821228 - f1-score (micro avg)  0.3744
2023-10-17 20:20:05,821 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:49,600 epoch 10 - iter 521/5212 - loss 0.00845977 - time (sec): 43.78 - samples/sec: 868.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:32,317 epoch 10 - iter 1042/5212 - loss 0.00922324 - time (sec): 86.49 - samples/sec: 897.78 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:22:12,750 epoch 10 - iter 1563/5212 - loss 0.00972860 - time (sec): 126.93 - samples/sec: 874.26 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:22:54,760 epoch 10 - iter 2084/5212 - loss 0.00952950 - time (sec): 168.94 - samples/sec: 881.10 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:23:39,004 epoch 10 - iter 2605/5212 - loss 0.00943308 - time (sec): 213.18 - samples/sec: 860.79 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:24:21,919 epoch 10 - iter 3126/5212 - loss 0.00987377 - time (sec): 256.09 - samples/sec: 859.62 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:25:08,498 epoch 10 - iter 3647/5212 - loss 0.00962367 - time (sec): 302.67 - samples/sec: 857.97 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:25:49,520 epoch 10 - iter 4168/5212 - loss 0.00965467 - time (sec): 343.70 - samples/sec: 862.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:26:29,257 epoch 10 - iter 4689/5212 - loss 0.00958044 - time (sec): 383.43 - samples/sec: 861.07 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:27:09,922 epoch 10 - iter 5210/5212 - loss 0.00919733 - time (sec): 424.10 - samples/sec: 866.15 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:27:10,080 ----------------------------------------------------------------------------------------------------
2023-10-17 20:27:10,080 EPOCH 10 done: loss 0.0092 - lr: 0.000000
2023-10-17 20:27:22,342 DEV : loss 0.5250148177146912 - f1-score (micro avg)  0.3723
2023-10-17 20:27:22,998 ----------------------------------------------------------------------------------------------------
2023-10-17 20:27:23,000 Loading model from best epoch ...
2023-10-17 20:27:25,469 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 20:27:44,580 Results:
- F-score (micro) 0.4828
- F-score (macro) 0.3302
- Accuracy 0.3225

By class:
              precision    recall  f1-score   support

         LOC     0.4639    0.6293    0.5341       1214
         PER     0.4487    0.4926    0.4696        808
         ORG     0.3199    0.3144    0.3171        353
   HumanProd     0.0000    0.0000    0.0000         15

   micro avg     0.4416    0.5326    0.4828       2390
   macro avg     0.3081    0.3591    0.3302       2390
weighted avg     0.4346    0.5326    0.4769       2390

2023-10-17 20:27:44,580 ----------------------------------------------------------------------------------------------------
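For reference, the learning-rate values in the log (ramp from 0 up to 3e-05 over epoch 1, then linear decay to 0 by the end of epoch 10) are consistent with the configured LinearScheduler with warmup_fraction 0.1. A minimal sketch of that schedule, assuming total steps = 10 epochs x 5212 iterations; the helper `linear_lr` is ours for illustration, not part of the Flair API:

```python
# Linear warmup + linear decay, matching the lr column of the log above.
# `linear_lr` is a hypothetical helper, not a Flair function.

PEAK_LR = 3e-05               # learning_rate from Training Params
TOTAL_STEPS = 10 * 5212       # max_epochs * iterations per epoch
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_fraction: 0.1 -> 5212 steps

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS  # linear warmup phase
    # linear decay from the peak down to 0 over the remaining steps
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

For example, iteration 521 of epoch 1 sits 10% into warmup, giving roughly 3e-06, which matches the first logged lr value.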
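The 17-tag dictionary listed when loading the best model is the BIOES encoding of the four entity types (LOC, PER, ORG, HumanProd) plus the O tag, which also explains `out_features=17` in the model's linear layer. A sketch of how such a tag set expands (illustrative only; Flair builds this dictionary internally from the corpus):

```python
# BIOES expansion: O plus {S, B, E, I} for each entity type -> 17 tags,
# in the same order as the dictionary printed in the log.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]

tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 1 + 4 * 4 = 17
```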
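The gap between the micro F-score (0.4828) and the macro F-score (0.3302) in the final results comes largely from HumanProd: with only 15 support and an F1 of 0, it pulls the unweighted macro average down while barely moving the pooled micro score. A sketch of how the two averages relate, using values copied from the table above:

```python
# Per-class F1 scores from the "By class" table in the log.
per_class_f1 = {"LOC": 0.5341, "PER": 0.4696, "ORG": 0.3171, "HumanProd": 0.0000}

# Macro F1: unweighted mean of per-class F1 (each class counts equally).
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Micro F1: harmonic mean of the pooled precision/recall
# (the "micro avg" row of the table).
micro_p, micro_r = 0.4416, 0.5326
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Both land close to the logged 0.3302 and 0.4828 respectively.
```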