2023-10-17 19:15:43,844 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 Train: 20847 sentences
2023-10-17 19:15:43,846 (train_with_dev=False, train_with_test=False)
2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,846 Training Params:
2023-10-17 19:15:43,847  - learning_rate: "3e-05"
2023-10-17 19:15:43,847  - mini_batch_size: "4"
2023-10-17 19:15:43,847  - max_epochs: "10"
2023-10-17 19:15:43,847  - shuffle: "True"
2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,847 Plugins:
2023-10-17 19:15:43,847  - TensorboardLogger
2023-10-17 19:15:43,847  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
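The LinearScheduler plugin with warmup_fraction '0.1' accounts for the lr column in the iteration logs that follow: over 10 epochs of 5212 batches each (52,120 optimizer steps), the learning rate climbs linearly to the peak of 3e-05 during the first 10% of steps (which here is exactly epoch 1), then decays linearly to zero. A minimal sketch of that schedule in plain Python (illustrative only, not Flair's implementation; the function name is made up):

```python
def linear_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: 0 -> peak_lr over the first warmup_steps
        return peak_lr * step / warmup_steps
    # decay phase: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 5212  # 10 epochs x 5212 batches per epoch
print(round(linear_lr(521, total), 6))       # 3e-06   (iter 521 of epoch 1)
print(round(linear_lr(5212, total), 6))      # 3e-05   (end of epoch 1)
print(round(linear_lr(2 * 5212, total), 6))  # 2.7e-05 (end of epoch 2)
```

These values agree with the lr column logged at the corresponding iterations.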
2023-10-17 19:15:43,847 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:15:43,847  - metric: "('micro avg', 'f1-score')"
2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,847 Computation:
2023-10-17 19:15:43,847  - compute on device: cuda:0
2023-10-17 19:15:43,847  - embedding storage: none
2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,848 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:43,848 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:16:25,838 epoch 1 - iter 521/5212 - loss 1.79081099 - time (sec): 41.99 - samples/sec: 897.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:17:08,595 epoch 1 - iter 1042/5212 - loss 1.11359267 - time (sec): 84.75 - samples/sec: 881.25 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:17:49,971 epoch 1 - iter 1563/5212 - loss 0.86063766 - time (sec): 126.12 - samples/sec: 874.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:18:32,756 epoch 1 - iter 2084/5212 - loss 0.72254731 - time (sec): 168.91 - samples/sec: 862.33 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:19:14,275 epoch 1 - iter 2605/5212 - loss 0.63113368 - time (sec): 210.42 - samples/sec: 854.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:19:55,829 epoch 1 - iter 3126/5212 - loss 0.56274254 - time (sec): 251.98 - samples/sec: 861.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:20:37,475 epoch 1 - iter 3647/5212 - loss 0.51180163 - time (sec): 293.63 - samples/sec: 863.11 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:21:20,549 epoch 1 - iter 4168/5212 - loss 0.47801506 - time (sec): 336.70 - samples/sec: 861.50 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:22:02,701 epoch 1 - iter 4689/5212 - loss 0.44903165 - time (sec): 378.85 - samples/sec: 858.92 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:22:45,402 epoch 1 - iter 5210/5212 - loss 0.42328665 - time (sec): 421.55 - samples/sec: 871.56 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:22:45,568 ----------------------------------------------------------------------------------------------------
2023-10-17 19:22:45,568 EPOCH 1 done: loss 0.4233 - lr: 0.000030
2023-10-17 19:22:53,561 DEV : loss 0.16118471324443817 - f1-score (micro avg) 0.3044
2023-10-17 19:22:53,624 saving best model
2023-10-17 19:22:54,240 ----------------------------------------------------------------------------------------------------
2023-10-17 19:23:37,950 epoch 2 - iter 521/5212 - loss 0.19745124 - time (sec): 43.71 - samples/sec: 841.63 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:24:20,587 epoch 2 - iter 1042/5212 - loss 0.18718357 - time (sec): 86.34 - samples/sec: 886.43 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:25:01,266 epoch 2 - iter 1563/5212 - loss 0.18426743 - time (sec): 127.02 - samples/sec: 883.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:25:43,672 epoch 2 - iter 2084/5212 - loss 0.18503011 - time (sec): 169.43 - samples/sec: 873.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:26:26,956 epoch 2 - iter 2605/5212 - loss 0.18746087 - time (sec): 212.71 - samples/sec: 860.20 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:27:10,029 epoch 2 - iter 3126/5212 - loss 0.18636111 - time (sec): 255.79 - samples/sec: 859.99 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:27:50,405 epoch 2 - iter 3647/5212 - loss 0.18553488 - time (sec): 296.16 - samples/sec: 859.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:28:31,258 epoch 2 - iter 4168/5212 - loss 0.18074757 - time (sec): 337.02 - samples/sec: 866.96 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:29:12,346 epoch 2 - iter 4689/5212 - loss 0.17930858 - time (sec): 378.10 - samples/sec: 871.71 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:29:53,662 epoch 2 - iter 5210/5212 - loss 0.17753408 - time (sec): 419.42 - samples/sec: 875.95 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:29:53,812 ----------------------------------------------------------------------------------------------------
2023-10-17 19:29:53,812 EPOCH 2 done: loss 0.1775 - lr: 0.000027
2023-10-17 19:30:05,947 DEV : loss 0.1658860296010971 - f1-score (micro avg) 0.328
2023-10-17 19:30:06,006 saving best model
2023-10-17 19:30:07,423 ----------------------------------------------------------------------------------------------------
2023-10-17 19:30:48,361 epoch 3 - iter 521/5212 - loss 0.12724045 - time (sec): 40.93 - samples/sec: 873.85 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:31:30,342 epoch 3 - iter 1042/5212 - loss 0.12659364 - time (sec): 82.91 - samples/sec: 885.08 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:32:11,449 epoch 3 - iter 1563/5212 - loss 0.13086892 - time (sec): 124.02 - samples/sec: 885.97 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:32:51,804 epoch 3 - iter 2084/5212 - loss 0.13601022 - time (sec): 164.38 - samples/sec: 885.41 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:33:33,570 epoch 3 - iter 2605/5212 - loss 0.13335118 - time (sec): 206.14 - samples/sec: 884.09 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:34:14,705 epoch 3 - iter 3126/5212 - loss 0.13049815 - time (sec): 247.28 - samples/sec: 895.14 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:34:54,896 epoch 3 - iter 3647/5212 - loss 0.13086755 - time (sec): 287.47 - samples/sec: 896.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:35:36,596 epoch 3 - iter 4168/5212 - loss 0.12984620 - time (sec): 329.17 - samples/sec: 897.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:36:17,637 epoch 3 - iter 4689/5212 - loss 0.12848690 - time (sec): 370.21 - samples/sec: 887.38 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:37:00,276 epoch 3 - iter 5210/5212 - loss 0.12780510 - time (sec): 412.85 - samples/sec: 889.84 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:37:00,434 ----------------------------------------------------------------------------------------------------
2023-10-17 19:37:00,434 EPOCH 3 done: loss 0.1278 - lr: 0.000023
2023-10-17 19:37:11,317 DEV : loss 0.2367718517780304 - f1-score (micro avg) 0.3325
2023-10-17 19:37:11,372 saving best model
2023-10-17 19:37:13,638 ----------------------------------------------------------------------------------------------------
2023-10-17 19:37:55,294 epoch 4 - iter 521/5212 - loss 0.08561438 - time (sec): 41.65 - samples/sec: 889.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:38:37,303 epoch 4 - iter 1042/5212 - loss 0.08487661 - time (sec): 83.66 - samples/sec: 902.31 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:39:17,623 epoch 4 - iter 1563/5212 - loss 0.08733029 - time (sec): 123.98 - samples/sec: 892.46 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:39:57,086 epoch 4 - iter 2084/5212 - loss 0.09068160 - time (sec): 163.44 - samples/sec: 900.17 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:40:38,544 epoch 4 - iter 2605/5212 - loss 0.08826575 - time (sec): 204.90 - samples/sec: 895.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:41:20,337 epoch 4 - iter 3126/5212 - loss 0.08927590 - time (sec): 246.70 - samples/sec: 890.13 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:42:02,639 epoch 4 - iter 3647/5212 - loss 0.09103052 - time (sec): 289.00 - samples/sec: 885.89 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:42:44,152 epoch 4 - iter 4168/5212 - loss 0.09120563 - time (sec): 330.51 - samples/sec: 889.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:43:24,805 epoch 4 - iter 4689/5212 - loss 0.09067149 - time (sec): 371.16 - samples/sec: 887.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:44:06,153 epoch 4 - iter 5210/5212 - loss 0.08963326 - time (sec): 412.51 - samples/sec: 890.27 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:44:06,304 ----------------------------------------------------------------------------------------------------
2023-10-17 19:44:06,304 EPOCH 4 done: loss 0.0896 - lr: 0.000020
2023-10-17 19:44:17,266 DEV : loss 0.37559980154037476 - f1-score (micro avg) 0.3333
2023-10-17 19:44:17,320 saving best model
2023-10-17 19:44:18,755 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:00,565 epoch 5 - iter 521/5212 - loss 0.07091184 - time (sec): 41.81 - samples/sec: 870.41 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:45:41,701 epoch 5 - iter 1042/5212 - loss 0.06709683 - time (sec): 82.94 - samples/sec: 876.92 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:25,201 epoch 5 - iter 1563/5212 - loss 0.06271210 - time (sec): 126.44 - samples/sec: 878.30 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:47:08,265 epoch 5 - iter 2084/5212 - loss 0.06183598 - time (sec): 169.51 - samples/sec: 876.76 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:47:51,557 epoch 5 - iter 2605/5212 - loss 0.06620122 - time (sec): 212.80 - samples/sec: 873.44 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:32,575 epoch 5 - iter 3126/5212 - loss 0.06541861 - time (sec): 253.82 - samples/sec: 879.21 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:49:13,264 epoch 5 - iter 3647/5212 - loss 0.06535621 - time (sec): 294.51 - samples/sec: 881.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:49:55,190 epoch 5 - iter 4168/5212 - loss 0.06634610 - time (sec): 336.43 - samples/sec: 874.02 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:50:36,970 epoch 5 - iter 4689/5212 - loss 0.06558366 - time (sec): 378.21 - samples/sec: 878.27 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:51:19,264 epoch 5 - iter 5210/5212 - loss 0.06576385 - time (sec): 420.50 - samples/sec: 873.25 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:51:19,446 ----------------------------------------------------------------------------------------------------
2023-10-17 19:51:19,447 EPOCH 5 done: loss 0.0657 - lr: 0.000017
2023-10-17 19:51:30,543 DEV : loss 0.37306851148605347 - f1-score (micro avg) 0.3721
2023-10-17 19:51:30,599 saving best model
2023-10-17 19:51:32,021 ----------------------------------------------------------------------------------------------------
2023-10-17 19:52:13,905 epoch 6 - iter 521/5212 - loss 0.04071800 - time (sec): 41.88 - samples/sec: 899.21 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:52:56,216 epoch 6 - iter 1042/5212 - loss 0.04067970 - time (sec): 84.19 - samples/sec: 851.43 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:53:38,996 epoch 6 - iter 1563/5212 - loss 0.04392675 - time (sec): 126.97 - samples/sec: 843.09 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:54:20,689 epoch 6 - iter 2084/5212 - loss 0.04624917 - time (sec): 168.66 - samples/sec: 841.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:02,737 epoch 6 - iter 2605/5212 - loss 0.04664802 - time (sec): 210.71 - samples/sec: 842.84 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:43,236 epoch 6 - iter 3126/5212 - loss 0.04611913 - time (sec): 251.21 - samples/sec: 856.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:56:25,765 epoch 6 - iter 3647/5212 - loss 0.04513070 - time (sec): 293.74 - samples/sec: 857.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:08,519 epoch 6 - iter 4168/5212 - loss 0.04430926 - time (sec): 336.49 - samples/sec: 867.81 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:51,149 epoch 6 - iter 4689/5212 - loss 0.04422683 - time (sec): 379.12 - samples/sec: 876.18 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:58:32,733 epoch 6 - iter 5210/5212 - loss 0.04435344 - time (sec): 420.71 - samples/sec: 873.18 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:58:32,886 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:32,887 EPOCH 6 done: loss 0.0444 - lr: 0.000013
2023-10-17 19:58:44,404 DEV : loss 0.38573741912841797 - f1-score (micro avg) 0.3998
2023-10-17 19:58:44,465 saving best model
2023-10-17 19:58:45,880 ----------------------------------------------------------------------------------------------------
2023-10-17 19:59:27,983 epoch 7 - iter 521/5212 - loss 0.03311573 - time (sec): 42.10 - samples/sec: 920.02 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:00:08,876 epoch 7 - iter 1042/5212 - loss 0.02677280 - time (sec): 82.99 - samples/sec: 903.20 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:00:50,108 epoch 7 - iter 1563/5212 - loss 0.03149182 - time (sec): 124.22 - samples/sec: 906.02 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:01:30,385 epoch 7 - iter 2084/5212 - loss 0.03097008 - time (sec): 164.50 - samples/sec: 895.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:02:12,354 epoch 7 - iter 2605/5212 - loss 0.03138727 - time (sec): 206.47 - samples/sec: 889.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:02:50,517 epoch 7 - iter 3126/5212 - loss 0.03238816 - time (sec): 244.63 - samples/sec: 907.86 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:03:28,571 epoch 7 - iter 3647/5212 - loss 0.03240198 - time (sec): 282.69 - samples/sec: 910.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:04:09,623 epoch 7 - iter 4168/5212 - loss 0.03231063 - time (sec): 323.74 - samples/sec: 903.97 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:04:50,358 epoch 7 - iter 4689/5212 - loss 0.03217252 - time (sec): 364.47 - samples/sec: 902.06 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:05:30,790 epoch 7 - iter 5210/5212 - loss 0.03119367 - time (sec): 404.90 - samples/sec: 907.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:05:30,938 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:30,939 EPOCH 7 done: loss 0.0312 - lr: 0.000010
2023-10-17 20:05:42,794 DEV : loss 0.4771367907524109 - f1-score (micro avg) 0.3705
2023-10-17 20:05:42,853 ----------------------------------------------------------------------------------------------------
2023-10-17 20:06:23,460 epoch 8 - iter 521/5212 - loss 0.02998807 - time (sec): 40.61 - samples/sec: 888.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:07:05,326 epoch 8 - iter 1042/5212 - loss 0.02453114 - time (sec): 82.47 - samples/sec: 879.19 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:07:46,564 epoch 8 - iter 1563/5212 - loss 0.02200622 - time (sec): 123.71 - samples/sec: 889.00 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:08:28,073 epoch 8 - iter 2084/5212 - loss 0.02307900 - time (sec): 165.22 - samples/sec: 888.53 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:09:09,714 epoch 8 - iter 2605/5212 - loss 0.02297405 - time (sec): 206.86 - samples/sec: 888.20 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:09:51,048 epoch 8 - iter 3126/5212 - loss 0.02283079 - time (sec): 248.19 - samples/sec: 883.30 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:10:33,437 epoch 8 - iter 3647/5212 - loss 0.02212019 - time (sec): 290.58 - samples/sec: 882.42 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:11:13,823 epoch 8 - iter 4168/5212 - loss 0.02167924 - time (sec): 330.97 - samples/sec: 883.31 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:11:54,496 epoch 8 - iter 4689/5212 - loss 0.02165052 - time (sec): 371.64 - samples/sec: 884.04 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:12:36,344 epoch 8 - iter 5210/5212 - loss 0.02175733 - time (sec): 413.49 - samples/sec: 888.16 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:12:36,512 ----------------------------------------------------------------------------------------------------
2023-10-17 20:12:36,512 EPOCH 8 done: loss 0.0217 - lr: 0.000007
2023-10-17 20:12:48,699 DEV : loss 0.4998532235622406 - f1-score (micro avg) 0.3543
2023-10-17 20:12:48,755 ----------------------------------------------------------------------------------------------------
2023-10-17 20:13:31,960 epoch 9 - iter 521/5212 - loss 0.00863824 - time (sec): 43.20 - samples/sec: 948.05 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:14:14,947 epoch 9 - iter 1042/5212 - loss 0.01171617 - time (sec): 86.19 - samples/sec: 908.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:14:56,079 epoch 9 - iter 1563/5212 - loss 0.01668874 - time (sec): 127.32 - samples/sec: 891.92 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:15:37,604 epoch 9 - iter 2084/5212 - loss 0.01665217 - time (sec): 168.85 - samples/sec: 894.47 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:16:21,106 epoch 9 - iter 2605/5212 - loss 0.01635385 - time (sec): 212.35 - samples/sec: 882.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:04,978 epoch 9 - iter 3126/5212 - loss 0.01550677 - time (sec): 256.22 - samples/sec: 872.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:47,321 epoch 9 - iter 3647/5212 - loss 0.01517082 - time (sec): 298.56 - samples/sec: 868.72 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:29,546 epoch 9 - iter 4168/5212 - loss 0.01578016 - time (sec): 340.79 - samples/sec: 872.86 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:19:12,396 epoch 9 - iter 4689/5212 - loss 0.01560147 - time (sec): 383.64 - samples/sec: 872.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:19:53,494 epoch 9 - iter 5210/5212 - loss 0.01545797 - time (sec): 424.74 - samples/sec: 864.75 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:19:53,648 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:53,648 EPOCH 9 done: loss 0.0155 - lr: 0.000003
2023-10-17 20:20:05,733 DEV : loss 0.5256008505821228 - f1-score (micro avg) 0.3744
2023-10-17 20:20:05,821 ----------------------------------------------------------------------------------------------------
2023-10-17 20:20:49,600 epoch 10 - iter 521/5212 - loss 0.00845977 - time (sec): 43.78 - samples/sec: 868.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:21:32,317 epoch 10 - iter 1042/5212 - loss 0.00922324 - time (sec): 86.49 - samples/sec: 897.78 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:22:12,750 epoch 10 - iter 1563/5212 - loss 0.00972860 - time (sec): 126.93 - samples/sec: 874.26 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:22:54,760 epoch 10 - iter 2084/5212 - loss 0.00952950 - time (sec): 168.94 - samples/sec: 881.10 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:23:39,004 epoch 10 - iter 2605/5212 - loss 0.00943308 - time (sec): 213.18 - samples/sec: 860.79 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:24:21,919 epoch 10 - iter 3126/5212 - loss 0.00987377 - time (sec): 256.09 - samples/sec: 859.62 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:25:08,498 epoch 10 - iter 3647/5212 - loss 0.00962367 - time (sec): 302.67 - samples/sec: 857.97 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:25:49,520 epoch 10 - iter 4168/5212 - loss 0.00965467 - time (sec): 343.70 - samples/sec: 862.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:26:29,257 epoch 10 - iter 4689/5212 - loss 0.00958044 - time (sec): 383.43 - samples/sec: 861.07 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:27:09,922 epoch 10 - iter 5210/5212 - loss 0.00919733 - time (sec): 424.10 - samples/sec: 866.15 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:27:10,080 ----------------------------------------------------------------------------------------------------
2023-10-17 20:27:10,080 EPOCH 10 done: loss 0.0092 - lr: 0.000000
2023-10-17 20:27:22,342 DEV : loss 0.5250148177146912 - f1-score (micro avg) 0.3723
2023-10-17 20:27:22,998 ----------------------------------------------------------------------------------------------------
2023-10-17 20:27:23,000 Loading model from best epoch ...
2023-10-17 20:27:25,469 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
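The 17 tags follow the BIOES scheme over the four entity types (LOC, PER, ORG, HumanProd): S- marks a single-token entity, B-/I-/E- mark the beginning, inside, and end of a multi-token entity, and O is outside any entity. A minimal sketch of how such a tag sequence decodes into entity spans (an illustrative helper, not Flair's own decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "S":                        # single-token entity
            spans.append((etype, i, i + 1))
            start, label = None, None
        elif prefix == "B":                      # multi-token entity begins
            start, label = i, etype
        elif prefix == "E" and label == etype:   # entity ends, emit the span
            spans.append((etype, start, i + 1))
            start, label = None, None
        # "I-" tags simply continue the current entity
    return spans

print(bioes_to_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
# [('LOC', 0, 1), ('PER', 2, 5)]
```

The linear output layer above (out_features=17) scores exactly these 17 tags per token.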
2023-10-17 20:27:44,580
Results:
- F-score (micro) 0.4828
- F-score (macro) 0.3302
- Accuracy 0.3225

By class:
              precision    recall  f1-score   support

         LOC     0.4639    0.6293    0.5341      1214
         PER     0.4487    0.4926    0.4696       808
         ORG     0.3199    0.3144    0.3171       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4416    0.5326    0.4828      2390
   macro avg     0.3081    0.3591    0.3302      2390
weighted avg     0.4346    0.5326    0.4769      2390
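The summary rows follow from the per-class rows: the macro average is the unweighted mean over the four classes, the weighted average weights each class by its support, and the micro average additionally requires the raw TP/FP/FN counts. A quick sanity check of the macro and weighted rows in plain Python (figures copied from the table above):

```python
# (precision, recall, f1-score, support) per class, copied from the table
by_class = {
    "LOC":       (0.4639, 0.6293, 0.5341, 1214),
    "PER":       (0.4487, 0.4926, 0.4696,  808),
    "ORG":       (0.3199, 0.3144, 0.3171,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

total_support = sum(s for *_, s in by_class.values())
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

print(total_support)          # 2390
print(round(macro_f1, 4))     # 0.3302
print(round(weighted_f1, 4))  # 0.4769
```

HumanProd's zero scores drag the macro average well below the micro score, which is typical when a rare class (support 15) is never predicted correctly.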
2023-10-17 20:27:44,580 ----------------------------------------------------------------------------------------------------