2023-10-17 20:39:11,480 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,481 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:39:11,481 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,481 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
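
The data is the Swedish NewsEye subset of HIPE-2022; the cached path above is where Flair's dataset loader writes it. A possible way to obtain the same splits is sketched below; the loader's keyword arguments are assumptions read off that path and may differ between Flair versions.

    # Sketch, not verified against the exact training script: the NER_HIPE_2022
    # keyword arguments below are assumptions based on the dataset path above.
    from flair.datasets import NER_HIPE_2022

    corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
    print(corpus)  # expected: 1085 train + 148 dev + 364 test sentences

    label_dict = corpus.make_label_dictionary(label_type="ner")
    print(label_dict)  # four entity types: LOC, PER, ORG, HumanProd
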
2023-10-17 20:39:11,481 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,481 Train: 1085 sentences
2023-10-17 20:39:11,482 (train_with_dev=False, train_with_test=False)
2023-10-17 20:39:11,482 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,482 Training Params:
2023-10-17 20:39:11,482 - learning_rate: "3e-05"
2023-10-17 20:39:11,482 - mini_batch_size: "8"
2023-10-17 20:39:11,482 - max_epochs: "10"
2023-10-17 20:39:11,482 - shuffle: "True"
2023-10-17 20:39:11,482 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,482 Plugins:
2023-10-17 20:39:11,482 - TensorboardLogger
2023-10-17 20:39:11,482 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:39:11,482 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,482 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:39:11,482 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:39:11,482 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,482 Computation:
2023-10-17 20:39:11,482 - compute on device: cuda:0
2023-10-17 20:39:11,482 - embedding storage: none
2023-10-17 20:39:11,482 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,482 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 20:39:11,482 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,482 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:11,482 Logging anything other than scalars to TensorBoard is currently not supported.
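
Taken together, the settings logged above (learning rate 3e-05, mini-batch size 8, 10 epochs, linear LR schedule with 10% warmup, best model selected on dev micro F1 and kept as best-model.pt) correspond to Flair's standard transformer fine-tuning loop. A condensed sketch of such a run is shown below; the corpus and backbone details repeat the assumptions from the earlier sketches.

    # Condensed sketch of a comparable fine-tuning run; backbone name and
    # NER_HIPE_2022 arguments are assumptions inferred from this log.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,            # not used in the forward pass since use_rnn=False
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,              # "crfFalse" in the base path; plain softmax head
        use_rnn=False,              # matches the dump: LockedDropout + Linear only
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5",
        learning_rate=3e-5,
        mini_batch_size=8,
        max_epochs=10,
        # fine_tune() applies a linear LR schedule with warmup, which is what the
        # LinearScheduler plugin (warmup_fraction 0.1) above reflects.
    )
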
2023-10-17 20:39:12,671 epoch 1 - iter 13/136 - loss 3.66351920 - time (sec): 1.19 - samples/sec: 4185.66 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:39:14,181 epoch 1 - iter 26/136 - loss 3.35094627 - time (sec): 2.70 - samples/sec: 3722.76 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:39:15,811 epoch 1 - iter 39/136 - loss 2.78889854 - time (sec): 4.33 - samples/sec: 3789.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:39:17,138 epoch 1 - iter 52/136 - loss 2.32130470 - time (sec): 5.66 - samples/sec: 3843.28 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:39:18,491 epoch 1 - iter 65/136 - loss 2.05154624 - time (sec): 7.01 - samples/sec: 3723.34 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:39:19,960 epoch 1 - iter 78/136 - loss 1.82218984 - time (sec): 8.48 - samples/sec: 3638.21 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:39:21,219 epoch 1 - iter 91/136 - loss 1.64923175 - time (sec): 9.74 - samples/sec: 3622.53 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:39:22,378 epoch 1 - iter 104/136 - loss 1.49725228 - time (sec): 10.89 - samples/sec: 3659.34 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:39:23,642 epoch 1 - iter 117/136 - loss 1.36546981 - time (sec): 12.16 - samples/sec: 3680.93 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:39:25,121 epoch 1 - iter 130/136 - loss 1.23847244 - time (sec): 13.64 - samples/sec: 3690.28 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:39:25,573 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:25,574 EPOCH 1 done: loss 1.2071 - lr: 0.000028
2023-10-17 20:39:26,432 DEV : loss 0.20332075655460358 - f1-score (micro avg) 0.5808
2023-10-17 20:39:26,436 saving best model
2023-10-17 20:39:26,828 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:27,986 epoch 2 - iter 13/136 - loss 0.18587840 - time (sec): 1.16 - samples/sec: 4009.69 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:39:29,351 epoch 2 - iter 26/136 - loss 0.19781664 - time (sec): 2.52 - samples/sec: 3563.72 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:39:30,710 epoch 2 - iter 39/136 - loss 0.20407721 - time (sec): 3.88 - samples/sec: 3624.52 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:39:32,113 epoch 2 - iter 52/136 - loss 0.19101613 - time (sec): 5.28 - samples/sec: 3685.94 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:39:33,309 epoch 2 - iter 65/136 - loss 0.18614276 - time (sec): 6.48 - samples/sec: 3715.90 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:39:34,677 epoch 2 - iter 78/136 - loss 0.17566842 - time (sec): 7.85 - samples/sec: 3735.36 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:39:36,076 epoch 2 - iter 91/136 - loss 0.18694374 - time (sec): 9.25 - samples/sec: 3712.42 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:39:37,199 epoch 2 - iter 104/136 - loss 0.18726542 - time (sec): 10.37 - samples/sec: 3713.87 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:39:38,767 epoch 2 - iter 117/136 - loss 0.18197226 - time (sec): 11.94 - samples/sec: 3700.41 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:39:40,506 epoch 2 - iter 130/136 - loss 0.17741685 - time (sec): 13.68 - samples/sec: 3667.22 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:39:41,046 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:41,046 EPOCH 2 done: loss 0.1749 - lr: 0.000027
2023-10-17 20:39:42,770 DEV : loss 0.13270148634910583 - f1-score (micro avg) 0.7061
2023-10-17 20:39:42,774 saving best model
2023-10-17 20:39:43,238 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:44,421 epoch 3 - iter 13/136 - loss 0.12680936 - time (sec): 1.18 - samples/sec: 3578.39 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:39:45,887 epoch 3 - iter 26/136 - loss 0.10390425 - time (sec): 2.64 - samples/sec: 3589.08 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:39:47,329 epoch 3 - iter 39/136 - loss 0.10053727 - time (sec): 4.09 - samples/sec: 3574.81 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:39:48,540 epoch 3 - iter 52/136 - loss 0.10624829 - time (sec): 5.30 - samples/sec: 3648.00 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:39:49,952 epoch 3 - iter 65/136 - loss 0.10754921 - time (sec): 6.71 - samples/sec: 3654.59 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:39:51,355 epoch 3 - iter 78/136 - loss 0.10617729 - time (sec): 8.11 - samples/sec: 3717.01 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:39:52,709 epoch 3 - iter 91/136 - loss 0.10565851 - time (sec): 9.47 - samples/sec: 3679.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:39:54,070 epoch 3 - iter 104/136 - loss 0.10009536 - time (sec): 10.83 - samples/sec: 3660.77 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:39:55,781 epoch 3 - iter 117/136 - loss 0.09902985 - time (sec): 12.54 - samples/sec: 3632.31 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:39:57,000 epoch 3 - iter 130/136 - loss 0.09674527 - time (sec): 13.76 - samples/sec: 3644.38 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:39:57,560 ----------------------------------------------------------------------------------------------------
2023-10-17 20:39:57,560 EPOCH 3 done: loss 0.0993 - lr: 0.000024
2023-10-17 20:39:59,043 DEV : loss 0.0991288274526596 - f1-score (micro avg) 0.7614
2023-10-17 20:39:59,049 saving best model
2023-10-17 20:39:59,521 ----------------------------------------------------------------------------------------------------
2023-10-17 20:40:00,996 epoch 4 - iter 13/136 - loss 0.05282612 - time (sec): 1.47 - samples/sec: 3232.44 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:40:02,432 epoch 4 - iter 26/136 - loss 0.06127360 - time (sec): 2.91 - samples/sec: 3346.40 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:40:03,973 epoch 4 - iter 39/136 - loss 0.05866787 - time (sec): 4.45 - samples/sec: 3296.44 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:40:05,425 epoch 4 - iter 52/136 - loss 0.05512121 - time (sec): 5.90 - samples/sec: 3341.79 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:40:07,041 epoch 4 - iter 65/136 - loss 0.05533475 - time (sec): 7.52 - samples/sec: 3417.43 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:40:08,319 epoch 4 - iter 78/136 - loss 0.05593976 - time (sec): 8.80 - samples/sec: 3483.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:40:09,788 epoch 4 - iter 91/136 - loss 0.05569689 - time (sec): 10.26 - samples/sec: 3456.40 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:40:11,121 epoch 4 - iter 104/136 - loss 0.06146163 - time (sec): 11.60 - samples/sec: 3476.78 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:40:12,351 epoch 4 - iter 117/136 - loss 0.06045633 - time (sec): 12.83 - samples/sec: 3511.82 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:40:13,785 epoch 4 - iter 130/136 - loss 0.05823953 - time (sec): 14.26 - samples/sec: 3489.25 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:40:14,386 ----------------------------------------------------------------------------------------------------
2023-10-17 20:40:14,387 EPOCH 4 done: loss 0.0590 - lr: 0.000020
2023-10-17 20:40:15,849 DEV : loss 0.11398832499980927 - f1-score (micro avg) 0.7549
2023-10-17 20:40:15,853 ----------------------------------------------------------------------------------------------------
2023-10-17 20:40:17,373 epoch 5 - iter 13/136 - loss 0.05351999 - time (sec): 1.52 - samples/sec: 3707.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:40:18,839 epoch 5 - iter 26/136 - loss 0.04693685 - time (sec): 2.98 - samples/sec: 3751.47 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:40:20,118 epoch 5 - iter 39/136 - loss 0.04291263 - time (sec): 4.26 - samples/sec: 3676.75 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:40:21,617 epoch 5 - iter 52/136 - loss 0.03825795 - time (sec): 5.76 - samples/sec: 3663.68 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:40:23,117 epoch 5 - iter 65/136 - loss 0.04225632 - time (sec): 7.26 - samples/sec: 3631.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:40:24,348 epoch 5 - iter 78/136 - loss 0.04081450 - time (sec): 8.49 - samples/sec: 3627.04 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:40:25,720 epoch 5 - iter 91/136 - loss 0.04086109 - time (sec): 9.87 - samples/sec: 3608.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:40:26,858 epoch 5 - iter 104/136 - loss 0.04108347 - time (sec): 11.00 - samples/sec: 3646.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:40:28,307 epoch 5 - iter 117/136 - loss 0.04124379 - time (sec): 12.45 - samples/sec: 3647.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:40:29,651 epoch 5 - iter 130/136 - loss 0.04185840 - time (sec): 13.80 - samples/sec: 3638.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:40:30,172 ----------------------------------------------------------------------------------------------------
2023-10-17 20:40:30,172 EPOCH 5 done: loss 0.0429 - lr: 0.000017
2023-10-17 20:40:31,636 DEV : loss 0.12549878656864166 - f1-score (micro avg) 0.7956
2023-10-17 20:40:31,640 saving best model
2023-10-17 20:40:32,126 ----------------------------------------------------------------------------------------------------
2023-10-17 20:40:33,534 epoch 6 - iter 13/136 - loss 0.02881144 - time (sec): 1.41 - samples/sec: 3590.06 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:40:34,762 epoch 6 - iter 26/136 - loss 0.03664492 - time (sec): 2.63 - samples/sec: 3601.53 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:40:36,113 epoch 6 - iter 39/136 - loss 0.03420146 - time (sec): 3.98 - samples/sec: 3672.73 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:40:37,606 epoch 6 - iter 52/136 - loss 0.03091671 - time (sec): 5.48 - samples/sec: 3619.03 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:40:39,092 epoch 6 - iter 65/136 - loss 0.02845803 - time (sec): 6.96 - samples/sec: 3593.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:40:40,730 epoch 6 - iter 78/136 - loss 0.02735186 - time (sec): 8.60 - samples/sec: 3571.10 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:40:42,066 epoch 6 - iter 91/136 - loss 0.02646650 - time (sec): 9.94 - samples/sec: 3550.71 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:40:43,402 epoch 6 - iter 104/136 - loss 0.02865805 - time (sec): 11.27 - samples/sec: 3552.07 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:40:44,931 epoch 6 - iter 117/136 - loss 0.02748370 - time (sec): 12.80 - samples/sec: 3559.68 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:40:46,191 epoch 6 - iter 130/136 - loss 0.02689993 - time (sec): 14.06 - samples/sec: 3550.44 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:40:46,715 ----------------------------------------------------------------------------------------------------
2023-10-17 20:40:46,715 EPOCH 6 done: loss 0.0276 - lr: 0.000014
2023-10-17 20:40:48,167 DEV : loss 0.13069528341293335 - f1-score (micro avg) 0.8163
2023-10-17 20:40:48,172 saving best model
2023-10-17 20:40:48,648 ----------------------------------------------------------------------------------------------------
2023-10-17 20:40:50,165 epoch 7 - iter 13/136 - loss 0.00713212 - time (sec): 1.51 - samples/sec: 3240.87 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:40:51,776 epoch 7 - iter 26/136 - loss 0.00668770 - time (sec): 3.12 - samples/sec: 3534.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:40:53,147 epoch 7 - iter 39/136 - loss 0.00787348 - time (sec): 4.50 - samples/sec: 3413.47 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:40:54,368 epoch 7 - iter 52/136 - loss 0.00951842 - time (sec): 5.72 - samples/sec: 3414.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:40:55,775 epoch 7 - iter 65/136 - loss 0.01119866 - time (sec): 7.12 - samples/sec: 3514.76 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:40:57,162 epoch 7 - iter 78/136 - loss 0.01398365 - time (sec): 8.51 - samples/sec: 3598.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:40:58,452 epoch 7 - iter 91/136 - loss 0.01492819 - time (sec): 9.80 - samples/sec: 3635.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:40:59,830 epoch 7 - iter 104/136 - loss 0.01779001 - time (sec): 11.18 - samples/sec: 3604.09 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:41:01,229 epoch 7 - iter 117/136 - loss 0.01803445 - time (sec): 12.58 - samples/sec: 3603.26 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:41:02,773 epoch 7 - iter 130/136 - loss 0.01711632 - time (sec): 14.12 - samples/sec: 3570.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:41:03,311 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:03,311 EPOCH 7 done: loss 0.0176 - lr: 0.000010
2023-10-17 20:41:04,766 DEV : loss 0.14296813309192657 - f1-score (micro avg) 0.797
2023-10-17 20:41:04,771 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:06,137 epoch 8 - iter 13/136 - loss 0.02668612 - time (sec): 1.37 - samples/sec: 3487.77 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:41:07,516 epoch 8 - iter 26/136 - loss 0.01656597 - time (sec): 2.74 - samples/sec: 3517.33 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:41:08,875 epoch 8 - iter 39/136 - loss 0.02124971 - time (sec): 4.10 - samples/sec: 3480.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:41:10,335 epoch 8 - iter 52/136 - loss 0.01963592 - time (sec): 5.56 - samples/sec: 3450.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:41:11,663 epoch 8 - iter 65/136 - loss 0.01915559 - time (sec): 6.89 - samples/sec: 3486.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:41:13,484 epoch 8 - iter 78/136 - loss 0.01774397 - time (sec): 8.71 - samples/sec: 3471.36 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:41:14,773 epoch 8 - iter 91/136 - loss 0.01724397 - time (sec): 10.00 - samples/sec: 3499.29 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:41:16,117 epoch 8 - iter 104/136 - loss 0.01668140 - time (sec): 11.35 - samples/sec: 3552.04 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:41:17,408 epoch 8 - iter 117/136 - loss 0.01634825 - time (sec): 12.64 - samples/sec: 3538.35 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:41:18,726 epoch 8 - iter 130/136 - loss 0.01685340 - time (sec): 13.95 - samples/sec: 3587.79 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:41:19,222 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:19,222 EPOCH 8 done: loss 0.0163 - lr: 0.000007
2023-10-17 20:41:20,691 DEV : loss 0.15952429175376892 - f1-score (micro avg) 0.7905
2023-10-17 20:41:20,695 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:22,011 epoch 9 - iter 13/136 - loss 0.00850981 - time (sec): 1.31 - samples/sec: 3948.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:41:23,773 epoch 9 - iter 26/136 - loss 0.01655706 - time (sec): 3.08 - samples/sec: 3611.56 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:41:25,267 epoch 9 - iter 39/136 - loss 0.01193651 - time (sec): 4.57 - samples/sec: 3492.51 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:41:26,512 epoch 9 - iter 52/136 - loss 0.01210586 - time (sec): 5.82 - samples/sec: 3470.20 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:41:27,734 epoch 9 - iter 65/136 - loss 0.01443710 - time (sec): 7.04 - samples/sec: 3551.11 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:41:29,126 epoch 9 - iter 78/136 - loss 0.01276575 - time (sec): 8.43 - samples/sec: 3597.34 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:41:30,478 epoch 9 - iter 91/136 - loss 0.01220159 - time (sec): 9.78 - samples/sec: 3619.11 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:41:31,831 epoch 9 - iter 104/136 - loss 0.01241011 - time (sec): 11.13 - samples/sec: 3601.88 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:41:33,200 epoch 9 - iter 117/136 - loss 0.01124102 - time (sec): 12.50 - samples/sec: 3620.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:41:34,577 epoch 9 - iter 130/136 - loss 0.01118786 - time (sec): 13.88 - samples/sec: 3620.04 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:41:35,049 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:35,049 EPOCH 9 done: loss 0.0118 - lr: 0.000004
2023-10-17 20:41:36,509 DEV : loss 0.16205447912216187 - f1-score (micro avg) 0.792
2023-10-17 20:41:36,513 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:37,852 epoch 10 - iter 13/136 - loss 0.00911916 - time (sec): 1.34 - samples/sec: 4035.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:41:39,113 epoch 10 - iter 26/136 - loss 0.00708093 - time (sec): 2.60 - samples/sec: 3932.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:41:40,383 epoch 10 - iter 39/136 - loss 0.00680133 - time (sec): 3.87 - samples/sec: 3881.20 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:41:41,593 epoch 10 - iter 52/136 - loss 0.00961504 - time (sec): 5.08 - samples/sec: 3714.43 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:41:43,057 epoch 10 - iter 65/136 - loss 0.00876011 - time (sec): 6.54 - samples/sec: 3690.96 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:41:44,433 epoch 10 - iter 78/136 - loss 0.01025725 - time (sec): 7.92 - samples/sec: 3679.74 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:41:45,670 epoch 10 - iter 91/136 - loss 0.01108971 - time (sec): 9.16 - samples/sec: 3682.38 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:41:47,314 epoch 10 - iter 104/136 - loss 0.01226546 - time (sec): 10.80 - samples/sec: 3660.82 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:41:48,790 epoch 10 - iter 117/136 - loss 0.01149985 - time (sec): 12.28 - samples/sec: 3654.50 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:41:50,022 epoch 10 - iter 130/136 - loss 0.01055171 - time (sec): 13.51 - samples/sec: 3687.81 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:41:50,580 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:50,580 EPOCH 10 done: loss 0.0107 - lr: 0.000000
2023-10-17 20:41:52,064 DEV : loss 0.17041459679603577 - f1-score (micro avg) 0.7898
2023-10-17 20:41:52,460 ----------------------------------------------------------------------------------------------------
2023-10-17 20:41:52,462 Loading model from best epoch ...
2023-10-17 20:41:54,118 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 20:41:56,326
Results:
- F-score (micro) 0.8073
- F-score (macro) 0.7683
- Accuracy 0.6924

By class:
              precision    recall  f1-score   support

         LOC     0.8567    0.8814    0.8689       312
         PER     0.7011    0.8798    0.7804       208
         ORG     0.5254    0.5636    0.5439        55
   HumanProd     0.7857    1.0000    0.8800        22

   micro avg     0.7638    0.8559    0.8073       597
   macro avg     0.7172    0.8312    0.7683       597
weighted avg     0.7694    0.8559    0.8085       597
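
As a quick sanity check, the micro-averaged F1 in the table is simply the harmonic mean of the micro precision and recall reported on the same row:

    # harmonic mean of the micro precision/recall printed above
    p, r = 0.7638, 0.8559
    print(2 * p * r / (p + r))  # ~0.8072, i.e. the reported 0.8073 up to display rounding
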
2023-10-17 20:41:56,326 ----------------------------------------------------------------------------------------------------
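
The checkpoint evaluated above is the best dev-set model (saved after epoch 6, dev micro F1 0.8163). A minimal sketch of loading that best-model.pt and tagging a new sentence is given below; it assumes Flair's default output layout under the base path from this log, and the example sentence is made up.

    # Sketch: load the saved checkpoint and tag a Swedish sentence.
    # Path = base path from this log + Flair's default "best-model.pt" file name.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
    )

    sentence = Sentence("Gustaf Vasa red in i Stockholm .")
    tagger.predict(sentence)

    for span in sentence.get_spans("ner"):
        # spans are decoded from the BIOES tags (S-/B-/I-/E- prefixes) listed above
        label = span.get_label("ner")
        print(span.text, label.value, round(label.score, 3))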