2023-10-17 20:42:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
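The module printout above fully determines the model's parameter count, so the sizes can be tallied as a sanity check. The sketch below mirrors the layer names in the printout and assumes the standard ELECTRA-base layout it shows (dropout and the LayerNorm eps carry no parameters; LockedDropout is parameter-free):

```python
# Tally parameters from the module sizes shown in the printout above.
H, FFN, LAYERS, VOCAB, POS, TYPES = 768, 3072, 12, 32001, 512, 2

def linear(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix + bias vector

def layer_norm(n):
    return 2 * n  # gamma + beta

embeddings = (VOCAB * H) + (POS * H) + (TYPES * H) + layer_norm(H)

per_layer = (
    3 * linear(H, H)   # query, key, value projections
    + linear(H, H)     # attention output dense
    + layer_norm(H)    # attention output LayerNorm
    + linear(H, FFN)   # intermediate dense
    + linear(FFN, H)   # output dense
    + layer_norm(H)    # output LayerNorm
)

electra_total = embeddings + LAYERS * per_layer
tagger_head = linear(H, 17)  # the (linear) classification layer: 17 tags

print(electra_total)  # 110027520 parameters in the encoder stack
print(tagger_head)    # 13073 parameters in the tagging head
```

About 110M parameters in the backbone, which is consistent with an ELECTRA-base discriminator; the tagging head on top is comparatively tiny.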
|
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Train:  1085 sentences
2023-10-17 20:42:20,537         (train_with_dev=False, train_with_test=False)
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Training Params:
2023-10-17 20:42:20,537  - learning_rate: "5e-05"
2023-10-17 20:42:20,537  - mini_batch_size: "8"
2023-10-17 20:42:20,537  - max_epochs: "10"
2023-10-17 20:42:20,537  - shuffle: "True"
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Plugins:
2023-10-17 20:42:20,537  - TensorboardLogger
2023-10-17 20:42:20,537  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
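The LinearScheduler plugin with warmup_fraction '0.1' explains the lr column in the iteration lines below: the rate climbs linearly toward the base 5e-05 over the first 10% of the 1360 total steps (10 epochs × 136 mini-batches), then decays linearly to zero. A minimal sketch of that schedule follows; the function name and step bookkeeping are illustrative rather than Flair's internals, and the logged values may differ in the last digit due to rounding:

```python
def linear_warmup_lr(step, total_steps, base_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

TOTAL = 10 * 136  # 10 epochs x 136 mini-batches

print(linear_warmup_lr(13, TOTAL))    # early in warmup, just under 5e-06
print(linear_warmup_lr(136, TOTAL))   # warmup complete: 5e-05
print(linear_warmup_lr(1360, TOTAL))  # final step: 0.0
```

This matches the trace below: lr rises through epoch 1, peaks around the base rate at the epoch boundary, and has decayed to 0.000000 by the last iterations of epoch 10.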
|
2023-10-17 20:42:20,537 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:42:20,537  - metric: "('micro avg', 'f1-score')"
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Computation:
2023-10-17 20:42:20,537  - compute on device: cuda:0
2023-10-17 20:42:20,537  - embedding storage: none
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,538 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 20:42:20,538 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,538 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,538 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-17 20:42:21,708 epoch 1 - iter 13/136 - loss 3.61948799 - time (sec): 1.17 - samples/sec: 4249.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:42:23,195 epoch 1 - iter 26/136 - loss 3.13237506 - time (sec): 2.66 - samples/sec: 3779.78 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:42:24,833 epoch 1 - iter 39/136 - loss 2.34189086 - time (sec): 4.29 - samples/sec: 3817.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:42:26,155 epoch 1 - iter 52/136 - loss 1.92616474 - time (sec): 5.62 - samples/sec: 3869.56 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:42:27,501 epoch 1 - iter 65/136 - loss 1.70538327 - time (sec): 6.96 - samples/sec: 3747.85 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:42:28,954 epoch 1 - iter 78/136 - loss 1.51058071 - time (sec): 8.42 - samples/sec: 3664.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:42:30,242 epoch 1 - iter 91/136 - loss 1.36273296 - time (sec): 9.70 - samples/sec: 3634.79 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:42:31,421 epoch 1 - iter 104/136 - loss 1.23429361 - time (sec): 10.88 - samples/sec: 3663.48 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:42:32,684 epoch 1 - iter 117/136 - loss 1.12489260 - time (sec): 12.15 - samples/sec: 3684.80 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:42:34,173 epoch 1 - iter 130/136 - loss 1.01914801 - time (sec): 13.63 - samples/sec: 3691.10 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:42:34,636 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:34,636 EPOCH 1 done: loss 0.9934 - lr: 0.000047
2023-10-17 20:42:35,690 DEV : loss 0.17961926758289337 - f1-score (micro avg)  0.5662
2023-10-17 20:42:35,695 saving best model
2023-10-17 20:42:36,051 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:42:37,254 epoch 2 - iter 13/136 - loss 0.15228697 - time (sec): 1.20 - samples/sec: 3859.37 - lr: 0.000050 - momentum: 0.000000
2023-10-17 20:42:38,649 epoch 2 - iter 26/136 - loss 0.16566093 - time (sec): 2.60 - samples/sec: 3461.64 - lr: 0.000049 - momentum: 0.000000
2023-10-17 20:42:40,011 epoch 2 - iter 39/136 - loss 0.17398233 - time (sec): 3.96 - samples/sec: 3553.30 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:42:41,406 epoch 2 - iter 52/136 - loss 0.16043159 - time (sec): 5.35 - samples/sec: 3638.07 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:42:42,617 epoch 2 - iter 65/136 - loss 0.15531100 - time (sec): 6.56 - samples/sec: 3667.80 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:42:44,001 epoch 2 - iter 78/136 - loss 0.14560993 - time (sec): 7.95 - samples/sec: 3688.02 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:42:45,411 epoch 2 - iter 91/136 - loss 0.15781392 - time (sec): 9.36 - samples/sec: 3668.01 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:42:46,561 epoch 2 - iter 104/136 - loss 0.15797681 - time (sec): 10.51 - samples/sec: 3665.01 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:42:48,140 epoch 2 - iter 117/136 - loss 0.15530704 - time (sec): 12.09 - samples/sec: 3654.54 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:42:49,877 epoch 2 - iter 130/136 - loss 0.15218247 - time (sec): 13.82 - samples/sec: 3627.87 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:42:50,416 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:50,416 EPOCH 2 done: loss 0.1500 - lr: 0.000045
2023-10-17 20:42:51,874 DEV : loss 0.11248722672462463 - f1-score (micro avg)  0.7532
2023-10-17 20:42:51,879 saving best model
2023-10-17 20:42:52,359 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:42:53,518 epoch 3 - iter 13/136 - loss 0.10254776 - time (sec): 1.16 - samples/sec: 3639.39 - lr: 0.000044 - momentum: 0.000000
2023-10-17 20:42:54,973 epoch 3 - iter 26/136 - loss 0.08535292 - time (sec): 2.61 - samples/sec: 3632.11 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:42:56,426 epoch 3 - iter 39/136 - loss 0.08605534 - time (sec): 4.07 - samples/sec: 3592.38 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:42:57,652 epoch 3 - iter 52/136 - loss 0.08805134 - time (sec): 5.29 - samples/sec: 3651.61 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:42:59,085 epoch 3 - iter 65/136 - loss 0.09492906 - time (sec): 6.72 - samples/sec: 3646.33 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:43:00,479 epoch 3 - iter 78/136 - loss 0.09268581 - time (sec): 8.12 - samples/sec: 3714.31 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:43:01,836 epoch 3 - iter 91/136 - loss 0.09202451 - time (sec): 9.48 - samples/sec: 3675.97 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:43:03,179 epoch 3 - iter 104/136 - loss 0.08747935 - time (sec): 10.82 - samples/sec: 3663.44 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:43:04,878 epoch 3 - iter 117/136 - loss 0.08598807 - time (sec): 12.52 - samples/sec: 3638.22 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:43:06,097 epoch 3 - iter 130/136 - loss 0.08406916 - time (sec): 13.74 - samples/sec: 3649.73 - lr: 0.000039 - momentum: 0.000000
2023-10-17 20:43:06,660 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:06,660 EPOCH 3 done: loss 0.0864 - lr: 0.000039
2023-10-17 20:43:08,273 DEV : loss 0.10026960074901581 - f1-score (micro avg)  0.776
2023-10-17 20:43:08,278 saving best model
2023-10-17 20:43:08,718 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:43:10,225 epoch 4 - iter 13/136 - loss 0.04229442 - time (sec): 1.51 - samples/sec: 3161.21 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:43:11,715 epoch 4 - iter 26/136 - loss 0.05020601 - time (sec): 3.00 - samples/sec: 3249.66 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:43:13,276 epoch 4 - iter 39/136 - loss 0.04768072 - time (sec): 4.56 - samples/sec: 3218.31 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:43:14,752 epoch 4 - iter 52/136 - loss 0.04433361 - time (sec): 6.03 - samples/sec: 3268.99 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:43:16,373 epoch 4 - iter 65/136 - loss 0.04410039 - time (sec): 7.65 - samples/sec: 3356.66 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:43:17,660 epoch 4 - iter 78/136 - loss 0.04603029 - time (sec): 8.94 - samples/sec: 3427.50 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:43:19,118 epoch 4 - iter 91/136 - loss 0.04570230 - time (sec): 10.40 - samples/sec: 3411.83 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:43:20,427 epoch 4 - iter 104/136 - loss 0.05176775 - time (sec): 11.71 - samples/sec: 3444.17 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:43:21,663 epoch 4 - iter 117/136 - loss 0.05143245 - time (sec): 12.94 - samples/sec: 3480.37 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:43:23,091 epoch 4 - iter 130/136 - loss 0.04955538 - time (sec): 14.37 - samples/sec: 3462.37 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:43:23,671 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:23,671 EPOCH 4 done: loss 0.0496 - lr: 0.000034
2023-10-17 20:43:25,138 DEV : loss 0.11402004957199097 - f1-score (micro avg)  0.792
2023-10-17 20:43:25,144 saving best model
2023-10-17 20:43:25,611 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:43:26,954 epoch 5 - iter 13/136 - loss 0.03063410 - time (sec): 1.32 - samples/sec: 4249.37 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:43:28,426 epoch 5 - iter 26/136 - loss 0.02995518 - time (sec): 2.80 - samples/sec: 4003.21 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:43:29,698 epoch 5 - iter 39/136 - loss 0.02782725 - time (sec): 4.07 - samples/sec: 3852.73 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:43:31,184 epoch 5 - iter 52/136 - loss 0.02730023 - time (sec): 5.55 - samples/sec: 3800.64 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:43:32,669 epoch 5 - iter 65/136 - loss 0.03436385 - time (sec): 7.04 - samples/sec: 3746.35 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:43:33,900 epoch 5 - iter 78/136 - loss 0.03270278 - time (sec): 8.27 - samples/sec: 3724.77 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:43:35,271 epoch 5 - iter 91/136 - loss 0.03166123 - time (sec): 9.64 - samples/sec: 3692.33 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:43:36,407 epoch 5 - iter 104/136 - loss 0.03261930 - time (sec): 10.78 - samples/sec: 3722.45 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:43:37,855 epoch 5 - iter 117/136 - loss 0.03205863 - time (sec): 12.23 - samples/sec: 3715.32 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:43:39,189 epoch 5 - iter 130/136 - loss 0.03339591 - time (sec): 13.56 - samples/sec: 3701.65 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:43:39,701 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:39,701 EPOCH 5 done: loss 0.0349 - lr: 0.000028
2023-10-17 20:43:41,161 DEV : loss 0.1294308602809906 - f1-score (micro avg)  0.8029
2023-10-17 20:43:41,171 saving best model
2023-10-17 20:43:41,686 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:43:43,361 epoch 6 - iter 13/136 - loss 0.02234781 - time (sec): 1.67 - samples/sec: 3017.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:43:44,615 epoch 6 - iter 26/136 - loss 0.03157132 - time (sec): 2.93 - samples/sec: 3241.45 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:43:45,968 epoch 6 - iter 39/136 - loss 0.02896451 - time (sec): 4.28 - samples/sec: 3419.38 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:43:47,486 epoch 6 - iter 52/136 - loss 0.02436989 - time (sec): 5.80 - samples/sec: 3419.02 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:43:48,962 epoch 6 - iter 65/136 - loss 0.02415456 - time (sec): 7.27 - samples/sec: 3440.57 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:43:50,588 epoch 6 - iter 78/136 - loss 0.02331112 - time (sec): 8.90 - samples/sec: 3451.15 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:43:51,950 epoch 6 - iter 91/136 - loss 0.02208564 - time (sec): 10.26 - samples/sec: 3438.57 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:43:53,304 epoch 6 - iter 104/136 - loss 0.02410552 - time (sec): 11.62 - samples/sec: 3447.65 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:43:54,810 epoch 6 - iter 117/136 - loss 0.02282725 - time (sec): 13.12 - samples/sec: 3473.11 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:43:56,039 epoch 6 - iter 130/136 - loss 0.02215833 - time (sec): 14.35 - samples/sec: 3479.14 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:43:56,545 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:56,545 EPOCH 6 done: loss 0.0227 - lr: 0.000023
2023-10-17 20:43:58,074 DEV : loss 0.13876760005950928 - f1-score (micro avg)  0.7971
2023-10-17 20:43:58,080 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:43:59,378 epoch 7 - iter 13/136 - loss 0.00755380 - time (sec): 1.30 - samples/sec: 3785.36 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:44:00,973 epoch 7 - iter 26/136 - loss 0.00992484 - time (sec): 2.89 - samples/sec: 3820.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:44:02,332 epoch 7 - iter 39/136 - loss 0.01261973 - time (sec): 4.25 - samples/sec: 3610.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:44:03,564 epoch 7 - iter 52/136 - loss 0.01262043 - time (sec): 5.48 - samples/sec: 3560.56 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:44:04,979 epoch 7 - iter 65/136 - loss 0.01417440 - time (sec): 6.90 - samples/sec: 3630.20 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:44:06,376 epoch 7 - iter 78/136 - loss 0.01699668 - time (sec): 8.29 - samples/sec: 3692.79 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:44:07,667 epoch 7 - iter 91/136 - loss 0.01622148 - time (sec): 9.59 - samples/sec: 3717.77 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:44:09,054 epoch 7 - iter 104/136 - loss 0.01831688 - time (sec): 10.97 - samples/sec: 3672.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:44:10,463 epoch 7 - iter 117/136 - loss 0.01805275 - time (sec): 12.38 - samples/sec: 3660.48 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:44:12,008 epoch 7 - iter 130/136 - loss 0.01673109 - time (sec): 13.93 - samples/sec: 3620.85 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:44:12,549 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:12,549 EPOCH 7 done: loss 0.0172 - lr: 0.000017
2023-10-17 20:44:14,050 DEV : loss 0.135068878531456 - f1-score (micro avg)  0.8007
2023-10-17 20:44:14,055 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:44:15,400 epoch 8 - iter 13/136 - loss 0.02534486 - time (sec): 1.34 - samples/sec: 3544.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:44:17,001 epoch 8 - iter 26/136 - loss 0.01371569 - time (sec): 2.94 - samples/sec: 3276.99 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:44:18,363 epoch 8 - iter 39/136 - loss 0.01519439 - time (sec): 4.31 - samples/sec: 3316.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:44:19,814 epoch 8 - iter 52/136 - loss 0.01421570 - time (sec): 5.76 - samples/sec: 3334.48 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:44:21,129 epoch 8 - iter 65/136 - loss 0.01455442 - time (sec): 7.07 - samples/sec: 3396.61 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:44:22,934 epoch 8 - iter 78/136 - loss 0.01281225 - time (sec): 8.88 - samples/sec: 3406.72 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:44:24,198 epoch 8 - iter 91/136 - loss 0.01236698 - time (sec): 10.14 - samples/sec: 3450.95 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:44:25,545 epoch 8 - iter 104/136 - loss 0.01253968 - time (sec): 11.49 - samples/sec: 3507.74 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:44:26,849 epoch 8 - iter 117/136 - loss 0.01184043 - time (sec): 12.79 - samples/sec: 3494.93 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:44:28,189 epoch 8 - iter 130/136 - loss 0.01190958 - time (sec): 14.13 - samples/sec: 3542.44 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:44:28,676 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:28,676 EPOCH 8 done: loss 0.0115 - lr: 0.000012
2023-10-17 20:44:30,188 DEV : loss 0.15049846470355988 - f1-score (micro avg)  0.8125
2023-10-17 20:44:30,196 saving best model
2023-10-17 20:44:30,754 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:44:32,109 epoch 9 - iter 13/136 - loss 0.00518564 - time (sec): 1.35 - samples/sec: 3834.24 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:44:33,708 epoch 9 - iter 26/136 - loss 0.01160391 - time (sec): 2.95 - samples/sec: 3762.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:44:35,210 epoch 9 - iter 39/136 - loss 0.00814467 - time (sec): 4.45 - samples/sec: 3583.25 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:44:36,459 epoch 9 - iter 52/136 - loss 0.00815978 - time (sec): 5.70 - samples/sec: 3538.25 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:44:37,693 epoch 9 - iter 65/136 - loss 0.01000613 - time (sec): 6.94 - samples/sec: 3602.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:44:39,091 epoch 9 - iter 78/136 - loss 0.00866915 - time (sec): 8.33 - samples/sec: 3638.27 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:44:40,432 epoch 9 - iter 91/136 - loss 0.00783014 - time (sec): 9.68 - samples/sec: 3658.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:44:41,715 epoch 9 - iter 104/136 - loss 0.00795667 - time (sec): 10.96 - samples/sec: 3659.40 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:44:43,025 epoch 9 - iter 117/136 - loss 0.00776984 - time (sec): 12.27 - samples/sec: 3689.62 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:44:44,402 epoch 9 - iter 130/136 - loss 0.00793713 - time (sec): 13.65 - samples/sec: 3682.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:44:44,873 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:44,873 EPOCH 9 done: loss 0.0083 - lr: 0.000006
2023-10-17 20:44:46,356 DEV : loss 0.1609845757484436 - f1-score (micro avg)  0.8044
2023-10-17 20:44:46,362 ----------------------------------------------------------------------------------------------------
|
2023-10-17 20:44:47,715 epoch 10 - iter 13/136 - loss 0.00546006 - time (sec): 1.35 - samples/sec: 3993.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:44:49,007 epoch 10 - iter 26/136 - loss 0.00322110 - time (sec): 2.64 - samples/sec: 3865.19 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:44:50,499 epoch 10 - iter 39/136 - loss 0.00273811 - time (sec): 4.14 - samples/sec: 3629.86 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:44:51,704 epoch 10 - iter 52/136 - loss 0.00447940 - time (sec): 5.34 - samples/sec: 3532.07 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:44:53,162 epoch 10 - iter 65/136 - loss 0.00412880 - time (sec): 6.80 - samples/sec: 3551.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:44:54,559 epoch 10 - iter 78/136 - loss 0.00483931 - time (sec): 8.20 - samples/sec: 3554.87 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:44:55,825 epoch 10 - iter 91/136 - loss 0.00597590 - time (sec): 9.46 - samples/sec: 3563.30 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:44:57,492 epoch 10 - iter 104/136 - loss 0.00684780 - time (sec): 11.13 - samples/sec: 3552.53 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:44:58,987 epoch 10 - iter 117/136 - loss 0.00648974 - time (sec): 12.62 - samples/sec: 3553.46 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:45:00,225 epoch 10 - iter 130/136 - loss 0.00586892 - time (sec): 13.86 - samples/sec: 3593.40 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:45:00,771 ----------------------------------------------------------------------------------------------------
2023-10-17 20:45:00,771 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 20:45:02,256 DEV : loss 0.1766374260187149 - f1-score (micro avg)  0.8
2023-10-17 20:45:02,612 ----------------------------------------------------------------------------------------------------
2023-10-17 20:45:02,613 Loading model from best epoch ...
2023-10-17 20:45:04,070 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
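The 17-tag dictionary is O plus the four entity types (LOC, PER, HumanProd, ORG), each with the four BIOES prefixes: S- for single-token entities and B-/I-/E- for the begin, inside, and end of multi-token spans. As an illustration of how such a sequence maps to entity spans (a hypothetical helper for well-formed sequences, not Flair's own span extraction):

```python
def bioes_to_spans(tags):
    """Decode a well-formed BIOES tag sequence into (type, start, end) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                         # single-token entity
            spans.append((label, i, i))
        elif prefix == "B":                       # span opens here
            start = i
        elif prefix == "E" and start is not None: # span closes here
            spans.append((label, start, i))
            start = None
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O"]))
# [('LOC', 1, 1), ('PER', 2, 4)]
```

These decoded spans are what the entity-level precision, recall, and F1 in the evaluation below are computed over.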
|
2023-10-17 20:45:06,110
Results:
- F-score (micro) 0.8003
- F-score (macro) 0.7578
- Accuracy 0.6851

By class:
              precision    recall  f1-score   support

         LOC     0.8081    0.8910    0.8476       312
         PER     0.7126    0.8702    0.7835       208
         ORG     0.5957    0.5091    0.5490        55
   HumanProd     0.8000    0.9091    0.8511        22

   micro avg     0.7567    0.8492    0.8003       597
   macro avg     0.7291    0.7948    0.7578       597
weighted avg     0.7550    0.8492    0.7979       597

2023-10-17 20:45:06,110 ----------------------------------------------------------------------------------------------------
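Each f1-score in the report above is the harmonic mean of the precision and recall on the same row, so the headline micro F-score can be reproduced directly from the micro-avg row (a quick consistency check, rounding to four decimals as the report does):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row from the table: precision 0.7567, recall 0.8492
print(round(f1(0.7567, 0.8492), 4))  # 0.8003, matching the reported micro F-score
```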
|