|
2023-10-17 20:08:58,049 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,050 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 20:08:58,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,050 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-17 20:08:58,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,050 Train: 1085 sentences |
|
2023-10-17 20:08:58,050 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 20:08:58,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,050 Training Params: |
|
2023-10-17 20:08:58,050 - learning_rate: "5e-05" |
|
2023-10-17 20:08:58,050 - mini_batch_size: "4" |
|
2023-10-17 20:08:58,050 - max_epochs: "10" |
|
2023-10-17 20:08:58,050 - shuffle: "True" |
|
2023-10-17 20:08:58,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,050 Plugins: |
|
2023-10-17 20:08:58,050 - TensorboardLogger |
|
2023-10-17 20:08:58,050 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 20:08:58,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,050 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 20:08:58,050 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 20:08:58,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,050 Computation: |
|
2023-10-17 20:08:58,050 - compute on device: cuda:0 |
|
2023-10-17 20:08:58,050 - embedding storage: none |
|
2023-10-17 20:08:58,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,051 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-17 20:08:58,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:08:58,051 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 20:08:59,690 epoch 1 - iter 27/272 - loss 3.36464584 - time (sec): 1.64 - samples/sec: 3365.74 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 20:09:01,419 epoch 1 - iter 54/272 - loss 2.61595255 - time (sec): 3.37 - samples/sec: 3463.47 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 20:09:03,073 epoch 1 - iter 81/272 - loss 1.95888601 - time (sec): 5.02 - samples/sec: 3391.46 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 20:09:04,703 epoch 1 - iter 108/272 - loss 1.63527716 - time (sec): 6.65 - samples/sec: 3324.49 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 20:09:06,324 epoch 1 - iter 135/272 - loss 1.40247920 - time (sec): 8.27 - samples/sec: 3279.12 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 20:09:07,837 epoch 1 - iter 162/272 - loss 1.24766159 - time (sec): 9.79 - samples/sec: 3203.87 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 20:09:09,325 epoch 1 - iter 189/272 - loss 1.11908011 - time (sec): 11.27 - samples/sec: 3213.03 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 20:09:10,977 epoch 1 - iter 216/272 - loss 0.97436126 - time (sec): 12.93 - samples/sec: 3284.80 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 20:09:12,421 epoch 1 - iter 243/272 - loss 0.90626581 - time (sec): 14.37 - samples/sec: 3260.16 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 20:09:13,956 epoch 1 - iter 270/272 - loss 0.83943344 - time (sec): 15.90 - samples/sec: 3255.35 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 20:09:14,057 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:09:14,057 EPOCH 1 done: loss 0.8379 - lr: 0.000049 |
|
2023-10-17 20:09:15,186 DEV : loss 0.13628603518009186 - f1-score (micro avg) 0.6761 |
|
2023-10-17 20:09:15,190 saving best model |
|
2023-10-17 20:09:15,566 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:09:17,030 epoch 2 - iter 27/272 - loss 0.18530521 - time (sec): 1.46 - samples/sec: 3299.00 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 20:09:18,617 epoch 2 - iter 54/272 - loss 0.22313282 - time (sec): 3.05 - samples/sec: 3322.10 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 20:09:20,139 epoch 2 - iter 81/272 - loss 0.19202817 - time (sec): 4.57 - samples/sec: 3313.78 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 20:09:21,803 epoch 2 - iter 108/272 - loss 0.17147208 - time (sec): 6.24 - samples/sec: 3183.13 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 20:09:23,468 epoch 2 - iter 135/272 - loss 0.16183630 - time (sec): 7.90 - samples/sec: 3251.91 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 20:09:25,031 epoch 2 - iter 162/272 - loss 0.15586281 - time (sec): 9.46 - samples/sec: 3216.56 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 20:09:26,596 epoch 2 - iter 189/272 - loss 0.14885775 - time (sec): 11.03 - samples/sec: 3216.98 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 20:09:28,246 epoch 2 - iter 216/272 - loss 0.14799766 - time (sec): 12.68 - samples/sec: 3269.07 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 20:09:29,853 epoch 2 - iter 243/272 - loss 0.14386111 - time (sec): 14.29 - samples/sec: 3280.38 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 20:09:31,438 epoch 2 - iter 270/272 - loss 0.13889560 - time (sec): 15.87 - samples/sec: 3253.07 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 20:09:31,579 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:09:31,579 EPOCH 2 done: loss 0.1379 - lr: 0.000045 |
|
2023-10-17 20:09:33,063 DEV : loss 0.09656020253896713 - f1-score (micro avg) 0.7568 |
|
2023-10-17 20:09:33,069 saving best model |
|
2023-10-17 20:09:33,549 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:09:35,102 epoch 3 - iter 27/272 - loss 0.08143101 - time (sec): 1.55 - samples/sec: 3206.66 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 20:09:36,499 epoch 3 - iter 54/272 - loss 0.08721599 - time (sec): 2.94 - samples/sec: 3137.14 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 20:09:38,090 epoch 3 - iter 81/272 - loss 0.10064725 - time (sec): 4.53 - samples/sec: 3203.43 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 20:09:39,623 epoch 3 - iter 108/272 - loss 0.08740114 - time (sec): 6.07 - samples/sec: 3267.36 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 20:09:41,190 epoch 3 - iter 135/272 - loss 0.08333219 - time (sec): 7.63 - samples/sec: 3287.83 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 20:09:42,776 epoch 3 - iter 162/272 - loss 0.08611077 - time (sec): 9.22 - samples/sec: 3277.71 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 20:09:44,312 epoch 3 - iter 189/272 - loss 0.09438552 - time (sec): 10.76 - samples/sec: 3228.93 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 20:09:46,017 epoch 3 - iter 216/272 - loss 0.09169296 - time (sec): 12.46 - samples/sec: 3239.19 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 20:09:47,523 epoch 3 - iter 243/272 - loss 0.09211531 - time (sec): 13.97 - samples/sec: 3251.97 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 20:09:49,243 epoch 3 - iter 270/272 - loss 0.09033732 - time (sec): 15.69 - samples/sec: 3288.37 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 20:09:49,374 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:09:49,374 EPOCH 3 done: loss 0.0903 - lr: 0.000039 |
|
2023-10-17 20:09:50,835 DEV : loss 0.1402185708284378 - f1-score (micro avg) 0.8037 |
|
2023-10-17 20:09:50,840 saving best model |
|
2023-10-17 20:09:51,303 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:09:52,787 epoch 4 - iter 27/272 - loss 0.03350101 - time (sec): 1.48 - samples/sec: 2905.43 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 20:09:54,363 epoch 4 - iter 54/272 - loss 0.04361525 - time (sec): 3.06 - samples/sec: 3126.00 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 20:09:55,995 epoch 4 - iter 81/272 - loss 0.04585125 - time (sec): 4.69 - samples/sec: 3067.31 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 20:09:57,645 epoch 4 - iter 108/272 - loss 0.05242094 - time (sec): 6.34 - samples/sec: 3084.77 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 20:09:59,177 epoch 4 - iter 135/272 - loss 0.04837595 - time (sec): 7.87 - samples/sec: 3158.44 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 20:10:00,892 epoch 4 - iter 162/272 - loss 0.05957811 - time (sec): 9.59 - samples/sec: 3182.84 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 20:10:02,471 epoch 4 - iter 189/272 - loss 0.06049789 - time (sec): 11.17 - samples/sec: 3189.41 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 20:10:04,095 epoch 4 - iter 216/272 - loss 0.05601630 - time (sec): 12.79 - samples/sec: 3204.40 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 20:10:05,705 epoch 4 - iter 243/272 - loss 0.05920188 - time (sec): 14.40 - samples/sec: 3254.20 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 20:10:07,220 epoch 4 - iter 270/272 - loss 0.05626618 - time (sec): 15.91 - samples/sec: 3247.34 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 20:10:07,353 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:10:07,353 EPOCH 4 done: loss 0.0559 - lr: 0.000033 |
|
2023-10-17 20:10:08,860 DEV : loss 0.12993647158145905 - f1-score (micro avg) 0.7764 |
|
2023-10-17 20:10:08,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:10:10,472 epoch 5 - iter 27/272 - loss 0.03478292 - time (sec): 1.61 - samples/sec: 3520.10 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 20:10:11,897 epoch 5 - iter 54/272 - loss 0.02706579 - time (sec): 3.03 - samples/sec: 3367.24 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 20:10:13,415 epoch 5 - iter 81/272 - loss 0.02798956 - time (sec): 4.55 - samples/sec: 3180.58 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 20:10:15,001 epoch 5 - iter 108/272 - loss 0.03053647 - time (sec): 6.13 - samples/sec: 3282.15 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 20:10:16,959 epoch 5 - iter 135/272 - loss 0.02892008 - time (sec): 8.09 - samples/sec: 3216.38 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 20:10:18,617 epoch 5 - iter 162/272 - loss 0.02651766 - time (sec): 9.75 - samples/sec: 3251.68 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 20:10:20,088 epoch 5 - iter 189/272 - loss 0.03505289 - time (sec): 11.22 - samples/sec: 3225.98 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 20:10:21,653 epoch 5 - iter 216/272 - loss 0.03419520 - time (sec): 12.79 - samples/sec: 3246.17 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 20:10:23,263 epoch 5 - iter 243/272 - loss 0.03527206 - time (sec): 14.40 - samples/sec: 3243.33 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 20:10:24,803 epoch 5 - iter 270/272 - loss 0.03558909 - time (sec): 15.94 - samples/sec: 3242.08 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 20:10:24,892 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:10:24,892 EPOCH 5 done: loss 0.0355 - lr: 0.000028 |
|
2023-10-17 20:10:26,382 DEV : loss 0.136720672249794 - f1-score (micro avg) 0.7993 |
|
2023-10-17 20:10:26,387 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:10:28,006 epoch 6 - iter 27/272 - loss 0.02355395 - time (sec): 1.62 - samples/sec: 3230.24 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 20:10:29,528 epoch 6 - iter 54/272 - loss 0.02468085 - time (sec): 3.14 - samples/sec: 3305.14 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 20:10:30,968 epoch 6 - iter 81/272 - loss 0.03144277 - time (sec): 4.58 - samples/sec: 3281.17 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 20:10:32,450 epoch 6 - iter 108/272 - loss 0.02540633 - time (sec): 6.06 - samples/sec: 3286.01 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 20:10:34,164 epoch 6 - iter 135/272 - loss 0.02367505 - time (sec): 7.78 - samples/sec: 3307.08 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 20:10:35,616 epoch 6 - iter 162/272 - loss 0.02246197 - time (sec): 9.23 - samples/sec: 3259.57 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 20:10:37,137 epoch 6 - iter 189/272 - loss 0.02522335 - time (sec): 10.75 - samples/sec: 3246.53 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 20:10:38,586 epoch 6 - iter 216/272 - loss 0.02532442 - time (sec): 12.20 - samples/sec: 3228.14 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 20:10:40,398 epoch 6 - iter 243/272 - loss 0.02338905 - time (sec): 14.01 - samples/sec: 3288.59 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 20:10:42,014 epoch 6 - iter 270/272 - loss 0.02383395 - time (sec): 15.63 - samples/sec: 3316.69 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 20:10:42,097 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:10:42,097 EPOCH 6 done: loss 0.0238 - lr: 0.000022 |
|
2023-10-17 20:10:43,661 DEV : loss 0.17291073501110077 - f1-score (micro avg) 0.8148 |
|
2023-10-17 20:10:43,668 saving best model |
|
2023-10-17 20:10:44,141 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:10:45,812 epoch 7 - iter 27/272 - loss 0.01466535 - time (sec): 1.67 - samples/sec: 3282.96 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 20:10:47,415 epoch 7 - iter 54/272 - loss 0.01384818 - time (sec): 3.27 - samples/sec: 3190.91 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 20:10:48,915 epoch 7 - iter 81/272 - loss 0.01364182 - time (sec): 4.77 - samples/sec: 3181.97 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 20:10:50,493 epoch 7 - iter 108/272 - loss 0.01625888 - time (sec): 6.35 - samples/sec: 3245.07 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 20:10:52,201 epoch 7 - iter 135/272 - loss 0.01518591 - time (sec): 8.06 - samples/sec: 3248.37 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 20:10:53,716 epoch 7 - iter 162/272 - loss 0.01649649 - time (sec): 9.57 - samples/sec: 3239.36 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 20:10:55,328 epoch 7 - iter 189/272 - loss 0.01508126 - time (sec): 11.19 - samples/sec: 3248.95 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 20:10:56,849 epoch 7 - iter 216/272 - loss 0.01466722 - time (sec): 12.71 - samples/sec: 3209.15 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 20:10:58,355 epoch 7 - iter 243/272 - loss 0.01386952 - time (sec): 14.21 - samples/sec: 3239.70 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 20:10:59,915 epoch 7 - iter 270/272 - loss 0.01470419 - time (sec): 15.77 - samples/sec: 3271.80 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 20:11:00,019 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:00,020 EPOCH 7 done: loss 0.0146 - lr: 0.000017 |
|
2023-10-17 20:11:01,497 DEV : loss 0.16411031782627106 - f1-score (micro avg) 0.8 |
|
2023-10-17 20:11:01,501 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:03,057 epoch 8 - iter 27/272 - loss 0.00697671 - time (sec): 1.55 - samples/sec: 3181.98 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 20:11:04,691 epoch 8 - iter 54/272 - loss 0.01218680 - time (sec): 3.19 - samples/sec: 3242.34 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 20:11:06,212 epoch 8 - iter 81/272 - loss 0.01160361 - time (sec): 4.71 - samples/sec: 3285.46 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 20:11:07,797 epoch 8 - iter 108/272 - loss 0.01003810 - time (sec): 6.29 - samples/sec: 3276.76 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 20:11:09,511 epoch 8 - iter 135/272 - loss 0.01334468 - time (sec): 8.01 - samples/sec: 3280.99 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 20:11:11,084 epoch 8 - iter 162/272 - loss 0.01311487 - time (sec): 9.58 - samples/sec: 3242.08 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 20:11:12,690 epoch 8 - iter 189/272 - loss 0.01245178 - time (sec): 11.19 - samples/sec: 3292.10 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 20:11:14,249 epoch 8 - iter 216/272 - loss 0.01433968 - time (sec): 12.75 - samples/sec: 3265.29 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 20:11:15,689 epoch 8 - iter 243/272 - loss 0.01354159 - time (sec): 14.19 - samples/sec: 3276.62 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 20:11:17,283 epoch 8 - iter 270/272 - loss 0.01280139 - time (sec): 15.78 - samples/sec: 3281.26 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 20:11:17,382 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:17,382 EPOCH 8 done: loss 0.0128 - lr: 0.000011 |
|
2023-10-17 20:11:18,870 DEV : loss 0.16361746191978455 - f1-score (micro avg) 0.822 |
|
2023-10-17 20:11:18,875 saving best model |
|
2023-10-17 20:11:19,354 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:21,085 epoch 9 - iter 27/272 - loss 0.00469121 - time (sec): 1.73 - samples/sec: 2992.63 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 20:11:22,734 epoch 9 - iter 54/272 - loss 0.00351451 - time (sec): 3.38 - samples/sec: 3100.53 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 20:11:24,509 epoch 9 - iter 81/272 - loss 0.00808982 - time (sec): 5.15 - samples/sec: 3034.33 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 20:11:25,936 epoch 9 - iter 108/272 - loss 0.00708092 - time (sec): 6.58 - samples/sec: 2983.60 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 20:11:27,522 epoch 9 - iter 135/272 - loss 0.00600108 - time (sec): 8.17 - samples/sec: 3035.43 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 20:11:29,194 epoch 9 - iter 162/272 - loss 0.00553079 - time (sec): 9.84 - samples/sec: 3023.78 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 20:11:30,953 epoch 9 - iter 189/272 - loss 0.00725806 - time (sec): 11.60 - samples/sec: 3050.45 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 20:11:32,445 epoch 9 - iter 216/272 - loss 0.00887487 - time (sec): 13.09 - samples/sec: 3069.42 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 20:11:34,186 epoch 9 - iter 243/272 - loss 0.00894166 - time (sec): 14.83 - samples/sec: 3124.23 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 20:11:35,757 epoch 9 - iter 270/272 - loss 0.00811046 - time (sec): 16.40 - samples/sec: 3158.99 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 20:11:35,851 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:35,851 EPOCH 9 done: loss 0.0081 - lr: 0.000006 |
|
2023-10-17 20:11:37,339 DEV : loss 0.16167020797729492 - f1-score (micro avg) 0.8352 |
|
2023-10-17 20:11:37,344 saving best model |
|
2023-10-17 20:11:37,839 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:39,378 epoch 10 - iter 27/272 - loss 0.01637510 - time (sec): 1.54 - samples/sec: 2953.62 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 20:11:40,845 epoch 10 - iter 54/272 - loss 0.01175067 - time (sec): 3.00 - samples/sec: 2932.84 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 20:11:42,291 epoch 10 - iter 81/272 - loss 0.00774745 - time (sec): 4.45 - samples/sec: 3017.02 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 20:11:43,912 epoch 10 - iter 108/272 - loss 0.00814612 - time (sec): 6.07 - samples/sec: 3119.22 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 20:11:45,416 epoch 10 - iter 135/272 - loss 0.00670909 - time (sec): 7.57 - samples/sec: 3130.98 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 20:11:47,004 epoch 10 - iter 162/272 - loss 0.00603799 - time (sec): 9.16 - samples/sec: 3210.96 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 20:11:48,623 epoch 10 - iter 189/272 - loss 0.00595688 - time (sec): 10.78 - samples/sec: 3241.35 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 20:11:50,195 epoch 10 - iter 216/272 - loss 0.00624265 - time (sec): 12.35 - samples/sec: 3247.62 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 20:11:51,948 epoch 10 - iter 243/272 - loss 0.00609270 - time (sec): 14.11 - samples/sec: 3290.21 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 20:11:53,598 epoch 10 - iter 270/272 - loss 0.00548331 - time (sec): 15.76 - samples/sec: 3285.22 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 20:11:53,688 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:53,688 EPOCH 10 done: loss 0.0055 - lr: 0.000000 |
|
2023-10-17 20:11:55,237 DEV : loss 0.16244861483573914 - f1-score (micro avg) 0.8253 |
|
2023-10-17 20:11:55,688 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 20:11:55,689 Loading model from best epoch ... |
|
2023-10-17 20:11:57,273 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-17 20:11:59,397 |
|
Results: |
|
- F-score (micro) 0.7971 |
|
- F-score (macro) 0.7775 |
|
- Accuracy 0.6763 |
|
|
|
By class: |
|
              precision    recall  f1-score   support

         LOC     0.8442    0.8333    0.8387       312
         PER     0.6962    0.8702    0.7735       208
         ORG     0.6667    0.5455    0.6000        55
   HumanProd     0.8148    1.0000    0.8980        22

   micro avg     0.7703    0.8258    0.7971       597
   macro avg     0.7554    0.8122    0.7775       597
weighted avg     0.7752    0.8258    0.7962       597
|
|
|
2023-10-17 20:11:59,397 ---------------------------------------------------------------------------------------------------- |
|
|