2023-10-17 11:07:44,037 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,040 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 11:07:44,040 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,040 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 11:07:44,040 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,041 Train: 20847 sentences
2023-10-17 11:07:44,041 (train_with_dev=False, train_with_test=False)
2023-10-17 11:07:44,041 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,041 Training Params:
2023-10-17 11:07:44,041 - learning_rate: "3e-05"
2023-10-17 11:07:44,041 - mini_batch_size: "4"
2023-10-17 11:07:44,041 - max_epochs: "10"
2023-10-17 11:07:44,041 - shuffle: "True"
2023-10-17 11:07:44,041 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,041 Plugins:
2023-10-17 11:07:44,041 - TensorboardLogger
2023-10-17 11:07:44,042 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:07:44,042 ----------------------------------------------------------------------------------------------------
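A note on the `LinearScheduler | warmup_fraction: '0.1'` plugin: this corresponds to a learning rate that warms up linearly over the first 10% of all optimizer steps to the peak `learning_rate` of 3e-05, then decays linearly to zero. The per-iteration `lr` values in the log below are consistent with that reading. A minimal sketch (the helper name `linear_lr` is hypothetical, not Flair's implementation):

```python
def linear_lr(step: int, total_steps: int, peak_lr: float = 3e-5,
              warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr over warmup_fraction of steps, then linear decay to 0."""
    warmup_steps = int(warmup_fraction * total_steps)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # warmup phase
    # decay phase: linearly down to zero at total_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# This run: 10 epochs x 5212 batches = 52120 optimizer steps
total = 10 * 5212
print(linear_lr(521, total))   # ≈ 3e-06, as logged at epoch 1, iter 521
print(linear_lr(5212, total))  # peak 3e-05, reached after the first epoch
```

With 10 epochs and a warmup fraction of 0.1, the warmup occupies exactly the first epoch, which is why epoch 1 shows `lr` climbing from 0.000003 to 0.000030 and later epochs show it decaying.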
2023-10-17 11:07:44,042 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:07:44,042 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:07:44,042 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,042 Computation:
2023-10-17 11:07:44,042 - compute on device: cuda:0
2023-10-17 11:07:44,042 - embedding storage: none
2023-10-17 11:07:44,042 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,042 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 11:07:44,042 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,043 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:44,043 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 11:08:26,363 epoch 1 - iter 521/5212 - loss 1.89163776 - time (sec): 42.32 - samples/sec: 798.01 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:09:08,855 epoch 1 - iter 1042/5212 - loss 1.13479704 - time (sec): 84.81 - samples/sec: 817.76 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:09:54,477 epoch 1 - iter 1563/5212 - loss 0.84671118 - time (sec): 130.43 - samples/sec: 826.16 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:10:37,574 epoch 1 - iter 2084/5212 - loss 0.69980888 - time (sec): 173.53 - samples/sec: 838.22 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:11:21,695 epoch 1 - iter 2605/5212 - loss 0.60610850 - time (sec): 217.65 - samples/sec: 848.09 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:12:04,464 epoch 1 - iter 3126/5212 - loss 0.54043037 - time (sec): 260.42 - samples/sec: 857.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:12:47,850 epoch 1 - iter 3647/5212 - loss 0.49949347 - time (sec): 303.81 - samples/sec: 851.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:13:31,805 epoch 1 - iter 4168/5212 - loss 0.46823322 - time (sec): 347.76 - samples/sec: 842.57 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:14:16,895 epoch 1 - iter 4689/5212 - loss 0.43915469 - time (sec): 392.85 - samples/sec: 842.42 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:15:00,022 epoch 1 - iter 5210/5212 - loss 0.41555785 - time (sec): 435.98 - samples/sec: 842.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:15:00,190 ----------------------------------------------------------------------------------------------------
2023-10-17 11:15:00,190 EPOCH 1 done: loss 0.4153 - lr: 0.000030
2023-10-17 11:15:07,630 DEV : loss 0.11844930797815323 - f1-score (micro avg) 0.2469
2023-10-17 11:15:07,684 saving best model
2023-10-17 11:15:08,240 ----------------------------------------------------------------------------------------------------
2023-10-17 11:15:51,077 epoch 2 - iter 521/5212 - loss 0.18677525 - time (sec): 42.84 - samples/sec: 893.85 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:16:34,185 epoch 2 - iter 1042/5212 - loss 0.18467584 - time (sec): 85.94 - samples/sec: 867.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:17:17,438 epoch 2 - iter 1563/5212 - loss 0.18459225 - time (sec): 129.20 - samples/sec: 868.65 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:18:00,970 epoch 2 - iter 2084/5212 - loss 0.18949291 - time (sec): 172.73 - samples/sec: 854.13 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:18:45,977 epoch 2 - iter 2605/5212 - loss 0.18960565 - time (sec): 217.74 - samples/sec: 842.99 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:19:29,280 epoch 2 - iter 3126/5212 - loss 0.18834557 - time (sec): 261.04 - samples/sec: 840.64 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:20:11,948 epoch 2 - iter 3647/5212 - loss 0.18442261 - time (sec): 303.71 - samples/sec: 854.40 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:20:55,872 epoch 2 - iter 4168/5212 - loss 0.18307206 - time (sec): 347.63 - samples/sec: 851.42 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:21:38,580 epoch 2 - iter 4689/5212 - loss 0.17918821 - time (sec): 390.34 - samples/sec: 844.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:22:20,682 epoch 2 - iter 5210/5212 - loss 0.17650046 - time (sec): 432.44 - samples/sec: 849.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:22:20,832 ----------------------------------------------------------------------------------------------------
2023-10-17 11:22:20,833 EPOCH 2 done: loss 0.1765 - lr: 0.000027
2023-10-17 11:22:32,839 DEV : loss 0.23894941806793213 - f1-score (micro avg) 0.3469
2023-10-17 11:22:32,893 saving best model
2023-10-17 11:22:34,316 ----------------------------------------------------------------------------------------------------
2023-10-17 11:23:15,325 epoch 3 - iter 521/5212 - loss 0.11437386 - time (sec): 41.01 - samples/sec: 925.03 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:23:57,365 epoch 3 - iter 1042/5212 - loss 0.12459624 - time (sec): 83.04 - samples/sec: 902.84 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:24:39,204 epoch 3 - iter 1563/5212 - loss 0.12923542 - time (sec): 124.88 - samples/sec: 889.63 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:25:20,547 epoch 3 - iter 2084/5212 - loss 0.13146091 - time (sec): 166.23 - samples/sec: 882.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:26:01,412 epoch 3 - iter 2605/5212 - loss 0.12864210 - time (sec): 207.09 - samples/sec: 886.34 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:26:42,942 epoch 3 - iter 3126/5212 - loss 0.13399268 - time (sec): 248.62 - samples/sec: 875.92 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:27:25,033 epoch 3 - iter 3647/5212 - loss 0.13321294 - time (sec): 290.71 - samples/sec: 874.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:28:06,996 epoch 3 - iter 4168/5212 - loss 0.13283195 - time (sec): 332.68 - samples/sec: 874.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:28:50,011 epoch 3 - iter 4689/5212 - loss 0.13502035 - time (sec): 375.69 - samples/sec: 878.62 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:29:31,516 epoch 3 - iter 5210/5212 - loss 0.13202538 - time (sec): 417.20 - samples/sec: 880.09 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:29:31,671 ----------------------------------------------------------------------------------------------------
2023-10-17 11:29:31,671 EPOCH 3 done: loss 0.1319 - lr: 0.000023
2023-10-17 11:29:43,652 DEV : loss 0.24874247610569 - f1-score (micro avg) 0.351
2023-10-17 11:29:43,706 saving best model
2023-10-17 11:29:45,126 ----------------------------------------------------------------------------------------------------
2023-10-17 11:30:29,111 epoch 4 - iter 521/5212 - loss 0.09645922 - time (sec): 43.98 - samples/sec: 847.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:31:14,228 epoch 4 - iter 1042/5212 - loss 0.09575057 - time (sec): 89.10 - samples/sec: 827.29 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:31:56,651 epoch 4 - iter 1563/5212 - loss 0.09440810 - time (sec): 131.52 - samples/sec: 826.16 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:32:38,800 epoch 4 - iter 2084/5212 - loss 0.09200648 - time (sec): 173.67 - samples/sec: 830.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:33:22,254 epoch 4 - iter 2605/5212 - loss 0.09507978 - time (sec): 217.12 - samples/sec: 824.50 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:34:03,455 epoch 4 - iter 3126/5212 - loss 0.09574149 - time (sec): 258.33 - samples/sec: 827.19 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:34:45,413 epoch 4 - iter 3647/5212 - loss 0.09612564 - time (sec): 300.28 - samples/sec: 839.08 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:35:28,560 epoch 4 - iter 4168/5212 - loss 0.09673298 - time (sec): 343.43 - samples/sec: 846.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:36:11,639 epoch 4 - iter 4689/5212 - loss 0.09605759 - time (sec): 386.51 - samples/sec: 853.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:36:54,277 epoch 4 - iter 5210/5212 - loss 0.09435830 - time (sec): 429.15 - samples/sec: 855.98 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:36:54,441 ----------------------------------------------------------------------------------------------------
2023-10-17 11:36:54,441 EPOCH 4 done: loss 0.0943 - lr: 0.000020
2023-10-17 11:37:06,587 DEV : loss 0.2750011384487152 - f1-score (micro avg) 0.3813
2023-10-17 11:37:06,641 saving best model
2023-10-17 11:37:08,118 ----------------------------------------------------------------------------------------------------
2023-10-17 11:37:52,422 epoch 5 - iter 521/5212 - loss 0.05806410 - time (sec): 44.30 - samples/sec: 850.98 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:38:35,833 epoch 5 - iter 1042/5212 - loss 0.05768321 - time (sec): 87.71 - samples/sec: 816.42 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:39:21,545 epoch 5 - iter 1563/5212 - loss 0.06370432 - time (sec): 133.42 - samples/sec: 813.72 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:40:04,393 epoch 5 - iter 2084/5212 - loss 0.06198607 - time (sec): 176.27 - samples/sec: 810.85 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:40:50,143 epoch 5 - iter 2605/5212 - loss 0.06397044 - time (sec): 222.02 - samples/sec: 820.57 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:41:34,615 epoch 5 - iter 3126/5212 - loss 0.06343104 - time (sec): 266.49 - samples/sec: 834.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:42:17,862 epoch 5 - iter 3647/5212 - loss 0.06370732 - time (sec): 309.74 - samples/sec: 834.87 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:43:00,416 epoch 5 - iter 4168/5212 - loss 0.06361989 - time (sec): 352.29 - samples/sec: 841.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:43:41,207 epoch 5 - iter 4689/5212 - loss 0.06404810 - time (sec): 393.08 - samples/sec: 842.08 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:44:23,171 epoch 5 - iter 5210/5212 - loss 0.06343811 - time (sec): 435.05 - samples/sec: 844.47 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:44:23,319 ----------------------------------------------------------------------------------------------------
2023-10-17 11:44:23,320 EPOCH 5 done: loss 0.0635 - lr: 0.000017
2023-10-17 11:44:34,163 DEV : loss 0.34400203824043274 - f1-score (micro avg) 0.3937
2023-10-17 11:44:34,220 saving best model
2023-10-17 11:44:35,623 ----------------------------------------------------------------------------------------------------
2023-10-17 11:45:19,252 epoch 6 - iter 521/5212 - loss 0.05472897 - time (sec): 43.62 - samples/sec: 855.22 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:46:00,081 epoch 6 - iter 1042/5212 - loss 0.05370031 - time (sec): 84.45 - samples/sec: 855.73 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:46:41,972 epoch 6 - iter 1563/5212 - loss 0.04710872 - time (sec): 126.34 - samples/sec: 849.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:47:27,063 epoch 6 - iter 2084/5212 - loss 0.04824334 - time (sec): 171.44 - samples/sec: 855.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:48:09,332 epoch 6 - iter 2605/5212 - loss 0.04632066 - time (sec): 213.70 - samples/sec: 868.35 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:48:50,366 epoch 6 - iter 3126/5212 - loss 0.04608767 - time (sec): 254.74 - samples/sec: 874.91 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:49:32,284 epoch 6 - iter 3647/5212 - loss 0.04516211 - time (sec): 296.66 - samples/sec: 871.68 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:50:13,897 epoch 6 - iter 4168/5212 - loss 0.04805567 - time (sec): 338.27 - samples/sec: 869.50 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:50:56,045 epoch 6 - iter 4689/5212 - loss 0.04758346 - time (sec): 380.42 - samples/sec: 871.32 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:51:37,913 epoch 6 - iter 5210/5212 - loss 0.04826135 - time (sec): 422.29 - samples/sec: 869.98 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:51:38,067 ----------------------------------------------------------------------------------------------------
2023-10-17 11:51:38,068 EPOCH 6 done: loss 0.0483 - lr: 0.000013
2023-10-17 11:51:49,240 DEV : loss 0.2987309396266937 - f1-score (micro avg) 0.3914
2023-10-17 11:51:49,296 ----------------------------------------------------------------------------------------------------
2023-10-17 11:52:31,138 epoch 7 - iter 521/5212 - loss 0.03418427 - time (sec): 41.84 - samples/sec: 904.38 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:53:13,452 epoch 7 - iter 1042/5212 - loss 0.03052773 - time (sec): 84.15 - samples/sec: 895.56 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:53:57,642 epoch 7 - iter 1563/5212 - loss 0.03183996 - time (sec): 128.34 - samples/sec: 877.20 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:54:40,075 epoch 7 - iter 2084/5212 - loss 0.03089690 - time (sec): 170.78 - samples/sec: 876.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:55:22,339 epoch 7 - iter 2605/5212 - loss 0.03370678 - time (sec): 213.04 - samples/sec: 867.96 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:56:03,952 epoch 7 - iter 3126/5212 - loss 0.03259067 - time (sec): 254.65 - samples/sec: 864.70 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:56:47,386 epoch 7 - iter 3647/5212 - loss 0.03324339 - time (sec): 298.09 - samples/sec: 858.90 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:57:31,079 epoch 7 - iter 4168/5212 - loss 0.03204679 - time (sec): 341.78 - samples/sec: 865.26 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:58:13,042 epoch 7 - iter 4689/5212 - loss 0.03270349 - time (sec): 383.74 - samples/sec: 866.76 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:58:56,293 epoch 7 - iter 5210/5212 - loss 0.03210456 - time (sec): 426.99 - samples/sec: 860.40 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:58:56,458 ----------------------------------------------------------------------------------------------------
2023-10-17 11:58:56,458 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-17 11:59:07,929 DEV : loss 0.4514279067516327 - f1-score (micro avg) 0.3873
2023-10-17 11:59:08,000 ----------------------------------------------------------------------------------------------------
2023-10-17 11:59:50,456 epoch 8 - iter 521/5212 - loss 0.02287542 - time (sec): 42.45 - samples/sec: 845.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:00:35,464 epoch 8 - iter 1042/5212 - loss 0.02032053 - time (sec): 87.46 - samples/sec: 818.29 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:01:18,589 epoch 8 - iter 1563/5212 - loss 0.02247712 - time (sec): 130.59 - samples/sec: 814.40 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:02:01,597 epoch 8 - iter 2084/5212 - loss 0.02124931 - time (sec): 173.59 - samples/sec: 821.16 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:02:44,007 epoch 8 - iter 2605/5212 - loss 0.02199294 - time (sec): 216.00 - samples/sec: 826.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:03:28,062 epoch 8 - iter 3126/5212 - loss 0.02206727 - time (sec): 260.06 - samples/sec: 829.98 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:04:11,244 epoch 8 - iter 3647/5212 - loss 0.02198914 - time (sec): 303.24 - samples/sec: 835.71 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:04:53,240 epoch 8 - iter 4168/5212 - loss 0.02133287 - time (sec): 345.24 - samples/sec: 846.62 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:05:35,938 epoch 8 - iter 4689/5212 - loss 0.02089455 - time (sec): 387.93 - samples/sec: 851.73 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:06:18,629 epoch 8 - iter 5210/5212 - loss 0.02139060 - time (sec): 430.63 - samples/sec: 853.09 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:06:18,786 ----------------------------------------------------------------------------------------------------
2023-10-17 12:06:18,787 EPOCH 8 done: loss 0.0214 - lr: 0.000007
2023-10-17 12:06:31,446 DEV : loss 0.42482689023017883 - f1-score (micro avg) 0.4045
2023-10-17 12:06:31,520 saving best model
2023-10-17 12:06:33,030 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:14,963 epoch 9 - iter 521/5212 - loss 0.01446293 - time (sec): 41.93 - samples/sec: 811.46 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:07:57,416 epoch 9 - iter 1042/5212 - loss 0.01733620 - time (sec): 84.38 - samples/sec: 845.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:08:38,808 epoch 9 - iter 1563/5212 - loss 0.01496632 - time (sec): 125.78 - samples/sec: 836.07 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:09:20,895 epoch 9 - iter 2084/5212 - loss 0.01584419 - time (sec): 167.86 - samples/sec: 834.70 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:10:02,618 epoch 9 - iter 2605/5212 - loss 0.01574388 - time (sec): 209.59 - samples/sec: 843.01 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:10:44,627 epoch 9 - iter 3126/5212 - loss 0.01792002 - time (sec): 251.59 - samples/sec: 847.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:11:27,348 epoch 9 - iter 3647/5212 - loss 0.01713778 - time (sec): 294.32 - samples/sec: 851.35 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:12:10,707 epoch 9 - iter 4168/5212 - loss 0.01688059 - time (sec): 337.68 - samples/sec: 855.37 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:12:53,136 epoch 9 - iter 4689/5212 - loss 0.01654568 - time (sec): 380.10 - samples/sec: 860.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:13:36,164 epoch 9 - iter 5210/5212 - loss 0.01654318 - time (sec): 423.13 - samples/sec: 868.21 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:13:36,312 ----------------------------------------------------------------------------------------------------
2023-10-17 12:13:36,313 EPOCH 9 done: loss 0.0165 - lr: 0.000003
2023-10-17 12:13:48,791 DEV : loss 0.41042855381965637 - f1-score (micro avg) 0.416
2023-10-17 12:13:48,867 saving best model
2023-10-17 12:13:50,323 ----------------------------------------------------------------------------------------------------
2023-10-17 12:14:33,294 epoch 10 - iter 521/5212 - loss 0.00781776 - time (sec): 42.96 - samples/sec: 865.88 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:15:15,785 epoch 10 - iter 1042/5212 - loss 0.00793401 - time (sec): 85.46 - samples/sec: 852.16 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:15:58,787 epoch 10 - iter 1563/5212 - loss 0.00880507 - time (sec): 128.46 - samples/sec: 831.97 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:16:41,351 epoch 10 - iter 2084/5212 - loss 0.00972971 - time (sec): 171.02 - samples/sec: 843.50 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:17:24,604 epoch 10 - iter 2605/5212 - loss 0.01024931 - time (sec): 214.27 - samples/sec: 848.53 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:18:05,735 epoch 10 - iter 3126/5212 - loss 0.00986422 - time (sec): 255.41 - samples/sec: 849.94 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:18:49,745 epoch 10 - iter 3647/5212 - loss 0.00950244 - time (sec): 299.42 - samples/sec: 845.97 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:19:32,787 epoch 10 - iter 4168/5212 - loss 0.00935274 - time (sec): 342.46 - samples/sec: 847.08 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:20:15,120 epoch 10 - iter 4689/5212 - loss 0.00939995 - time (sec): 384.79 - samples/sec: 855.40 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:20:57,845 epoch 10 - iter 5210/5212 - loss 0.00928220 - time (sec): 427.52 - samples/sec: 859.13 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:20:58,002 ----------------------------------------------------------------------------------------------------
2023-10-17 12:20:58,002 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 12:21:11,081 DEV : loss 0.4939973056316376 - f1-score (micro avg) 0.3985
2023-10-17 12:21:11,724 ----------------------------------------------------------------------------------------------------
2023-10-17 12:21:11,727 Loading model from best epoch ...
2023-10-17 12:21:14,296 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
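The 17-entry tag dictionary above is a BIOES tagging scheme over the four entity types plus the outside tag `O` (4 types x 4 positional prefixes + 1 = 17), which also matches the tagger's final `Linear(in_features=768, out_features=17)` layer. A quick sketch reproducing the enumeration (an illustration, not Flair's `Dictionary` API):

```python
# BIOES prefixes: S(ingle), B(egin), E(nd), I(nside), applied per entity type, plus O
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 17
print(tags[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```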
2023-10-17 12:21:35,379
Results:
- F-score (micro) 0.4827
- F-score (macro) 0.3288
- Accuracy 0.3215

By class:
              precision    recall  f1-score   support

         LOC     0.5341    0.5939    0.5624      1214
         PER     0.4248    0.4542    0.4390       808
         ORG     0.3102    0.3173    0.3137       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4648    0.5021    0.4827      2390
   macro avg     0.3173    0.3413    0.3288      2390
weighted avg     0.4607    0.5021    0.4804      2390

2023-10-17 12:21:35,379 ----------------------------------------------------------------------------------------------------
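As a sanity check, the summary scores follow directly from the table: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores (the unseen `HumanProd` class, with F1 0.0 over only 15 support, is what drags the macro score well below the micro score). A small sketch using the numbers above:

```python
# Per-class f1-score values and micro-avg precision/recall from the evaluation table
per_class_f1 = {"LOC": 0.5624, "PER": 0.4390, "ORG": 0.3137, "HumanProd": 0.0000}
micro_p, micro_r = 0.4648, 0.5021

micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # harmonic mean
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)  # unweighted class mean

print(micro_f1)  # ≈ 0.4827, matching "F-score (micro)"
print(macro_f1)  # ≈ 0.3288, matching "F-score (macro)"
```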