2023-10-17 17:54:32,307 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Train: 1166 sentences
2023-10-17 17:54:32,308 (train_with_dev=False, train_with_test=False)
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Training Params:
2023-10-17 17:54:32,308 - learning_rate: "3e-05"
2023-10-17 17:54:32,308 - mini_batch_size: "4"
2023-10-17 17:54:32,308 - max_epochs: "10"
2023-10-17 17:54:32,308 - shuffle: "True"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Plugins:
2023-10-17 17:54:32,309 - TensorboardLogger
2023-10-17 17:54:32,309 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:54:32,309 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Computation:
2023-10-17 17:54:32,309 - compute on device: cuda:0
2023-10-17 17:54:32,309 - embedding storage: none
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Logging anything other than scalars to TensorBoard is currently not supported.
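For reference, the configuration logged above maps onto a short Flair fine-tuning script. The sketch below is an approximation, not the script that produced this log: the backbone id (hmteams/teams-base-historic-multilingual-discriminator), the NER_HIPE_2022 loader arguments, and the use of ModelTrainer.fine_tune are assumptions inferred from the model printout, the corpus line, and the base path (bs4, e10, lr3e-05, poolingfirst, layers-1, crfFalse).

# Approximate reproduction of the logged configuration (sketch, not the original script).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 NewsEye Finnish split (1166 train / 165 dev / 415 test sentences above);
# the loader arguments are assumed.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Backbone matching the ElectraModel dump; "layers-1" and "poolingfirst" come from the
# base path. The exact hub id is an assumption.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear classification head (17 tags), no CRF and no RNN, as in the model printout.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() trains with AdamW and a linear schedule (warmup_fraction 0.1 is the default),
# consistent with the LinearScheduler plugin and the zero momentum reported in the log.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
)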
2023-10-17 17:54:33,870 epoch 1 - iter 29/292 - loss 3.46529745 - time (sec): 1.56 - samples/sec: 2414.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:54:35,648 epoch 1 - iter 58/292 - loss 2.85262564 - time (sec): 3.34 - samples/sec: 2695.36 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:54:37,352 epoch 1 - iter 87/292 - loss 2.24144121 - time (sec): 5.04 - samples/sec: 2729.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:54:39,256 epoch 1 - iter 116/292 - loss 1.84711241 - time (sec): 6.95 - samples/sec: 2702.68 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:54:40,855 epoch 1 - iter 145/292 - loss 1.60586186 - time (sec): 8.55 - samples/sec: 2668.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:54:42,574 epoch 1 - iter 174/292 - loss 1.40883045 - time (sec): 10.26 - samples/sec: 2662.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:54:44,270 epoch 1 - iter 203/292 - loss 1.26133435 - time (sec): 11.96 - samples/sec: 2663.10 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:54:45,832 epoch 1 - iter 232/292 - loss 1.17210562 - time (sec): 13.52 - samples/sec: 2650.12 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:54:47,491 epoch 1 - iter 261/292 - loss 1.07516745 - time (sec): 15.18 - samples/sec: 2632.77 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:54:49,094 epoch 1 - iter 290/292 - loss 1.00482708 - time (sec): 16.78 - samples/sec: 2632.29 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:54:49,197 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:49,198 EPOCH 1 done: loss 1.0011 - lr: 0.000030
2023-10-17 17:54:50,247 DEV : loss 0.18033993244171143 - f1-score (micro avg) 0.5166
2023-10-17 17:54:50,252 saving best model
2023-10-17 17:54:50,598 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:52,163 epoch 2 - iter 29/292 - loss 0.26889643 - time (sec): 1.56 - samples/sec: 2819.81 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:54:53,751 epoch 2 - iter 58/292 - loss 0.23616332 - time (sec): 3.15 - samples/sec: 2670.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:55,366 epoch 2 - iter 87/292 - loss 0.23714419 - time (sec): 4.77 - samples/sec: 2706.10 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:57,264 epoch 2 - iter 116/292 - loss 0.23528197 - time (sec): 6.66 - samples/sec: 2714.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:58,855 epoch 2 - iter 145/292 - loss 0.23228705 - time (sec): 8.26 - samples/sec: 2656.03 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:00,541 epoch 2 - iter 174/292 - loss 0.21871137 - time (sec): 9.94 - samples/sec: 2656.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:02,351 epoch 2 - iter 203/292 - loss 0.21324631 - time (sec): 11.75 - samples/sec: 2694.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:04,110 epoch 2 - iter 232/292 - loss 0.20885632 - time (sec): 13.51 - samples/sec: 2715.99 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:05,662 epoch 2 - iter 261/292 - loss 0.20683157 - time (sec): 15.06 - samples/sec: 2667.65 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:07,281 epoch 2 - iter 290/292 - loss 0.20615249 - time (sec): 16.68 - samples/sec: 2655.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:07,370 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:07,370 EPOCH 2 done: loss 0.2057 - lr: 0.000027
2023-10-17 17:55:08,617 DEV : loss 0.13886240124702454 - f1-score (micro avg) 0.6061
2023-10-17 17:55:08,622 saving best model
2023-10-17 17:55:09,062 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:10,780 epoch 3 - iter 29/292 - loss 0.13024211 - time (sec): 1.72 - samples/sec: 2838.53 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:12,637 epoch 3 - iter 58/292 - loss 0.12759556 - time (sec): 3.57 - samples/sec: 2715.59 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:14,276 epoch 3 - iter 87/292 - loss 0.14205085 - time (sec): 5.21 - samples/sec: 2682.63 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:15,845 epoch 3 - iter 116/292 - loss 0.12865193 - time (sec): 6.78 - samples/sec: 2641.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:17,502 epoch 3 - iter 145/292 - loss 0.12530962 - time (sec): 8.44 - samples/sec: 2652.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:19,195 epoch 3 - iter 174/292 - loss 0.12338449 - time (sec): 10.13 - samples/sec: 2671.17 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:20,660 epoch 3 - iter 203/292 - loss 0.12001001 - time (sec): 11.60 - samples/sec: 2696.64 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:22,357 epoch 3 - iter 232/292 - loss 0.11516561 - time (sec): 13.29 - samples/sec: 2692.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:23,957 epoch 3 - iter 261/292 - loss 0.11294112 - time (sec): 14.89 - samples/sec: 2688.59 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:25,606 epoch 3 - iter 290/292 - loss 0.11431090 - time (sec): 16.54 - samples/sec: 2676.98 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:25,693 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:25,693 EPOCH 3 done: loss 0.1145 - lr: 0.000023
2023-10-17 17:55:26,953 DEV : loss 0.12177132815122604 - f1-score (micro avg) 0.7233
2023-10-17 17:55:26,981 saving best model
2023-10-17 17:55:27,438 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:29,139 epoch 4 - iter 29/292 - loss 0.06672050 - time (sec): 1.70 - samples/sec: 2718.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:30,934 epoch 4 - iter 58/292 - loss 0.06268066 - time (sec): 3.49 - samples/sec: 2683.17 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:32,516 epoch 4 - iter 87/292 - loss 0.06732367 - time (sec): 5.07 - samples/sec: 2611.61 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:34,394 epoch 4 - iter 116/292 - loss 0.06108828 - time (sec): 6.95 - samples/sec: 2647.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:36,052 epoch 4 - iter 145/292 - loss 0.06815551 - time (sec): 8.61 - samples/sec: 2653.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:37,835 epoch 4 - iter 174/292 - loss 0.07114621 - time (sec): 10.39 - samples/sec: 2644.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:39,445 epoch 4 - iter 203/292 - loss 0.07001109 - time (sec): 12.00 - samples/sec: 2622.13 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:41,196 epoch 4 - iter 232/292 - loss 0.07466102 - time (sec): 13.75 - samples/sec: 2611.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:42,800 epoch 4 - iter 261/292 - loss 0.07419526 - time (sec): 15.36 - samples/sec: 2616.12 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:44,399 epoch 4 - iter 290/292 - loss 0.07236502 - time (sec): 16.96 - samples/sec: 2612.87 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:44,487 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:44,487 EPOCH 4 done: loss 0.0722 - lr: 0.000020
2023-10-17 17:55:45,740 DEV : loss 0.12535437941551208 - f1-score (micro avg) 0.7738
2023-10-17 17:55:45,745 saving best model
2023-10-17 17:55:46,211 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:47,883 epoch 5 - iter 29/292 - loss 0.04075673 - time (sec): 1.67 - samples/sec: 2548.52 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:49,424 epoch 5 - iter 58/292 - loss 0.04398143 - time (sec): 3.21 - samples/sec: 2657.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:51,125 epoch 5 - iter 87/292 - loss 0.05122247 - time (sec): 4.91 - samples/sec: 2767.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:52,824 epoch 5 - iter 116/292 - loss 0.05362585 - time (sec): 6.61 - samples/sec: 2709.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:54,584 epoch 5 - iter 145/292 - loss 0.06020828 - time (sec): 8.37 - samples/sec: 2671.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:56,221 epoch 5 - iter 174/292 - loss 0.05705763 - time (sec): 10.01 - samples/sec: 2660.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:57,804 epoch 5 - iter 203/292 - loss 0.05440471 - time (sec): 11.59 - samples/sec: 2661.69 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:59,504 epoch 5 - iter 232/292 - loss 0.05342826 - time (sec): 13.29 - samples/sec: 2650.40 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:01,190 epoch 5 - iter 261/292 - loss 0.04996839 - time (sec): 14.98 - samples/sec: 2660.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:02,840 epoch 5 - iter 290/292 - loss 0.05076733 - time (sec): 16.63 - samples/sec: 2664.57 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:02,942 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:02,942 EPOCH 5 done: loss 0.0507 - lr: 0.000017
2023-10-17 17:56:04,636 DEV : loss 0.1338176429271698 - f1-score (micro avg) 0.7873
2023-10-17 17:56:04,643 saving best model
2023-10-17 17:56:05,220 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:07,011 epoch 6 - iter 29/292 - loss 0.03285633 - time (sec): 1.79 - samples/sec: 2416.37 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:08,784 epoch 6 - iter 58/292 - loss 0.04542416 - time (sec): 3.56 - samples/sec: 2541.67 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:10,460 epoch 6 - iter 87/292 - loss 0.04632496 - time (sec): 5.24 - samples/sec: 2485.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:11,985 epoch 6 - iter 116/292 - loss 0.04275741 - time (sec): 6.76 - samples/sec: 2426.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:13,731 epoch 6 - iter 145/292 - loss 0.03816003 - time (sec): 8.51 - samples/sec: 2499.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:15,477 epoch 6 - iter 174/292 - loss 0.04074117 - time (sec): 10.26 - samples/sec: 2561.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:17,003 epoch 6 - iter 203/292 - loss 0.04022817 - time (sec): 11.78 - samples/sec: 2557.88 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:18,654 epoch 6 - iter 232/292 - loss 0.03926639 - time (sec): 13.43 - samples/sec: 2557.04 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:20,236 epoch 6 - iter 261/292 - loss 0.04036851 - time (sec): 15.01 - samples/sec: 2580.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:22,084 epoch 6 - iter 290/292 - loss 0.03754090 - time (sec): 16.86 - samples/sec: 2621.46 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:22,182 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:22,182 EPOCH 6 done: loss 0.0377 - lr: 0.000013
2023-10-17 17:56:23,418 DEV : loss 0.13025900721549988 - f1-score (micro avg) 0.7822
2023-10-17 17:56:23,423 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:25,047 epoch 7 - iter 29/292 - loss 0.02527725 - time (sec): 1.62 - samples/sec: 2566.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:26,530 epoch 7 - iter 58/292 - loss 0.03459254 - time (sec): 3.11 - samples/sec: 2529.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:28,186 epoch 7 - iter 87/292 - loss 0.02696686 - time (sec): 4.76 - samples/sec: 2593.67 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:29,761 epoch 7 - iter 116/292 - loss 0.02843464 - time (sec): 6.34 - samples/sec: 2607.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:31,440 epoch 7 - iter 145/292 - loss 0.03184250 - time (sec): 8.02 - samples/sec: 2662.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:33,110 epoch 7 - iter 174/292 - loss 0.03157781 - time (sec): 9.69 - samples/sec: 2625.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:34,856 epoch 7 - iter 203/292 - loss 0.02943012 - time (sec): 11.43 - samples/sec: 2629.65 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:36,561 epoch 7 - iter 232/292 - loss 0.02784831 - time (sec): 13.14 - samples/sec: 2603.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:38,277 epoch 7 - iter 261/292 - loss 0.02771447 - time (sec): 14.85 - samples/sec: 2614.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:39,989 epoch 7 - iter 290/292 - loss 0.02650986 - time (sec): 16.56 - samples/sec: 2646.73 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:40,181 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:40,181 EPOCH 7 done: loss 0.0265 - lr: 0.000010
2023-10-17 17:56:41,471 DEV : loss 0.13627947866916656 - f1-score (micro avg) 0.7758
2023-10-17 17:56:41,477 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:43,311 epoch 8 - iter 29/292 - loss 0.01493153 - time (sec): 1.83 - samples/sec: 2394.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:45,102 epoch 8 - iter 58/292 - loss 0.02355857 - time (sec): 3.62 - samples/sec: 2428.09 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:46,760 epoch 8 - iter 87/292 - loss 0.01909118 - time (sec): 5.28 - samples/sec: 2516.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:48,389 epoch 8 - iter 116/292 - loss 0.02305759 - time (sec): 6.91 - samples/sec: 2577.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:50,048 epoch 8 - iter 145/292 - loss 0.02327096 - time (sec): 8.57 - samples/sec: 2611.98 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:51,689 epoch 8 - iter 174/292 - loss 0.02166280 - time (sec): 10.21 - samples/sec: 2648.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:53,498 epoch 8 - iter 203/292 - loss 0.02052288 - time (sec): 12.02 - samples/sec: 2664.00 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:54,998 epoch 8 - iter 232/292 - loss 0.02174233 - time (sec): 13.52 - samples/sec: 2632.40 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:56,744 epoch 8 - iter 261/292 - loss 0.02030829 - time (sec): 15.27 - samples/sec: 2641.97 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:58,399 epoch 8 - iter 290/292 - loss 0.01930915 - time (sec): 16.92 - samples/sec: 2615.82 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:58,493 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:58,493 EPOCH 8 done: loss 0.0192 - lr: 0.000007
2023-10-17 17:56:59,773 DEV : loss 0.14689218997955322 - f1-score (micro avg) 0.783
2023-10-17 17:56:59,779 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:01,356 epoch 9 - iter 29/292 - loss 0.02283188 - time (sec): 1.58 - samples/sec: 2542.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:03,241 epoch 9 - iter 58/292 - loss 0.02374641 - time (sec): 3.46 - samples/sec: 2751.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:05,057 epoch 9 - iter 87/292 - loss 0.02522260 - time (sec): 5.28 - samples/sec: 2785.00 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:06,636 epoch 9 - iter 116/292 - loss 0.02221788 - time (sec): 6.86 - samples/sec: 2713.59 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:08,138 epoch 9 - iter 145/292 - loss 0.02100944 - time (sec): 8.36 - samples/sec: 2656.06 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:09,813 epoch 9 - iter 174/292 - loss 0.01978056 - time (sec): 10.03 - samples/sec: 2673.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:11,400 epoch 9 - iter 203/292 - loss 0.01857901 - time (sec): 11.62 - samples/sec: 2658.24 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:13,088 epoch 9 - iter 232/292 - loss 0.01720288 - time (sec): 13.31 - samples/sec: 2698.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:14,615 epoch 9 - iter 261/292 - loss 0.01669609 - time (sec): 14.84 - samples/sec: 2669.73 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:16,176 epoch 9 - iter 290/292 - loss 0.01569075 - time (sec): 16.40 - samples/sec: 2681.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:16,317 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:16,317 EPOCH 9 done: loss 0.0155 - lr: 0.000003
2023-10-17 17:57:17,603 DEV : loss 0.1420283317565918 - f1-score (micro avg) 0.7859
2023-10-17 17:57:17,608 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:19,468 epoch 10 - iter 29/292 - loss 0.02328778 - time (sec): 1.86 - samples/sec: 2880.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:21,144 epoch 10 - iter 58/292 - loss 0.01581561 - time (sec): 3.54 - samples/sec: 2784.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:22,748 epoch 10 - iter 87/292 - loss 0.01386311 - time (sec): 5.14 - samples/sec: 2741.10 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:24,360 epoch 10 - iter 116/292 - loss 0.01274530 - time (sec): 6.75 - samples/sec: 2712.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:25,977 epoch 10 - iter 145/292 - loss 0.01181146 - time (sec): 8.37 - samples/sec: 2640.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:27,705 epoch 10 - iter 174/292 - loss 0.01493437 - time (sec): 10.10 - samples/sec: 2610.52 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:29,461 epoch 10 - iter 203/292 - loss 0.01450790 - time (sec): 11.85 - samples/sec: 2649.17 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:30,986 epoch 10 - iter 232/292 - loss 0.01526752 - time (sec): 13.38 - samples/sec: 2667.82 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:32,502 epoch 10 - iter 261/292 - loss 0.01469076 - time (sec): 14.89 - samples/sec: 2658.73 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:57:34,269 epoch 10 - iter 290/292 - loss 0.01440001 - time (sec): 16.66 - samples/sec: 2647.74 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:57:34,374 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:34,374 EPOCH 10 done: loss 0.0143 - lr: 0.000000
2023-10-17 17:57:35,631 DEV : loss 0.14396768808364868 - f1-score (micro avg) 0.793
2023-10-17 17:57:35,636 saving best model
2023-10-17 17:57:36,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:36,632 Loading model from best epoch ...
2023-10-17 17:57:38,019 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 17:57:40,546
Results:
- F-score (micro) 0.7599
- F-score (macro) 0.7052
- Accuracy 0.633

By class:
              precision    recall  f1-score   support

         PER     0.8065    0.8621    0.8333       348
         LOC     0.6494    0.8161    0.7233       261
         ORG     0.4333    0.5000    0.4643        52
   HumanProd     0.7826    0.8182    0.8000        22

   micro avg     0.7114    0.8155    0.7599       683
   macro avg     0.6679    0.7491    0.7052       683
weighted avg     0.7173    0.8155    0.7621       683

2023-10-17 17:57:40,547 ----------------------------------------------------------------------------------------------------
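The classification report above is the final test-set evaluation of best-model.pt. As a usage note, the saved checkpoint can be loaded for tagging with Flair's standard prediction API. The sketch below assumes the checkpoint sits under the training base path from this log, and the example sentence is an arbitrary placeholder.

# Load the best checkpoint written during this run and tag a sentence (sketch).
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Arbitrary Finnish example; the model predicts the 17-tag BIOES scheme listed above
# (PER, LOC, ORG, HumanProd).
sentence = Sentence("Sanomalehti kertoi Mannerheimin vierailusta Helsingissä .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag, round(span.score, 4))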