2023-10-17 13:08:39,919 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,921 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 13:08:39,921 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,921 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
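
The corpus is the English topres19th subset of HIPE-2022. A sketch of loading it through Flair's built-in dataset loader is shown below; the exact constructor arguments (dataset_name, language) are assumptions inferred from the dataset path logged above.

from flair.datasets import NER_HIPE_2022

# Assumed loader arguments, inferred from the path
# ".../ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator" above.
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
print(corpus)  # expected: 6183 train + 680 dev + 2113 test sentences

# tag dictionary for the tagger (13 BIOES tags over LOC / BUILDING / STREET)
label_dictionary = corpus.make_label_dictionary(label_type="ner")
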
2023-10-17 13:08:39,921 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,921 Train: 6183 sentences
2023-10-17 13:08:39,921 (train_with_dev=False, train_with_test=False)
2023-10-17 13:08:39,922 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,922 Training Params:
2023-10-17 13:08:39,922 - learning_rate: "3e-05"
2023-10-17 13:08:39,922 - mini_batch_size: "8"
2023-10-17 13:08:39,922 - max_epochs: "10"
2023-10-17 13:08:39,922 - shuffle: "True"
2023-10-17 13:08:39,922 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,922 Plugins:
2023-10-17 13:08:39,922 - TensorboardLogger
2023-10-17 13:08:39,922 - LinearScheduler | warmup_fraction: '0.1'
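
The hyperparameters above map directly onto Flair's fine-tuning entry point. A sketch of the corresponding training call, reusing the corpus and tagger from the earlier sketches, is given below; fine_tune() uses AdamW with a linear warmup schedule, which matches the LinearScheduler plugin logged here (warmup_fraction 0.1 is assumed to be the default in recent Flair versions), and the TensorBoard plugin is omitted.

from flair.trainers import ModelTrainer

# corpus and tagger as constructed in the sketches above
trainer = ModelTrainer(tagger, corpus)

# Mirrors the "Training Params" block of this log; shuffle=True is Flair's default.
trainer.fine_tune(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
)
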
2023-10-17 13:08:39,922 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,922 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:08:39,922 - metric: "('micro avg', 'f1-score')"
2023-10-17 13:08:39,922 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,922 Computation:
2023-10-17 13:08:39,922 - compute on device: cuda:0
2023-10-17 13:08:39,923 - embedding storage: none
2023-10-17 13:08:39,923 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,923 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 13:08:39,923 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,923 ----------------------------------------------------------------------------------------------------
2023-10-17 13:08:39,923 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 13:08:46,795 epoch 1 - iter 77/773 - loss 2.97835929 - time (sec): 6.87 - samples/sec: 1745.08 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:08:54,296 epoch 1 - iter 154/773 - loss 1.87469467 - time (sec): 14.37 - samples/sec: 1632.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:09:01,982 epoch 1 - iter 231/773 - loss 1.29875250 - time (sec): 22.06 - samples/sec: 1621.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:09:09,388 epoch 1 - iter 308/773 - loss 1.01072496 - time (sec): 29.46 - samples/sec: 1621.66 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:09:17,277 epoch 1 - iter 385/773 - loss 0.81975956 - time (sec): 37.35 - samples/sec: 1635.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:09:25,191 epoch 1 - iter 462/773 - loss 0.70447128 - time (sec): 45.27 - samples/sec: 1627.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:09:32,881 epoch 1 - iter 539/773 - loss 0.62180756 - time (sec): 52.96 - samples/sec: 1620.95 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:09:41,089 epoch 1 - iter 616/773 - loss 0.55646217 - time (sec): 61.16 - samples/sec: 1609.64 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:09:48,700 epoch 1 - iter 693/773 - loss 0.50448744 - time (sec): 68.78 - samples/sec: 1614.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:09:56,705 epoch 1 - iter 770/773 - loss 0.46197740 - time (sec): 76.78 - samples/sec: 1611.60 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:09:57,056 ----------------------------------------------------------------------------------------------------
2023-10-17 13:09:57,057 EPOCH 1 done: loss 0.4601 - lr: 0.000030
2023-10-17 13:10:00,370 DEV : loss 0.055146388709545135 - f1-score (micro avg) 0.7536
2023-10-17 13:10:00,402 saving best model
2023-10-17 13:10:01,040 ----------------------------------------------------------------------------------------------------
2023-10-17 13:10:08,616 epoch 2 - iter 77/773 - loss 0.08891357 - time (sec): 7.57 - samples/sec: 1619.07 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:10:16,519 epoch 2 - iter 154/773 - loss 0.08025883 - time (sec): 15.48 - samples/sec: 1618.18 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:10:23,631 epoch 2 - iter 231/773 - loss 0.07291651 - time (sec): 22.59 - samples/sec: 1636.92 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:10:30,834 epoch 2 - iter 308/773 - loss 0.07668126 - time (sec): 29.79 - samples/sec: 1651.60 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:10:38,528 epoch 2 - iter 385/773 - loss 0.07749694 - time (sec): 37.48 - samples/sec: 1640.58 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:10:46,505 epoch 2 - iter 462/773 - loss 0.07639212 - time (sec): 45.46 - samples/sec: 1618.29 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:10:54,438 epoch 2 - iter 539/773 - loss 0.07525245 - time (sec): 53.40 - samples/sec: 1615.54 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:11:02,973 epoch 2 - iter 616/773 - loss 0.07570975 - time (sec): 61.93 - samples/sec: 1583.02 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:11:11,049 epoch 2 - iter 693/773 - loss 0.07390561 - time (sec): 70.01 - samples/sec: 1586.06 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:11:18,885 epoch 2 - iter 770/773 - loss 0.07331939 - time (sec): 77.84 - samples/sec: 1592.96 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:11:19,172 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:19,173 EPOCH 2 done: loss 0.0733 - lr: 0.000027
2023-10-17 13:11:22,464 DEV : loss 0.0555521659553051 - f1-score (micro avg) 0.7828
2023-10-17 13:11:22,493 saving best model
2023-10-17 13:11:23,970 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:31,790 epoch 3 - iter 77/773 - loss 0.04302712 - time (sec): 7.82 - samples/sec: 1736.74 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:11:39,410 epoch 3 - iter 154/773 - loss 0.04215121 - time (sec): 15.44 - samples/sec: 1646.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:11:47,665 epoch 3 - iter 231/773 - loss 0.04258486 - time (sec): 23.69 - samples/sec: 1601.66 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:11:55,609 epoch 3 - iter 308/773 - loss 0.04530148 - time (sec): 31.64 - samples/sec: 1574.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:12:04,204 epoch 3 - iter 385/773 - loss 0.04445010 - time (sec): 40.23 - samples/sec: 1585.64 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:12:12,128 epoch 3 - iter 462/773 - loss 0.04672927 - time (sec): 48.16 - samples/sec: 1566.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:12:19,104 epoch 3 - iter 539/773 - loss 0.04829116 - time (sec): 55.13 - samples/sec: 1593.13 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:12:26,282 epoch 3 - iter 616/773 - loss 0.04728436 - time (sec): 62.31 - samples/sec: 1603.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:12:34,349 epoch 3 - iter 693/773 - loss 0.04777688 - time (sec): 70.38 - samples/sec: 1592.63 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:12:41,971 epoch 3 - iter 770/773 - loss 0.04728166 - time (sec): 78.00 - samples/sec: 1588.52 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:12:42,251 ----------------------------------------------------------------------------------------------------
2023-10-17 13:12:42,252 EPOCH 3 done: loss 0.0471 - lr: 0.000023
2023-10-17 13:12:45,338 DEV : loss 0.06297728419303894 - f1-score (micro avg) 0.7748
2023-10-17 13:12:45,370 ----------------------------------------------------------------------------------------------------
2023-10-17 13:12:53,983 epoch 4 - iter 77/773 - loss 0.02732323 - time (sec): 8.61 - samples/sec: 1445.84 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:13:02,146 epoch 4 - iter 154/773 - loss 0.03181965 - time (sec): 16.77 - samples/sec: 1493.94 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:13:10,153 epoch 4 - iter 231/773 - loss 0.03214623 - time (sec): 24.78 - samples/sec: 1480.43 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:13:17,716 epoch 4 - iter 308/773 - loss 0.02980114 - time (sec): 32.34 - samples/sec: 1528.22 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:13:25,433 epoch 4 - iter 385/773 - loss 0.02875251 - time (sec): 40.06 - samples/sec: 1551.31 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:13:33,344 epoch 4 - iter 462/773 - loss 0.03233554 - time (sec): 47.97 - samples/sec: 1560.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:13:41,357 epoch 4 - iter 539/773 - loss 0.03122506 - time (sec): 55.99 - samples/sec: 1560.44 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:13:49,884 epoch 4 - iter 616/773 - loss 0.03233573 - time (sec): 64.51 - samples/sec: 1539.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:13:57,748 epoch 4 - iter 693/773 - loss 0.03209738 - time (sec): 72.38 - samples/sec: 1538.03 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:14:04,906 epoch 4 - iter 770/773 - loss 0.03230222 - time (sec): 79.53 - samples/sec: 1557.31 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:14:05,197 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:05,197 EPOCH 4 done: loss 0.0325 - lr: 0.000020
2023-10-17 13:14:08,468 DEV : loss 0.09397705644369125 - f1-score (micro avg) 0.7876
2023-10-17 13:14:08,497 saving best model
2023-10-17 13:14:09,959 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:17,907 epoch 5 - iter 77/773 - loss 0.02528085 - time (sec): 7.94 - samples/sec: 1628.49 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:14:25,737 epoch 5 - iter 154/773 - loss 0.02527426 - time (sec): 15.77 - samples/sec: 1580.91 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:14:33,774 epoch 5 - iter 231/773 - loss 0.02362748 - time (sec): 23.81 - samples/sec: 1552.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:14:41,673 epoch 5 - iter 308/773 - loss 0.02239320 - time (sec): 31.71 - samples/sec: 1559.77 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:14:49,222 epoch 5 - iter 385/773 - loss 0.02330606 - time (sec): 39.26 - samples/sec: 1554.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:14:56,980 epoch 5 - iter 462/773 - loss 0.02385192 - time (sec): 47.02 - samples/sec: 1562.03 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:15:05,089 epoch 5 - iter 539/773 - loss 0.02449892 - time (sec): 55.13 - samples/sec: 1542.51 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:15:13,885 epoch 5 - iter 616/773 - loss 0.02394531 - time (sec): 63.92 - samples/sec: 1527.94 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:15:21,845 epoch 5 - iter 693/773 - loss 0.02348430 - time (sec): 71.88 - samples/sec: 1546.02 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:15:29,820 epoch 5 - iter 770/773 - loss 0.02290727 - time (sec): 79.86 - samples/sec: 1550.24 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:15:30,108 ----------------------------------------------------------------------------------------------------
2023-10-17 13:15:30,108 EPOCH 5 done: loss 0.0231 - lr: 0.000017
2023-10-17 13:15:33,581 DEV : loss 0.09673922508955002 - f1-score (micro avg) 0.7871
2023-10-17 13:15:33,614 ----------------------------------------------------------------------------------------------------
2023-10-17 13:15:41,113 epoch 6 - iter 77/773 - loss 0.01282970 - time (sec): 7.50 - samples/sec: 1576.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:15:48,997 epoch 6 - iter 154/773 - loss 0.01426688 - time (sec): 15.38 - samples/sec: 1607.45 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:15:57,056 epoch 6 - iter 231/773 - loss 0.01542611 - time (sec): 23.44 - samples/sec: 1597.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:16:04,734 epoch 6 - iter 308/773 - loss 0.01498317 - time (sec): 31.12 - samples/sec: 1618.59 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:16:11,813 epoch 6 - iter 385/773 - loss 0.01554285 - time (sec): 38.20 - samples/sec: 1621.76 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:16:19,174 epoch 6 - iter 462/773 - loss 0.01570361 - time (sec): 45.56 - samples/sec: 1618.77 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:16:26,803 epoch 6 - iter 539/773 - loss 0.01538039 - time (sec): 53.19 - samples/sec: 1608.65 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:16:34,412 epoch 6 - iter 616/773 - loss 0.01501943 - time (sec): 60.80 - samples/sec: 1610.50 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:16:42,341 epoch 6 - iter 693/773 - loss 0.01478103 - time (sec): 68.72 - samples/sec: 1614.46 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:16:50,931 epoch 6 - iter 770/773 - loss 0.01499341 - time (sec): 77.31 - samples/sec: 1601.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:16:51,209 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:51,209 EPOCH 6 done: loss 0.0149 - lr: 0.000013
2023-10-17 13:16:54,440 DEV : loss 0.10239903628826141 - f1-score (micro avg) 0.8145
2023-10-17 13:16:54,472 saving best model
2023-10-17 13:16:55,963 ----------------------------------------------------------------------------------------------------
2023-10-17 13:17:03,813 epoch 7 - iter 77/773 - loss 0.00993657 - time (sec): 7.85 - samples/sec: 1575.71 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:17:11,672 epoch 7 - iter 154/773 - loss 0.01083244 - time (sec): 15.71 - samples/sec: 1559.76 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:17:20,126 epoch 7 - iter 231/773 - loss 0.01440738 - time (sec): 24.16 - samples/sec: 1534.55 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:17:28,474 epoch 7 - iter 308/773 - loss 0.01310777 - time (sec): 32.51 - samples/sec: 1509.11 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:17:36,571 epoch 7 - iter 385/773 - loss 0.01226929 - time (sec): 40.61 - samples/sec: 1518.86 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:17:44,264 epoch 7 - iter 462/773 - loss 0.01168054 - time (sec): 48.30 - samples/sec: 1512.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:17:51,543 epoch 7 - iter 539/773 - loss 0.01131055 - time (sec): 55.58 - samples/sec: 1550.49 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:17:59,150 epoch 7 - iter 616/773 - loss 0.01118152 - time (sec): 63.18 - samples/sec: 1570.31 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:18:06,493 epoch 7 - iter 693/773 - loss 0.01095252 - time (sec): 70.53 - samples/sec: 1583.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:18:14,096 epoch 7 - iter 770/773 - loss 0.01093922 - time (sec): 78.13 - samples/sec: 1584.50 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:18:14,408 ----------------------------------------------------------------------------------------------------
2023-10-17 13:18:14,408 EPOCH 7 done: loss 0.0109 - lr: 0.000010
2023-10-17 13:18:17,632 DEV : loss 0.11184219270944595 - f1-score (micro avg) 0.8057
2023-10-17 13:18:17,665 ----------------------------------------------------------------------------------------------------
2023-10-17 13:18:25,406 epoch 8 - iter 77/773 - loss 0.00400052 - time (sec): 7.74 - samples/sec: 1593.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:18:33,177 epoch 8 - iter 154/773 - loss 0.00447152 - time (sec): 15.51 - samples/sec: 1569.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:18:40,878 epoch 8 - iter 231/773 - loss 0.00490799 - time (sec): 23.21 - samples/sec: 1565.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:18:48,509 epoch 8 - iter 308/773 - loss 0.00494344 - time (sec): 30.84 - samples/sec: 1567.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:18:56,212 epoch 8 - iter 385/773 - loss 0.00625829 - time (sec): 38.54 - samples/sec: 1578.24 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:19:04,489 epoch 8 - iter 462/773 - loss 0.00613140 - time (sec): 46.82 - samples/sec: 1564.49 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:19:13,691 epoch 8 - iter 539/773 - loss 0.00633907 - time (sec): 56.02 - samples/sec: 1552.09 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:19:22,168 epoch 8 - iter 616/773 - loss 0.00615248 - time (sec): 64.50 - samples/sec: 1546.11 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:19:30,578 epoch 8 - iter 693/773 - loss 0.00641877 - time (sec): 72.91 - samples/sec: 1533.68 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:19:37,585 epoch 8 - iter 770/773 - loss 0.00648931 - time (sec): 79.92 - samples/sec: 1550.50 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:19:37,865 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:37,865 EPOCH 8 done: loss 0.0066 - lr: 0.000007
2023-10-17 13:19:40,914 DEV : loss 0.1238672062754631 - f1-score (micro avg) 0.7701
2023-10-17 13:19:40,948 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:47,862 epoch 9 - iter 77/773 - loss 0.00422951 - time (sec): 6.91 - samples/sec: 1776.61 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:19:55,147 epoch 9 - iter 154/773 - loss 0.00410486 - time (sec): 14.20 - samples/sec: 1755.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:20:02,671 epoch 9 - iter 231/773 - loss 0.00365045 - time (sec): 21.72 - samples/sec: 1765.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:20:10,380 epoch 9 - iter 308/773 - loss 0.00392863 - time (sec): 29.43 - samples/sec: 1757.76 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:20:17,497 epoch 9 - iter 385/773 - loss 0.00446919 - time (sec): 36.55 - samples/sec: 1704.88 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:20:24,563 epoch 9 - iter 462/773 - loss 0.00449122 - time (sec): 43.61 - samples/sec: 1700.22 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:20:31,806 epoch 9 - iter 539/773 - loss 0.00450468 - time (sec): 50.86 - samples/sec: 1698.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:20:39,046 epoch 9 - iter 616/773 - loss 0.00433734 - time (sec): 58.10 - samples/sec: 1694.55 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:20:46,096 epoch 9 - iter 693/773 - loss 0.00402114 - time (sec): 65.15 - samples/sec: 1695.80 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:20:53,731 epoch 9 - iter 770/773 - loss 0.00408108 - time (sec): 72.78 - samples/sec: 1699.18 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:20:54,012 ----------------------------------------------------------------------------------------------------
2023-10-17 13:20:54,013 EPOCH 9 done: loss 0.0041 - lr: 0.000003
2023-10-17 13:20:56,844 DEV : loss 0.12823046743869781 - f1-score (micro avg) 0.7897
2023-10-17 13:20:56,875 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:04,003 epoch 10 - iter 77/773 - loss 0.00299248 - time (sec): 7.13 - samples/sec: 1816.45 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:21:11,274 epoch 10 - iter 154/773 - loss 0.00431058 - time (sec): 14.40 - samples/sec: 1760.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:21:18,785 epoch 10 - iter 231/773 - loss 0.00336600 - time (sec): 21.91 - samples/sec: 1768.91 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:21:26,524 epoch 10 - iter 308/773 - loss 0.00313090 - time (sec): 29.65 - samples/sec: 1721.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:21:33,931 epoch 10 - iter 385/773 - loss 0.00278220 - time (sec): 37.05 - samples/sec: 1705.96 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:21:40,847 epoch 10 - iter 462/773 - loss 0.00299155 - time (sec): 43.97 - samples/sec: 1724.70 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:21:48,002 epoch 10 - iter 539/773 - loss 0.00280852 - time (sec): 51.12 - samples/sec: 1716.44 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:21:54,933 epoch 10 - iter 616/773 - loss 0.00285345 - time (sec): 58.06 - samples/sec: 1706.62 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:22:02,114 epoch 10 - iter 693/773 - loss 0.00280881 - time (sec): 65.24 - samples/sec: 1701.66 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:22:09,626 epoch 10 - iter 770/773 - loss 0.00273836 - time (sec): 72.75 - samples/sec: 1701.49 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:22:09,899 ----------------------------------------------------------------------------------------------------
2023-10-17 13:22:09,899 EPOCH 10 done: loss 0.0028 - lr: 0.000000
2023-10-17 13:22:12,830 DEV : loss 0.12347615510225296 - f1-score (micro avg) 0.8024
2023-10-17 13:22:13,409 ----------------------------------------------------------------------------------------------------
2023-10-17 13:22:13,411 Loading model from best epoch ...
2023-10-17 13:22:15,686 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
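
The best checkpoint can be reloaded and used for inference in the usual Flair way; a minimal sketch follows, with an invented example sentence.

from flair.data import Sentence
from flair.models import SequenceTagger

# best-model.pt lives under the training base path logged above
tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("He lived on Fleet Street, near St Paul's Cathedral in London.")  # illustrative example
tagger.predict(sentence)

for label in sentence.get_labels("ner"):
    print(label)  # predicted LOC / BUILDING / STREET spans with confidence scores
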
2023-10-17 13:22:24,113
Results:
- F-score (micro) 0.8043
- F-score (macro) 0.7283
- Accuracy 0.691
By class:
              precision    recall  f1-score   support

         LOC     0.8384    0.8499    0.8441       946
    BUILDING     0.6218    0.6486    0.6349       185
      STREET     0.6667    0.7500    0.7059        56

   micro avg     0.7951    0.8138    0.8043      1187
   macro avg     0.7089    0.7495    0.7283      1187
weighted avg     0.7965    0.8138    0.8050      1187
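
The per-class table above can be reproduced by evaluating the reloaded model on the test split; a sketch, under the same assumptions as the loading sketches earlier, is shown below.

from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")  # assumed loader arguments
tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

result = tagger.evaluate(corpus.test, gold_label_type="ner", mini_batch_size=8)
print(result.main_score)        # micro-avg F1; 0.8043 was reported for this run
print(result.detailed_results)  # per-class precision / recall / F1, as in the table above
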
2023-10-17 13:22:24,113 ----------------------------------------------------------------------------------------------------