2023-10-17 08:58:17,407 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,408 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:58:17,408 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,408 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:58:17,408 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,408 Train: 1100 sentences
2023-10-17 08:58:17,408 (train_with_dev=False, train_with_test=False)
2023-10-17 08:58:17,408 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,408 Training Params:
2023-10-17 08:58:17,408 - learning_rate: "5e-05"
2023-10-17 08:58:17,408 - mini_batch_size: "4"
2023-10-17 08:58:17,408 - max_epochs: "10"
2023-10-17 08:58:17,408 - shuffle: "True"
2023-10-17 08:58:17,408 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,408 Plugins:
2023-10-17 08:58:17,409 - TensorboardLogger
2023-10-17 08:58:17,409 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:58:17,409 ----------------------------------------------------------------------------------------------------
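The `lr` column in the per-iteration lines of this log appears consistent with a linear warmup followed by a linear decay to zero. The following is a minimal sketch of such a schedule, not Flair's actual implementation. It assumes 275 iterations per epoch (1100 training sentences at mini-batch size 4), 10 epochs, a peak rate of 5e-05, and a warmup fraction of 0.1; the function name `linear_warmup_lr` and the exact step at which the trainer logs the rate are assumptions made for illustration.

```python
def linear_warmup_lr(step, peak_lr=5e-05, total_steps=275 * 10, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps (hypothetical helper).

    Rises linearly from 0 to peak_lr during the warmup phase, then
    falls linearly back to 0 at total_steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Global step = (epoch - 1) * 275 + iteration within the epoch.
for epoch, it in [(1, 27), (1, 270), (2, 81), (6, 270), (10, 270)]:
    step = (epoch - 1) * 275 + it
    print(f"epoch {epoch} - iter {it}/275 - lr: {linear_warmup_lr(step):.6f}")
```

Under these assumptions the printed values can be compared against the `lr` field of the matching log lines, formatted to six decimals in the same way.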
2023-10-17 08:58:17,409 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:58:17,409 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:58:17,409 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,409 Computation:
2023-10-17 08:58:17,409 - compute on device: cuda:0
2023-10-17 08:58:17,409 - embedding storage: none
2023-10-17 08:58:17,409 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,409 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 08:58:17,409 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,409 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:17,409 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:58:18,605 epoch 1 - iter 27/275 - loss 4.07866641 - time (sec): 1.19 - samples/sec: 1976.27 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:58:19,775 epoch 1 - iter 54/275 - loss 3.29122762 - time (sec): 2.37 - samples/sec: 1984.04 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:58:20,926 epoch 1 - iter 81/275 - loss 2.54691304 - time (sec): 3.52 - samples/sec: 1958.01 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:58:22,121 epoch 1 - iter 108/275 - loss 2.12670496 - time (sec): 4.71 - samples/sec: 1912.47 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:58:23,343 epoch 1 - iter 135/275 - loss 1.79077335 - time (sec): 5.93 - samples/sec: 1917.19 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:58:24,549 epoch 1 - iter 162/275 - loss 1.55375745 - time (sec): 7.14 - samples/sec: 1907.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:58:25,780 epoch 1 - iter 189/275 - loss 1.37607396 - time (sec): 8.37 - samples/sec: 1919.35 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:58:27,007 epoch 1 - iter 216/275 - loss 1.24831931 - time (sec): 9.60 - samples/sec: 1917.67 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:58:28,229 epoch 1 - iter 243/275 - loss 1.15443524 - time (sec): 10.82 - samples/sec: 1879.47 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:58:29,450 epoch 1 - iter 270/275 - loss 1.07150887 - time (sec): 12.04 - samples/sec: 1855.75 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:58:29,669 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:29,669 EPOCH 1 done: loss 1.0540 - lr: 0.000049
2023-10-17 08:58:30,219 DEV : loss 0.21305261552333832 - f1-score (micro avg) 0.689
2023-10-17 08:58:30,225 saving best model
2023-10-17 08:58:30,589 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:31,803 epoch 2 - iter 27/275 - loss 0.17379043 - time (sec): 1.21 - samples/sec: 1836.45 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:58:33,023 epoch 2 - iter 54/275 - loss 0.20261401 - time (sec): 2.43 - samples/sec: 1731.08 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:58:34,261 epoch 2 - iter 81/275 - loss 0.18846942 - time (sec): 3.67 - samples/sec: 1771.69 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:58:35,484 epoch 2 - iter 108/275 - loss 0.20079845 - time (sec): 4.89 - samples/sec: 1774.39 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:58:36,710 epoch 2 - iter 135/275 - loss 0.19197758 - time (sec): 6.12 - samples/sec: 1762.60 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:58:37,914 epoch 2 - iter 162/275 - loss 0.18863668 - time (sec): 7.32 - samples/sec: 1773.15 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:58:39,138 epoch 2 - iter 189/275 - loss 0.18366584 - time (sec): 8.55 - samples/sec: 1781.81 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:58:40,371 epoch 2 - iter 216/275 - loss 0.17800713 - time (sec): 9.78 - samples/sec: 1797.98 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:58:41,611 epoch 2 - iter 243/275 - loss 0.17153577 - time (sec): 11.02 - samples/sec: 1811.35 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:58:42,818 epoch 2 - iter 270/275 - loss 0.17760855 - time (sec): 12.23 - samples/sec: 1826.07 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:58:43,040 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:43,040 EPOCH 2 done: loss 0.1758 - lr: 0.000045
2023-10-17 08:58:43,808 DEV : loss 0.18843930959701538 - f1-score (micro avg) 0.7895
2023-10-17 08:58:43,815 saving best model
2023-10-17 08:58:44,366 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:45,831 epoch 3 - iter 27/275 - loss 0.11796556 - time (sec): 1.46 - samples/sec: 1664.07 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:58:47,270 epoch 3 - iter 54/275 - loss 0.11204503 - time (sec): 2.90 - samples/sec: 1569.24 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:58:48,521 epoch 3 - iter 81/275 - loss 0.10541729 - time (sec): 4.15 - samples/sec: 1678.87 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:58:49,769 epoch 3 - iter 108/275 - loss 0.09944435 - time (sec): 5.40 - samples/sec: 1682.28 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:58:51,020 epoch 3 - iter 135/275 - loss 0.08997580 - time (sec): 6.65 - samples/sec: 1716.47 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:58:52,209 epoch 3 - iter 162/275 - loss 0.09204905 - time (sec): 7.84 - samples/sec: 1716.14 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:58:53,645 epoch 3 - iter 189/275 - loss 0.09910161 - time (sec): 9.28 - samples/sec: 1684.31 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:58:54,860 epoch 3 - iter 216/275 - loss 0.10368806 - time (sec): 10.49 - samples/sec: 1711.55 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:58:56,133 epoch 3 - iter 243/275 - loss 0.10108127 - time (sec): 11.77 - samples/sec: 1727.07 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:58:57,332 epoch 3 - iter 270/275 - loss 0.10499294 - time (sec): 12.96 - samples/sec: 1727.39 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:58:57,564 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:57,564 EPOCH 3 done: loss 0.1036 - lr: 0.000039
2023-10-17 08:58:58,201 DEV : loss 0.17471981048583984 - f1-score (micro avg) 0.8381
2023-10-17 08:58:58,206 saving best model
2023-10-17 08:58:58,662 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:59,945 epoch 4 - iter 27/275 - loss 0.06961920 - time (sec): 1.28 - samples/sec: 1517.71 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:59:01,182 epoch 4 - iter 54/275 - loss 0.05928687 - time (sec): 2.52 - samples/sec: 1617.38 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:59:02,416 epoch 4 - iter 81/275 - loss 0.07798742 - time (sec): 3.75 - samples/sec: 1739.45 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:59:03,637 epoch 4 - iter 108/275 - loss 0.07703170 - time (sec): 4.97 - samples/sec: 1778.57 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:59:04,939 epoch 4 - iter 135/275 - loss 0.06995502 - time (sec): 6.27 - samples/sec: 1798.39 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:59:06,139 epoch 4 - iter 162/275 - loss 0.07538902 - time (sec): 7.47 - samples/sec: 1800.10 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:59:07,403 epoch 4 - iter 189/275 - loss 0.09041096 - time (sec): 8.74 - samples/sec: 1797.81 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:59:08,633 epoch 4 - iter 216/275 - loss 0.08712686 - time (sec): 9.97 - samples/sec: 1824.14 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:59:09,874 epoch 4 - iter 243/275 - loss 0.08979277 - time (sec): 11.21 - samples/sec: 1831.00 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:59:11,099 epoch 4 - iter 270/275 - loss 0.08447754 - time (sec): 12.43 - samples/sec: 1803.26 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:59:11,324 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:11,324 EPOCH 4 done: loss 0.0854 - lr: 0.000034
2023-10-17 08:59:11,991 DEV : loss 0.20809108018875122 - f1-score (micro avg) 0.8396
2023-10-17 08:59:11,998 saving best model
2023-10-17 08:59:12,432 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:13,700 epoch 5 - iter 27/275 - loss 0.04462372 - time (sec): 1.27 - samples/sec: 1776.94 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:59:14,965 epoch 5 - iter 54/275 - loss 0.09665540 - time (sec): 2.53 - samples/sec: 1809.57 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:59:16,195 epoch 5 - iter 81/275 - loss 0.08000189 - time (sec): 3.76 - samples/sec: 1901.41 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:59:17,441 epoch 5 - iter 108/275 - loss 0.06849516 - time (sec): 5.01 - samples/sec: 1890.98 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:59:18,647 epoch 5 - iter 135/275 - loss 0.06277789 - time (sec): 6.21 - samples/sec: 1875.02 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:59:19,862 epoch 5 - iter 162/275 - loss 0.06311559 - time (sec): 7.43 - samples/sec: 1871.34 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:59:21,067 epoch 5 - iter 189/275 - loss 0.06254311 - time (sec): 8.63 - samples/sec: 1855.50 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:59:22,325 epoch 5 - iter 216/275 - loss 0.06440827 - time (sec): 9.89 - samples/sec: 1834.23 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:59:23,541 epoch 5 - iter 243/275 - loss 0.06533553 - time (sec): 11.11 - samples/sec: 1812.20 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:59:24,772 epoch 5 - iter 270/275 - loss 0.06177798 - time (sec): 12.34 - samples/sec: 1807.63 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:59:25,003 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:25,003 EPOCH 5 done: loss 0.0605 - lr: 0.000028
2023-10-17 08:59:25,661 DEV : loss 0.16310758888721466 - f1-score (micro avg) 0.8638
2023-10-17 08:59:25,666 saving best model
2023-10-17 08:59:26,106 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:27,381 epoch 6 - iter 27/275 - loss 0.05938802 - time (sec): 1.27 - samples/sec: 1761.49 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:59:28,605 epoch 6 - iter 54/275 - loss 0.03521374 - time (sec): 2.49 - samples/sec: 1880.76 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:59:29,907 epoch 6 - iter 81/275 - loss 0.04884466 - time (sec): 3.80 - samples/sec: 1829.45 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:31,160 epoch 6 - iter 108/275 - loss 0.05158786 - time (sec): 5.05 - samples/sec: 1783.64 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:32,408 epoch 6 - iter 135/275 - loss 0.04511006 - time (sec): 6.30 - samples/sec: 1770.43 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:59:33,652 epoch 6 - iter 162/275 - loss 0.04555781 - time (sec): 7.54 - samples/sec: 1788.09 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:59:34,877 epoch 6 - iter 189/275 - loss 0.04640001 - time (sec): 8.77 - samples/sec: 1811.19 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:59:36,129 epoch 6 - iter 216/275 - loss 0.04597427 - time (sec): 10.02 - samples/sec: 1807.09 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:59:37,371 epoch 6 - iter 243/275 - loss 0.04373227 - time (sec): 11.26 - samples/sec: 1793.75 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:59:38,625 epoch 6 - iter 270/275 - loss 0.04377672 - time (sec): 12.51 - samples/sec: 1790.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:59:38,866 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:38,866 EPOCH 6 done: loss 0.0436 - lr: 0.000022
2023-10-17 08:59:39,503 DEV : loss 0.18482080101966858 - f1-score (micro avg) 0.8667
2023-10-17 08:59:39,507 saving best model
2023-10-17 08:59:39,942 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:41,185 epoch 7 - iter 27/275 - loss 0.00375937 - time (sec): 1.24 - samples/sec: 1787.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:59:42,401 epoch 7 - iter 54/275 - loss 0.03130435 - time (sec): 2.46 - samples/sec: 1792.81 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:59:43,658 epoch 7 - iter 81/275 - loss 0.02778327 - time (sec): 3.71 - samples/sec: 1813.60 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:59:44,936 epoch 7 - iter 108/275 - loss 0.02223742 - time (sec): 4.99 - samples/sec: 1784.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:59:46,108 epoch 7 - iter 135/275 - loss 0.02443230 - time (sec): 6.16 - samples/sec: 1804.63 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:59:47,286 epoch 7 - iter 162/275 - loss 0.02403903 - time (sec): 7.34 - samples/sec: 1819.87 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:59:48,489 epoch 7 - iter 189/275 - loss 0.02872301 - time (sec): 8.54 - samples/sec: 1812.79 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:59:49,711 epoch 7 - iter 216/275 - loss 0.02738304 - time (sec): 9.77 - samples/sec: 1827.30 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:59:50,937 epoch 7 - iter 243/275 - loss 0.02863521 - time (sec): 10.99 - samples/sec: 1840.26 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:59:52,153 epoch 7 - iter 270/275 - loss 0.02704741 - time (sec): 12.21 - samples/sec: 1830.73 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:59:52,375 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:52,375 EPOCH 7 done: loss 0.0266 - lr: 0.000017
2023-10-17 08:59:53,010 DEV : loss 0.18936654925346375 - f1-score (micro avg) 0.8661
2023-10-17 08:59:53,015 ----------------------------------------------------------------------------------------------------
2023-10-17 08:59:54,241 epoch 8 - iter 27/275 - loss 0.01591955 - time (sec): 1.22 - samples/sec: 1905.76 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:59:55,473 epoch 8 - iter 54/275 - loss 0.01525013 - time (sec): 2.46 - samples/sec: 1931.06 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:59:56,676 epoch 8 - iter 81/275 - loss 0.01576595 - time (sec): 3.66 - samples/sec: 1865.01 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:59:57,880 epoch 8 - iter 108/275 - loss 0.02936768 - time (sec): 4.86 - samples/sec: 1838.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:59:59,121 epoch 8 - iter 135/275 - loss 0.02952848 - time (sec): 6.10 - samples/sec: 1823.65 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:00:00,326 epoch 8 - iter 162/275 - loss 0.02768193 - time (sec): 7.31 - samples/sec: 1838.08 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:00:01,548 epoch 8 - iter 189/275 - loss 0.02911434 - time (sec): 8.53 - samples/sec: 1827.23 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:00:02,765 epoch 8 - iter 216/275 - loss 0.02582105 - time (sec): 9.75 - samples/sec: 1824.74 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:00:04,011 epoch 8 - iter 243/275 - loss 0.02566323 - time (sec): 11.00 - samples/sec: 1839.93 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:00:05,181 epoch 8 - iter 270/275 - loss 0.02392802 - time (sec): 12.17 - samples/sec: 1849.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:00:05,394 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:05,394 EPOCH 8 done: loss 0.0236 - lr: 0.000011
2023-10-17 09:00:06,050 DEV : loss 0.19396452605724335 - f1-score (micro avg) 0.8786
2023-10-17 09:00:06,055 saving best model
2023-10-17 09:00:06,500 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:07,759 epoch 9 - iter 27/275 - loss 0.02131835 - time (sec): 1.26 - samples/sec: 1831.06 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:00:08,979 epoch 9 - iter 54/275 - loss 0.01584945 - time (sec): 2.48 - samples/sec: 1792.56 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:00:10,187 epoch 9 - iter 81/275 - loss 0.01163448 - time (sec): 3.68 - samples/sec: 1788.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:00:11,413 epoch 9 - iter 108/275 - loss 0.01764179 - time (sec): 4.91 - samples/sec: 1802.25 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:00:12,655 epoch 9 - iter 135/275 - loss 0.01935884 - time (sec): 6.15 - samples/sec: 1791.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:00:13,878 epoch 9 - iter 162/275 - loss 0.01708720 - time (sec): 7.38 - samples/sec: 1786.96 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:00:15,124 epoch 9 - iter 189/275 - loss 0.01596518 - time (sec): 8.62 - samples/sec: 1813.88 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:00:16,378 epoch 9 - iter 216/275 - loss 0.01572838 - time (sec): 9.88 - samples/sec: 1836.50 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:00:17,606 epoch 9 - iter 243/275 - loss 0.01611628 - time (sec): 11.10 - samples/sec: 1842.04 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:00:18,832 epoch 9 - iter 270/275 - loss 0.01566388 - time (sec): 12.33 - samples/sec: 1816.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:00:19,062 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:19,062 EPOCH 9 done: loss 0.0154 - lr: 0.000006
2023-10-17 09:00:19,699 DEV : loss 0.19913232326507568 - f1-score (micro avg) 0.8758
2023-10-17 09:00:19,703 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:20,910 epoch 10 - iter 27/275 - loss 0.02597141 - time (sec): 1.21 - samples/sec: 1665.80 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:00:22,172 epoch 10 - iter 54/275 - loss 0.01805506 - time (sec): 2.47 - samples/sec: 1648.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:00:23,400 epoch 10 - iter 81/275 - loss 0.01317504 - time (sec): 3.70 - samples/sec: 1721.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:00:24,643 epoch 10 - iter 108/275 - loss 0.01046857 - time (sec): 4.94 - samples/sec: 1772.68 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:00:25,890 epoch 10 - iter 135/275 - loss 0.00992712 - time (sec): 6.19 - samples/sec: 1788.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:00:27,118 epoch 10 - iter 162/275 - loss 0.01541164 - time (sec): 7.41 - samples/sec: 1821.87 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:00:28,329 epoch 10 - iter 189/275 - loss 0.01383796 - time (sec): 8.62 - samples/sec: 1826.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:00:29,529 epoch 10 - iter 216/275 - loss 0.01212638 - time (sec): 9.82 - samples/sec: 1830.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:00:30,752 epoch 10 - iter 243/275 - loss 0.01189330 - time (sec): 11.05 - samples/sec: 1833.49 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:00:31,980 epoch 10 - iter 270/275 - loss 0.01145146 - time (sec): 12.28 - samples/sec: 1830.13 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:00:32,203 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:32,203 EPOCH 10 done: loss 0.0113 - lr: 0.000000
2023-10-17 09:00:32,836 DEV : loss 0.2009844183921814 - f1-score (micro avg) 0.8785
2023-10-17 09:00:33,192 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:33,193 Loading model from best epoch ...
2023-10-17 09:00:34,548 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 09:00:35,319
Results:
- F-score (micro) 0.9048
- F-score (macro) 0.8717
- Accuracy 0.8403
By class:
              precision    recall  f1-score   support

       scope     0.8920    0.8920    0.8920       176
        pers     0.9837    0.9453    0.9641       128
        work     0.8472    0.8243    0.8356        74
      object     1.0000    1.0000    1.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9144    0.8953    0.9048       382
   macro avg     0.9446    0.8323    0.8717       382
weighted avg     0.9152    0.8953    0.9047       382
2023-10-17 09:00:35,319 ----------------------------------------------------------------------------------------------------
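The averaged rows of the final report can be cross-checked from the per-class rows. The sketch below is illustrative and is not part of the training run: it copies the rounded per-class values from the report above into a hypothetical `per_class` dictionary and recomputes the macro, weighted, and micro F1 scores. Because the inputs are already rounded to four decimals, the recomputed values may differ from the logged ones in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "scope": (0.8920, 0.8920, 0.8920, 176),
    "pers": (0.9837, 0.9453, 0.9641, 128),
    "work": (0.8472, 0.8243, 0.8356, 74),
    "object": (1.0000, 1.0000, 1.0000, 2),
    "loc": (1.0000, 0.5000, 0.6667, 2),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(s for _, _, _, s in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.9144, 0.8953)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The gap between the micro and macro scores comes mostly from `loc`, which has only two test instances and a recall of 0.5 but counts as much as `scope` in the unweighted macro mean.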