2023-10-17 10:54:20,438 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,439 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:54:20,439 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Train: 7936 sentences
2023-10-17 10:54:20,440 (train_with_dev=False, train_with_test=False)
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Training Params:
2023-10-17 10:54:20,440 - learning_rate: "5e-05"
2023-10-17 10:54:20,440 - mini_batch_size: "4"
2023-10-17 10:54:20,440 - max_epochs: "10"
2023-10-17 10:54:20,440 - shuffle: "True"
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Plugins:
2023-10-17 10:54:20,440 - TensorboardLogger
2023-10-17 10:54:20,440 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:54:20,440 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Computation:
2023-10-17 10:54:20,440 - compute on device: cuda:0
2023-10-17 10:54:20,440 - embedding storage: none
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:54:28,907 epoch 1 - iter 198/1984 - loss 1.62211158 - time (sec): 8.47 - samples/sec: 1911.31 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:54:37,933 epoch 1 - iter 396/1984 - loss 0.94619822 - time (sec): 17.49 - samples/sec: 1900.28 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:54:47,160 epoch 1 - iter 594/1984 - loss 0.70711697 - time (sec): 26.72 - samples/sec: 1857.08 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:56,036 epoch 1 - iter 792/1984 - loss 0.58185081 - time (sec): 35.59 - samples/sec: 1837.14 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:55:04,916 epoch 1 - iter 990/1984 - loss 0.49792196 - time (sec): 44.47 - samples/sec: 1840.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:55:14,026 epoch 1 - iter 1188/1984 - loss 0.43844628 - time (sec): 53.58 - samples/sec: 1840.99 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:55:22,978 epoch 1 - iter 1386/1984 - loss 0.39736182 - time (sec): 62.54 - samples/sec: 1835.63 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:55:32,299 epoch 1 - iter 1584/1984 - loss 0.36824360 - time (sec): 71.86 - samples/sec: 1827.19 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:55:41,543 epoch 1 - iter 1782/1984 - loss 0.34485394 - time (sec): 81.10 - samples/sec: 1815.57 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:55:50,632 epoch 1 - iter 1980/1984 - loss 0.33011193 - time (sec): 90.19 - samples/sec: 1814.56 - lr: 0.000050 - momentum: 0.000000
2023-10-17 10:55:50,815 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:50,815 EPOCH 1 done: loss 0.3305 - lr: 0.000050
2023-10-17 10:55:53,856 DEV : loss 0.3863277733325958 - f1-score (micro avg) 0.004
2023-10-17 10:55:53,876 saving best model
2023-10-17 10:55:54,244 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:03,257 epoch 2 - iter 198/1984 - loss 0.16096443 - time (sec): 9.01 - samples/sec: 1876.10 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:56:11,979 epoch 2 - iter 396/1984 - loss 0.14525930 - time (sec): 17.73 - samples/sec: 1867.53 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:56:20,750 epoch 2 - iter 594/1984 - loss 0.14335165 - time (sec): 26.50 - samples/sec: 1872.28 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:56:29,643 epoch 2 - iter 792/1984 - loss 0.13927613 - time (sec): 35.40 - samples/sec: 1877.08 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:56:38,576 epoch 2 - iter 990/1984 - loss 0.13757993 - time (sec): 44.33 - samples/sec: 1862.58 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:56:47,747 epoch 2 - iter 1188/1984 - loss 0.13290183 - time (sec): 53.50 - samples/sec: 1856.82 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:56:56,765 epoch 2 - iter 1386/1984 - loss 0.13087811 - time (sec): 62.52 - samples/sec: 1854.94 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:57:05,839 epoch 2 - iter 1584/1984 - loss 0.13141431 - time (sec): 71.59 - samples/sec: 1839.27 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:57:14,783 epoch 2 - iter 1782/1984 - loss 0.13018538 - time (sec): 80.54 - samples/sec: 1828.57 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:57:23,835 epoch 2 - iter 1980/1984 - loss 0.12957544 - time (sec): 89.59 - samples/sec: 1826.70 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:57:24,015 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:24,015 EPOCH 2 done: loss 0.1296 - lr: 0.000044
2023-10-17 10:57:27,871 DEV : loss 0.10983909666538239 - f1-score (micro avg) 0.7266
2023-10-17 10:57:27,892 saving best model
2023-10-17 10:57:28,372 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:37,508 epoch 3 - iter 198/1984 - loss 0.09127543 - time (sec): 9.13 - samples/sec: 1814.20 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:57:46,635 epoch 3 - iter 396/1984 - loss 0.09356135 - time (sec): 18.26 - samples/sec: 1799.61 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:57:55,783 epoch 3 - iter 594/1984 - loss 0.09391553 - time (sec): 27.41 - samples/sec: 1810.02 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:58:04,831 epoch 3 - iter 792/1984 - loss 0.09087245 - time (sec): 36.46 - samples/sec: 1821.93 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:58:13,828 epoch 3 - iter 990/1984 - loss 0.09252505 - time (sec): 45.45 - samples/sec: 1819.54 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:58:22,910 epoch 3 - iter 1188/1984 - loss 0.09198370 - time (sec): 54.54 - samples/sec: 1805.55 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:58:31,977 epoch 3 - iter 1386/1984 - loss 0.09210720 - time (sec): 63.60 - samples/sec: 1804.93 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:58:41,054 epoch 3 - iter 1584/1984 - loss 0.09115675 - time (sec): 72.68 - samples/sec: 1804.76 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:58:50,159 epoch 3 - iter 1782/1984 - loss 0.09117253 - time (sec): 81.79 - samples/sec: 1808.36 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:58:59,410 epoch 3 - iter 1980/1984 - loss 0.09250784 - time (sec): 91.04 - samples/sec: 1797.18 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:58:59,604 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:59,604 EPOCH 3 done: loss 0.0927 - lr: 0.000039
2023-10-17 10:59:03,046 DEV : loss 0.11815514415502548 - f1-score (micro avg) 0.7673
2023-10-17 10:59:03,068 saving best model
2023-10-17 10:59:03,547 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:12,752 epoch 4 - iter 198/1984 - loss 0.06981980 - time (sec): 9.20 - samples/sec: 1783.67 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:59:22,063 epoch 4 - iter 396/1984 - loss 0.07072403 - time (sec): 18.51 - samples/sec: 1844.43 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:59:31,256 epoch 4 - iter 594/1984 - loss 0.07503217 - time (sec): 27.71 - samples/sec: 1838.25 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:59:40,298 epoch 4 - iter 792/1984 - loss 0.07356156 - time (sec): 36.75 - samples/sec: 1835.12 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:59:49,285 epoch 4 - iter 990/1984 - loss 0.07317536 - time (sec): 45.74 - samples/sec: 1829.88 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:59:58,054 epoch 4 - iter 1188/1984 - loss 0.07519900 - time (sec): 54.51 - samples/sec: 1822.01 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:00:06,621 epoch 4 - iter 1386/1984 - loss 0.07351265 - time (sec): 63.07 - samples/sec: 1822.06 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:00:15,739 epoch 4 - iter 1584/1984 - loss 0.07234519 - time (sec): 72.19 - samples/sec: 1814.86 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:00:24,737 epoch 4 - iter 1782/1984 - loss 0.07317391 - time (sec): 81.19 - samples/sec: 1819.38 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:00:33,691 epoch 4 - iter 1980/1984 - loss 0.07222348 - time (sec): 90.14 - samples/sec: 1816.69 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:00:33,875 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:33,875 EPOCH 4 done: loss 0.0722 - lr: 0.000033
2023-10-17 11:00:37,278 DEV : loss 0.15266923606395721 - f1-score (micro avg) 0.7663
2023-10-17 11:00:37,300 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:46,355 epoch 5 - iter 198/1984 - loss 0.05046205 - time (sec): 9.05 - samples/sec: 1795.87 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:00:55,385 epoch 5 - iter 396/1984 - loss 0.05247464 - time (sec): 18.08 - samples/sec: 1852.85 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:01:04,492 epoch 5 - iter 594/1984 - loss 0.05489266 - time (sec): 27.19 - samples/sec: 1846.51 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:01:13,843 epoch 5 - iter 792/1984 - loss 0.05563933 - time (sec): 36.54 - samples/sec: 1853.29 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:01:23,006 epoch 5 - iter 990/1984 - loss 0.05335869 - time (sec): 45.70 - samples/sec: 1841.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:01:32,134 epoch 5 - iter 1188/1984 - loss 0.05269990 - time (sec): 54.83 - samples/sec: 1823.48 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:01:41,271 epoch 5 - iter 1386/1984 - loss 0.05371472 - time (sec): 63.97 - samples/sec: 1819.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:01:50,241 epoch 5 - iter 1584/1984 - loss 0.05415103 - time (sec): 72.94 - samples/sec: 1812.74 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:01:59,308 epoch 5 - iter 1782/1984 - loss 0.05436597 - time (sec): 82.01 - samples/sec: 1807.24 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:02:08,369 epoch 5 - iter 1980/1984 - loss 0.05389026 - time (sec): 91.07 - samples/sec: 1796.79 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:02:08,550 ----------------------------------------------------------------------------------------------------
2023-10-17 11:02:08,551 EPOCH 5 done: loss 0.0539 - lr: 0.000028
2023-10-17 11:02:11,912 DEV : loss 0.19919680058956146 - f1-score (micro avg) 0.7507
2023-10-17 11:02:11,932 ----------------------------------------------------------------------------------------------------
2023-10-17 11:02:21,160 epoch 6 - iter 198/1984 - loss 0.04620617 - time (sec): 9.23 - samples/sec: 1782.45 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:02:30,334 epoch 6 - iter 396/1984 - loss 0.04843778 - time (sec): 18.40 - samples/sec: 1774.97 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:02:39,513 epoch 6 - iter 594/1984 - loss 0.04484302 - time (sec): 27.58 - samples/sec: 1822.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:02:48,566 epoch 6 - iter 792/1984 - loss 0.04250884 - time (sec): 36.63 - samples/sec: 1818.99 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:02:57,657 epoch 6 - iter 990/1984 - loss 0.04152118 - time (sec): 45.72 - samples/sec: 1824.08 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:03:06,709 epoch 6 - iter 1188/1984 - loss 0.04219659 - time (sec): 54.78 - samples/sec: 1820.23 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:03:15,840 epoch 6 - iter 1386/1984 - loss 0.04406527 - time (sec): 63.91 - samples/sec: 1804.54 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:03:25,050 epoch 6 - iter 1584/1984 - loss 0.04345490 - time (sec): 73.12 - samples/sec: 1793.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:03:34,090 epoch 6 - iter 1782/1984 - loss 0.04365630 - time (sec): 82.16 - samples/sec: 1792.95 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:03:43,312 epoch 6 - iter 1980/1984 - loss 0.04303766 - time (sec): 91.38 - samples/sec: 1790.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:03:43,494 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:43,494 EPOCH 6 done: loss 0.0429 - lr: 0.000022
2023-10-17 11:03:46,982 DEV : loss 0.19526612758636475 - f1-score (micro avg) 0.759
2023-10-17 11:03:47,005 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:56,272 epoch 7 - iter 198/1984 - loss 0.02150771 - time (sec): 9.27 - samples/sec: 1758.81 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:04:05,509 epoch 7 - iter 396/1984 - loss 0.02251021 - time (sec): 18.50 - samples/sec: 1773.66 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:04:14,821 epoch 7 - iter 594/1984 - loss 0.02431276 - time (sec): 27.82 - samples/sec: 1781.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:04:23,989 epoch 7 - iter 792/1984 - loss 0.02510360 - time (sec): 36.98 - samples/sec: 1775.37 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:04:33,324 epoch 7 - iter 990/1984 - loss 0.02564549 - time (sec): 46.32 - samples/sec: 1769.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:04:42,517 epoch 7 - iter 1188/1984 - loss 0.02559000 - time (sec): 55.51 - samples/sec: 1773.24 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:04:51,705 epoch 7 - iter 1386/1984 - loss 0.02670608 - time (sec): 64.70 - samples/sec: 1773.77 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:05:00,894 epoch 7 - iter 1584/1984 - loss 0.02761748 - time (sec): 73.89 - samples/sec: 1763.66 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:05:10,013 epoch 7 - iter 1782/1984 - loss 0.02914139 - time (sec): 83.01 - samples/sec: 1776.01 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:05:19,040 epoch 7 - iter 1980/1984 - loss 0.02930128 - time (sec): 92.03 - samples/sec: 1778.63 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:05:19,223 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:19,223 EPOCH 7 done: loss 0.0292 - lr: 0.000017
2023-10-17 11:05:23,023 DEV : loss 0.2106274515390396 - f1-score (micro avg) 0.7652
2023-10-17 11:05:23,044 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:32,176 epoch 8 - iter 198/1984 - loss 0.01532135 - time (sec): 9.13 - samples/sec: 1792.67 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:05:41,290 epoch 8 - iter 396/1984 - loss 0.01844700 - time (sec): 18.24 - samples/sec: 1775.71 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:05:50,631 epoch 8 - iter 594/1984 - loss 0.01680914 - time (sec): 27.59 - samples/sec: 1813.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:05:59,572 epoch 8 - iter 792/1984 - loss 0.01532722 - time (sec): 36.53 - samples/sec: 1809.50 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:06:08,828 epoch 8 - iter 990/1984 - loss 0.01651685 - time (sec): 45.78 - samples/sec: 1817.17 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:06:17,849 epoch 8 - iter 1188/1984 - loss 0.01693879 - time (sec): 54.80 - samples/sec: 1823.06 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:06:26,896 epoch 8 - iter 1386/1984 - loss 0.01791079 - time (sec): 63.85 - samples/sec: 1810.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:06:35,832 epoch 8 - iter 1584/1984 - loss 0.01862913 - time (sec): 72.79 - samples/sec: 1794.24 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:06:44,860 epoch 8 - iter 1782/1984 - loss 0.01920153 - time (sec): 81.81 - samples/sec: 1799.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:06:53,979 epoch 8 - iter 1980/1984 - loss 0.01890437 - time (sec): 90.93 - samples/sec: 1799.51 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:06:54,163 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:54,163 EPOCH 8 done: loss 0.0189 - lr: 0.000011
2023-10-17 11:06:57,535 DEV : loss 0.25147587060928345 - f1-score (micro avg) 0.7642
2023-10-17 11:06:57,556 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:06,501 epoch 9 - iter 198/1984 - loss 0.01800551 - time (sec): 8.94 - samples/sec: 1771.15 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:07:15,377 epoch 9 - iter 396/1984 - loss 0.01378590 - time (sec): 17.82 - samples/sec: 1863.42 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:07:24,608 epoch 9 - iter 594/1984 - loss 0.01314436 - time (sec): 27.05 - samples/sec: 1863.71 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:07:33,694 epoch 9 - iter 792/1984 - loss 0.01248684 - time (sec): 36.14 - samples/sec: 1845.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:07:42,786 epoch 9 - iter 990/1984 - loss 0.01186486 - time (sec): 45.23 - samples/sec: 1825.45 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:07:51,933 epoch 9 - iter 1188/1984 - loss 0.01172450 - time (sec): 54.38 - samples/sec: 1815.57 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:08:01,131 epoch 9 - iter 1386/1984 - loss 0.01171225 - time (sec): 63.57 - samples/sec: 1805.34 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:08:10,295 epoch 9 - iter 1584/1984 - loss 0.01234481 - time (sec): 72.74 - samples/sec: 1808.45 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:08:19,267 epoch 9 - iter 1782/1984 - loss 0.01340571 - time (sec): 81.71 - samples/sec: 1807.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:08:28,340 epoch 9 - iter 1980/1984 - loss 0.01367155 - time (sec): 90.78 - samples/sec: 1803.06 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:08:28,512 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:28,512 EPOCH 9 done: loss 0.0136 - lr: 0.000006
2023-10-17 11:08:31,940 DEV : loss 0.23887786269187927 - f1-score (micro avg) 0.7643
2023-10-17 11:08:31,962 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:40,850 epoch 10 - iter 198/1984 - loss 0.00470077 - time (sec): 8.89 - samples/sec: 1858.08 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:08:49,392 epoch 10 - iter 396/1984 - loss 0.00422202 - time (sec): 17.43 - samples/sec: 1844.72 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:08:58,007 epoch 10 - iter 594/1984 - loss 0.00543694 - time (sec): 26.04 - samples/sec: 1866.92 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:09:06,736 epoch 10 - iter 792/1984 - loss 0.00646607 - time (sec): 34.77 - samples/sec: 1863.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:09:15,340 epoch 10 - iter 990/1984 - loss 0.00650601 - time (sec): 43.38 - samples/sec: 1857.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:09:23,989 epoch 10 - iter 1188/1984 - loss 0.00670870 - time (sec): 52.03 - samples/sec: 1873.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:09:32,785 epoch 10 - iter 1386/1984 - loss 0.00698514 - time (sec): 60.82 - samples/sec: 1884.69 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:09:41,466 epoch 10 - iter 1584/1984 - loss 0.00711906 - time (sec): 69.50 - samples/sec: 1886.06 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:09:50,532 epoch 10 - iter 1782/1984 - loss 0.00702130 - time (sec): 78.57 - samples/sec: 1870.40 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:09:59,828 epoch 10 - iter 1980/1984 - loss 0.00764656 - time (sec): 87.86 - samples/sec: 1863.19 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:10:00,016 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:00,016 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-17 11:10:03,438 DEV : loss 0.2601605951786041 - f1-score (micro avg) 0.7684
2023-10-17 11:10:03,459 saving best model
2023-10-17 11:10:04,382 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:04,383 Loading model from best epoch ...
2023-10-17 11:10:05,772 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 11:10:08,979
Results:
- F-score (micro) 0.7792
- F-score (macro) 0.6974
- Accuracy 0.6658

By class:
              precision    recall  f1-score   support

         LOC     0.8303    0.8443    0.8372       655
         PER     0.7028    0.7848    0.7415       223
         ORG     0.6000    0.4488    0.5135       127

   micro avg     0.7772    0.7811    0.7792      1005
   macro avg     0.7110    0.6926    0.6974      1005
weighted avg     0.7729    0.7811    0.7751      1005
2023-10-17 11:10:08,979 ----------------------------------------------------------------------------------------------------