2023-10-17 10:54:20,438 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,439 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:54:20,439 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Train: 7936 sentences
2023-10-17 10:54:20,440 (train_with_dev=False, train_with_test=False)
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Training Params:
2023-10-17 10:54:20,440 - learning_rate: "5e-05"
2023-10-17 10:54:20,440 - mini_batch_size: "4"
2023-10-17 10:54:20,440 - max_epochs: "10"
2023-10-17 10:54:20,440 - shuffle: "True"
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Plugins:
2023-10-17 10:54:20,440 - TensorboardLogger
2023-10-17 10:54:20,440 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:54:20,440 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Computation:
2023-10-17 10:54:20,440 - compute on device: cuda:0
2023-10-17 10:54:20,440 - embedding storage: none
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 ----------------------------------------------------------------------------------------------------
2023-10-17 10:54:20,440 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:54:28,907 epoch 1 - iter 198/1984 - loss 1.62211158 - time (sec): 8.47 - samples/sec: 1911.31 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:54:37,933 epoch 1 - iter 396/1984 - loss 0.94619822 - time (sec): 17.49 - samples/sec: 1900.28 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:54:47,160 epoch 1 - iter 594/1984 - loss 0.70711697 - time (sec): 26.72 - samples/sec: 1857.08 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:56,036 epoch 1 - iter 792/1984 - loss 0.58185081 - time (sec): 35.59 - samples/sec: 1837.14 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:55:04,916 epoch 1 - iter 990/1984 - loss 0.49792196 - time (sec): 44.47 - samples/sec: 1840.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:55:14,026 epoch 1 - iter 1188/1984 - loss 0.43844628 - time (sec): 53.58 - samples/sec: 1840.99 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:55:22,978 epoch 1 - iter 1386/1984 - loss 0.39736182 - time (sec): 62.54 - samples/sec: 1835.63 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:55:32,299 epoch 1 - iter 1584/1984 - loss 0.36824360 - time (sec): 71.86 - samples/sec: 1827.19 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:55:41,543 epoch 1 - iter 1782/1984 - loss 0.34485394 - time (sec): 81.10 - samples/sec: 1815.57 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:55:50,632 epoch 1 - iter 1980/1984 - loss 0.33011193 - time (sec): 90.19 - samples/sec: 1814.56 - lr: 0.000050 - momentum: 0.000000
2023-10-17 10:55:50,815 ----------------------------------------------------------------------------------------------------
2023-10-17 10:55:50,815 EPOCH 1 done: loss 0.3305 - lr: 0.000050
2023-10-17 10:55:53,856 DEV : loss 0.3863277733325958 - f1-score (micro avg) 0.004
2023-10-17 10:55:53,876 saving best model
2023-10-17 10:55:54,244 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:03,257 epoch 2 - iter 198/1984 - loss 0.16096443 - time (sec): 9.01 - samples/sec: 1876.10 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:56:11,979 epoch 2 - iter 396/1984 - loss 0.14525930 - time (sec): 17.73 - samples/sec: 1867.53 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:56:20,750 epoch 2 - iter 594/1984 - loss 0.14335165 - time (sec): 26.50 - samples/sec: 1872.28 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:56:29,643 epoch 2 - iter 792/1984 - loss 0.13927613 - time (sec): 35.40 - samples/sec: 1877.08 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:56:38,576 epoch 2 - iter 990/1984 - loss 0.13757993 - time (sec): 44.33 - samples/sec: 1862.58 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:56:47,747 epoch 2 - iter 1188/1984 - loss 0.13290183 - time (sec): 53.50 - samples/sec: 1856.82 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:56:56,765 epoch 2 - iter 1386/1984 - loss 0.13087811 - time (sec): 62.52 - samples/sec: 1854.94 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:57:05,839 epoch 2 - iter 1584/1984 - loss 0.13141431 - time (sec): 71.59 - samples/sec: 1839.27 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:57:14,783 epoch 2 - iter 1782/1984 - loss 0.13018538 - time (sec): 80.54 - samples/sec: 1828.57 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:57:23,835 epoch 2 - iter 1980/1984 - loss 0.12957544 - time (sec): 89.59 - samples/sec: 1826.70 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:57:24,015 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:24,015 EPOCH 2 done: loss 0.1296 - lr: 0.000044
2023-10-17 10:57:27,871 DEV : loss 0.10983909666538239 - f1-score (micro avg) 0.7266
2023-10-17 10:57:27,892 saving best model
2023-10-17 10:57:28,372 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:37,508 epoch 3 - iter 198/1984 - loss 0.09127543 - time (sec): 9.13 - samples/sec: 1814.20 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:57:46,635 epoch 3 - iter 396/1984 - loss 0.09356135 - time (sec): 18.26 - samples/sec: 1799.61 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:57:55,783 epoch 3 - iter 594/1984 - loss 0.09391553 - time (sec): 27.41 - samples/sec: 1810.02 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:58:04,831 epoch 3 - iter 792/1984 - loss 0.09087245 - time (sec): 36.46 - samples/sec: 1821.93 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:58:13,828 epoch 3 - iter 990/1984 - loss 0.09252505 - time (sec): 45.45 - samples/sec: 1819.54 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:58:22,910 epoch 3 - iter 1188/1984 - loss 0.09198370 - time (sec): 54.54 - samples/sec: 1805.55 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:58:31,977 epoch 3 - iter 1386/1984 - loss 0.09210720 - time (sec): 63.60 - samples/sec: 1804.93 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:58:41,054 epoch 3 - iter 1584/1984 - loss 0.09115675 - time (sec): 72.68 - samples/sec: 1804.76 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:58:50,159 epoch 3 - iter 1782/1984 - loss 0.09117253 - time (sec): 81.79 - samples/sec: 1808.36 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:58:59,410 epoch 3 - iter 1980/1984 - loss 0.09250784 - time (sec): 91.04 - samples/sec: 1797.18 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:58:59,604 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:59,604 EPOCH 3 done: loss 0.0927 - lr: 0.000039
2023-10-17 10:59:03,046 DEV : loss 0.11815514415502548 - f1-score (micro avg) 0.7673
2023-10-17 10:59:03,068 saving best model
2023-10-17 10:59:03,547 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:12,752 epoch 4 - iter 198/1984 - loss 0.06981980 - time (sec): 9.20 - samples/sec: 1783.67 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:59:22,063 epoch 4 - iter 396/1984 - loss 0.07072403 - time (sec): 18.51 - samples/sec: 1844.43 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:59:31,256 epoch 4 - iter 594/1984 - loss 0.07503217 - time (sec): 27.71 - samples/sec: 1838.25 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:59:40,298 epoch 4 - iter 792/1984 - loss 0.07356156 - time (sec): 36.75 - samples/sec: 1835.12 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:59:49,285 epoch 4 - iter 990/1984 - loss 0.07317536 - time (sec): 45.74 - samples/sec: 1829.88 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:59:58,054 epoch 4 - iter 1188/1984 - loss 0.07519900 - time (sec): 54.51 - samples/sec: 1822.01 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:00:06,621 epoch 4 - iter 1386/1984 - loss 0.07351265 - time (sec): 63.07 - samples/sec: 1822.06 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:00:15,739 epoch 4 - iter 1584/1984 - loss 0.07234519 - time (sec): 72.19 - samples/sec: 1814.86 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:00:24,737 epoch 4 - iter 1782/1984 - loss 0.07317391 - time (sec): 81.19 - samples/sec: 1819.38 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:00:33,691 epoch 4 - iter 1980/1984 - loss 0.07222348 - time (sec): 90.14 - samples/sec: 1816.69 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:00:33,875 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:33,875 EPOCH 4 done: loss 0.0722 - lr: 0.000033
2023-10-17 11:00:37,278 DEV : loss 0.15266923606395721 - f1-score (micro avg) 0.7663
2023-10-17 11:00:37,300 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:46,355 epoch 5 - iter 198/1984 - loss 0.05046205 - time (sec): 9.05 - samples/sec: 1795.87 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:00:55,385 epoch 5 - iter 396/1984 - loss 0.05247464 - time (sec): 18.08 - samples/sec: 1852.85 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:01:04,492 epoch 5 - iter 594/1984 - loss 0.05489266 - time (sec): 27.19 - samples/sec: 1846.51 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:01:13,843 epoch 5 - iter 792/1984 - loss 0.05563933 - time (sec): 36.54 - samples/sec: 1853.29 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:01:23,006 epoch 5 - iter 990/1984 - loss 0.05335869 - time (sec): 45.70 - samples/sec: 1841.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:01:32,134 epoch 5 - iter 1188/1984 - loss 0.05269990 - time (sec): 54.83 - samples/sec: 1823.48 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:01:41,271 epoch 5 - iter 1386/1984 - loss 0.05371472 - time (sec): 63.97 - samples/sec: 1819.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:01:50,241 epoch 5 - iter 1584/1984 - loss 0.05415103 - time (sec): 72.94 - samples/sec: 1812.74 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:01:59,308 epoch 5 - iter 1782/1984 - loss 0.05436597 - time (sec): 82.01 - samples/sec: 1807.24 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:02:08,369 epoch 5 - iter 1980/1984 - loss 0.05389026 - time (sec): 91.07 - samples/sec: 1796.79 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:02:08,550 ----------------------------------------------------------------------------------------------------
2023-10-17 11:02:08,551 EPOCH 5 done: loss 0.0539 - lr: 0.000028
2023-10-17 11:02:11,912 DEV : loss 0.19919680058956146 - f1-score (micro avg) 0.7507
2023-10-17 11:02:11,932 ----------------------------------------------------------------------------------------------------
2023-10-17 11:02:21,160 epoch 6 - iter 198/1984 - loss 0.04620617 - time (sec): 9.23 - samples/sec: 1782.45 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:02:30,334 epoch 6 - iter 396/1984 - loss 0.04843778 - time (sec): 18.40 - samples/sec: 1774.97 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:02:39,513 epoch 6 - iter 594/1984 - loss 0.04484302 - time (sec): 27.58 - samples/sec: 1822.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:02:48,566 epoch 6 - iter 792/1984 - loss 0.04250884 - time (sec): 36.63 - samples/sec: 1818.99 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:02:57,657 epoch 6 - iter 990/1984 - loss 0.04152118 - time (sec): 45.72 - samples/sec: 1824.08 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:03:06,709 epoch 6 - iter 1188/1984 - loss 0.04219659 - time (sec): 54.78 - samples/sec: 1820.23 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:03:15,840 epoch 6 - iter 1386/1984 - loss 0.04406527 - time (sec): 63.91 - samples/sec: 1804.54 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:03:25,050 epoch 6 - iter 1584/1984 - loss 0.04345490 - time (sec): 73.12 - samples/sec: 1793.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:03:34,090 epoch 6 - iter 1782/1984 - loss 0.04365630 - time (sec): 82.16 - samples/sec: 1792.95 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:03:43,312 epoch 6 - iter 1980/1984 - loss 0.04303766 - time (sec): 91.38 - samples/sec: 1790.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:03:43,494 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:43,494 EPOCH 6 done: loss 0.0429 - lr: 0.000022
2023-10-17 11:03:46,982 DEV : loss 0.19526612758636475 - f1-score (micro avg) 0.759
2023-10-17 11:03:47,005 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:56,272 epoch 7 - iter 198/1984 - loss 0.02150771 - time (sec): 9.27 - samples/sec: 1758.81 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:04:05,509 epoch 7 - iter 396/1984 - loss 0.02251021 - time (sec): 18.50 - samples/sec: 1773.66 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:04:14,821 epoch 7 - iter 594/1984 - loss 0.02431276 - time (sec): 27.82 - samples/sec: 1781.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:04:23,989 epoch 7 - iter 792/1984 - loss 0.02510360 - time (sec): 36.98 - samples/sec: 1775.37 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:04:33,324 epoch 7 - iter 990/1984 - loss 0.02564549 - time (sec): 46.32 - samples/sec: 1769.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:04:42,517 epoch 7 - iter 1188/1984 - loss 0.02559000 - time (sec): 55.51 - samples/sec: 1773.24 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:04:51,705 epoch 7 - iter 1386/1984 - loss 0.02670608 - time (sec): 64.70 - samples/sec: 1773.77 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:05:00,894 epoch 7 - iter 1584/1984 - loss 0.02761748 - time (sec): 73.89 - samples/sec: 1763.66 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:05:10,013 epoch 7 - iter 1782/1984 - loss 0.02914139 - time (sec): 83.01 - samples/sec: 1776.01 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:05:19,040 epoch 7 - iter 1980/1984 - loss 0.02930128 - time (sec): 92.03 - samples/sec: 1778.63 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:05:19,223 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:19,223 EPOCH 7 done: loss 0.0292 - lr: 0.000017
2023-10-17 11:05:23,023 DEV : loss 0.2106274515390396 - f1-score (micro avg) 0.7652
2023-10-17 11:05:23,044 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:32,176 epoch 8 - iter 198/1984 - loss 0.01532135 - time (sec): 9.13 - samples/sec: 1792.67 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:05:41,290 epoch 8 - iter 396/1984 - loss 0.01844700 - time (sec): 18.24 - samples/sec: 1775.71 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:05:50,631 epoch 8 - iter 594/1984 - loss 0.01680914 - time (sec): 27.59 - samples/sec: 1813.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:05:59,572 epoch 8 - iter 792/1984 - loss 0.01532722 - time (sec): 36.53 - samples/sec: 1809.50 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:06:08,828 epoch 8 - iter 990/1984 - loss 0.01651685 - time (sec): 45.78 - samples/sec: 1817.17 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:06:17,849 epoch 8 - iter 1188/1984 - loss 0.01693879 - time (sec): 54.80 - samples/sec: 1823.06 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:06:26,896 epoch 8 - iter 1386/1984 - loss 0.01791079 - time (sec): 63.85 - samples/sec: 1810.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:06:35,832 epoch 8 - iter 1584/1984 - loss 0.01862913 - time (sec): 72.79 - samples/sec: 1794.24 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:06:44,860 epoch 8 - iter 1782/1984 - loss 0.01920153 - time (sec): 81.81 - samples/sec: 1799.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:06:53,979 epoch 8 - iter 1980/1984 - loss 0.01890437 - time (sec): 90.93 - samples/sec: 1799.51 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:06:54,163 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:54,163 EPOCH 8 done: loss 0.0189 - lr: 0.000011
2023-10-17 11:06:57,535 DEV : loss 0.25147587060928345 - f1-score (micro avg) 0.7642
2023-10-17 11:06:57,556 ----------------------------------------------------------------------------------------------------
2023-10-17 11:07:06,501 epoch 9 - iter 198/1984 - loss 0.01800551 - time (sec): 8.94 - samples/sec: 1771.15 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:07:15,377 epoch 9 - iter 396/1984 - loss 0.01378590 - time (sec): 17.82 - samples/sec: 1863.42 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:07:24,608 epoch 9 - iter 594/1984 - loss 0.01314436 - time (sec): 27.05 - samples/sec: 1863.71 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:07:33,694 epoch 9 - iter 792/1984 - loss 0.01248684 - time (sec): 36.14 - samples/sec: 1845.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:07:42,786 epoch 9 - iter 990/1984 - loss 0.01186486 - time (sec): 45.23 - samples/sec: 1825.45 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:07:51,933 epoch 9 - iter 1188/1984 - loss 0.01172450 - time (sec): 54.38 - samples/sec: 1815.57 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:08:01,131 epoch 9 - iter 1386/1984 - loss 0.01171225 - time (sec): 63.57 - samples/sec: 1805.34 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:08:10,295 epoch 9 - iter 1584/1984 - loss 0.01234481 - time (sec): 72.74 - samples/sec: 1808.45 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:08:19,267 epoch 9 - iter 1782/1984 - loss 0.01340571 - time (sec): 81.71 - samples/sec: 1807.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:08:28,340 epoch 9 - iter 1980/1984 - loss 0.01367155 - time (sec): 90.78 - samples/sec: 1803.06 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:08:28,512 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:28,512 EPOCH 9 done: loss 0.0136 - lr: 0.000006
2023-10-17 11:08:31,940 DEV : loss 0.23887786269187927 - f1-score (micro avg) 0.7643
2023-10-17 11:08:31,962 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:40,850 epoch 10 - iter 198/1984 - loss 0.00470077 - time (sec): 8.89 - samples/sec: 1858.08 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:08:49,392 epoch 10 - iter 396/1984 - loss 0.00422202 - time (sec): 17.43 - samples/sec: 1844.72 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:08:58,007 epoch 10 - iter 594/1984 - loss 0.00543694 - time (sec): 26.04 - samples/sec: 1866.92 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:09:06,736 epoch 10 - iter 792/1984 - loss 0.00646607 - time (sec): 34.77 - samples/sec: 1863.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:09:15,340 epoch 10 - iter 990/1984 - loss 0.00650601 - time (sec): 43.38 - samples/sec: 1857.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:09:23,989 epoch 10 - iter 1188/1984 - loss 0.00670870 - time (sec): 52.03 - samples/sec: 1873.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:09:32,785 epoch 10 - iter 1386/1984 - loss 0.00698514 - time (sec): 60.82 - samples/sec: 1884.69 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:09:41,466 epoch 10 - iter 1584/1984 - loss 0.00711906 - time (sec): 69.50 - samples/sec: 1886.06 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:09:50,532 epoch 10 - iter 1782/1984 - loss 0.00702130 - time (sec): 78.57 - samples/sec: 1870.40 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:09:59,828 epoch 10 - iter 1980/1984 - loss 0.00764656 - time (sec): 87.86 - samples/sec: 1863.19 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:10:00,016 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:00,016 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-17 11:10:03,438 DEV : loss 0.2601605951786041 - f1-score (micro avg) 0.7684
2023-10-17 11:10:03,459 saving best model
2023-10-17 11:10:04,382 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:04,383 Loading model from best epoch ...
2023-10-17 11:10:05,772 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 11:10:08,979
Results:
- F-score (micro) 0.7792
- F-score (macro) 0.6974
- Accuracy 0.6658

By class:
              precision    recall  f1-score   support

         LOC     0.8303    0.8443    0.8372       655
         PER     0.7028    0.7848    0.7415       223
         ORG     0.6000    0.4488    0.5135       127

   micro avg     0.7772    0.7811    0.7792      1005
   macro avg     0.7110    0.6926    0.6974      1005
weighted avg     0.7729    0.7811    0.7751      1005

2023-10-17 11:10:08,979 ----------------------------------------------------------------------------------------------------