2023-10-17 09:54:13,867 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,868 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:54:13,868 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 Train: 1214 sentences
2023-10-17 09:54:13,869 (train_with_dev=False, train_with_test=False)
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 Training Params:
2023-10-17 09:54:13,869 - learning_rate: "5e-05"
2023-10-17 09:54:13,869 - mini_batch_size: "4"
2023-10-17 09:54:13,869 - max_epochs: "10"
2023-10-17 09:54:13,869 - shuffle: "True"
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 Plugins:
2023-10-17 09:54:13,869 - TensorboardLogger
2023-10-17 09:54:13,869 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:54:13,869 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 Computation:
2023-10-17 09:54:13,869 - compute on device: cuda:0
2023-10-17 09:54:13,869 - embedding storage: none
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 Model training base path: "hmbench-ajmc/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:13,869 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:54:15,461 epoch 1 - iter 30/304 - loss 3.92541070 - time (sec): 1.59 - samples/sec: 1953.57 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:54:17,033 epoch 1 - iter 60/304 - loss 2.83002534 - time (sec): 3.16 - samples/sec: 1982.33 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:54:18,613 epoch 1 - iter 90/304 - loss 2.09110440 - time (sec): 4.74 - samples/sec: 1977.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:54:20,180 epoch 1 - iter 120/304 - loss 1.68334725 - time (sec): 6.31 - samples/sec: 1977.51 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:54:21,832 epoch 1 - iter 150/304 - loss 1.43709984 - time (sec): 7.96 - samples/sec: 1960.01 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:54:23,389 epoch 1 - iter 180/304 - loss 1.26564420 - time (sec): 9.52 - samples/sec: 1942.99 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:54:24,929 epoch 1 - iter 210/304 - loss 1.12056567 - time (sec): 11.06 - samples/sec: 1939.18 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:54:26,530 epoch 1 - iter 240/304 - loss 1.02447655 - time (sec): 12.66 - samples/sec: 1927.22 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:54:28,082 epoch 1 - iter 270/304 - loss 0.93337178 - time (sec): 14.21 - samples/sec: 1937.90 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:54:29,676 epoch 1 - iter 300/304 - loss 0.85762804 - time (sec): 15.81 - samples/sec: 1937.78 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:54:29,873 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:29,873 EPOCH 1 done: loss 0.8498 - lr: 0.000049
2023-10-17 09:54:30,898 DEV : loss 0.22860193252563477 - f1-score (micro avg) 0.6309
2023-10-17 09:54:30,907 saving best model
2023-10-17 09:54:31,343 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:32,716 epoch 2 - iter 30/304 - loss 0.17256119 - time (sec): 1.37 - samples/sec: 2189.96 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:54:34,134 epoch 2 - iter 60/304 - loss 0.17606219 - time (sec): 2.79 - samples/sec: 2176.91 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:54:35,573 epoch 2 - iter 90/304 - loss 0.16356435 - time (sec): 4.23 - samples/sec: 2168.56 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:54:37,179 epoch 2 - iter 120/304 - loss 0.15726544 - time (sec): 5.84 - samples/sec: 2064.08 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:54:38,816 epoch 2 - iter 150/304 - loss 0.15305299 - time (sec): 7.47 - samples/sec: 2045.50 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:54:40,335 epoch 2 - iter 180/304 - loss 0.14349988 - time (sec): 8.99 - samples/sec: 2048.80 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:54:41,745 epoch 2 - iter 210/304 - loss 0.14441468 - time (sec): 10.40 - samples/sec: 2073.44 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:54:43,222 epoch 2 - iter 240/304 - loss 0.13509089 - time (sec): 11.88 - samples/sec: 2071.61 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:54:44,816 epoch 2 - iter 270/304 - loss 0.13165195 - time (sec): 13.47 - samples/sec: 2051.25 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:54:46,437 epoch 2 - iter 300/304 - loss 0.13298263 - time (sec): 15.09 - samples/sec: 2031.22 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:54:46,656 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:46,656 EPOCH 2 done: loss 0.1329 - lr: 0.000045
2023-10-17 09:54:47,681 DEV : loss 0.16872060298919678 - f1-score (micro avg) 0.7932
2023-10-17 09:54:47,690 saving best model
2023-10-17 09:54:48,238 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:49,636 epoch 3 - iter 30/304 - loss 0.05274848 - time (sec): 1.40 - samples/sec: 2056.34 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:54:51,055 epoch 3 - iter 60/304 - loss 0.05982692 - time (sec): 2.81 - samples/sec: 2096.78 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:54:52,612 epoch 3 - iter 90/304 - loss 0.08228454 - time (sec): 4.37 - samples/sec: 2012.76 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:54:54,133 epoch 3 - iter 120/304 - loss 0.08042581 - time (sec): 5.89 - samples/sec: 2033.82 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:54:55,719 epoch 3 - iter 150/304 - loss 0.08178362 - time (sec): 7.48 - samples/sec: 2028.63 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:54:57,300 epoch 3 - iter 180/304 - loss 0.07941139 - time (sec): 9.06 - samples/sec: 2005.37 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:54:58,925 epoch 3 - iter 210/304 - loss 0.08265085 - time (sec): 10.69 - samples/sec: 1988.99 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:55:00,536 epoch 3 - iter 240/304 - loss 0.08573224 - time (sec): 12.30 - samples/sec: 1982.40 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:55:02,133 epoch 3 - iter 270/304 - loss 0.08997078 - time (sec): 13.89 - samples/sec: 1983.58 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:55:03,763 epoch 3 - iter 300/304 - loss 0.09055316 - time (sec): 15.52 - samples/sec: 1973.09 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:55:03,972 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:03,972 EPOCH 3 done: loss 0.0900 - lr: 0.000039
2023-10-17 09:55:04,903 DEV : loss 0.15869022905826569 - f1-score (micro avg) 0.8223
2023-10-17 09:55:04,910 saving best model
2023-10-17 09:55:05,438 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:07,051 epoch 4 - iter 30/304 - loss 0.22243641 - time (sec): 1.61 - samples/sec: 1882.75 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:55:08,660 epoch 4 - iter 60/304 - loss 0.18033808 - time (sec): 3.22 - samples/sec: 1820.17 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:55:10,291 epoch 4 - iter 90/304 - loss 0.14561604 - time (sec): 4.85 - samples/sec: 1879.90 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:55:11,877 epoch 4 - iter 120/304 - loss 0.11864431 - time (sec): 6.44 - samples/sec: 1918.44 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:55:13,540 epoch 4 - iter 150/304 - loss 0.10256781 - time (sec): 8.10 - samples/sec: 1903.01 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:55:15,143 epoch 4 - iter 180/304 - loss 0.09530623 - time (sec): 9.70 - samples/sec: 1894.36 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:55:16,608 epoch 4 - iter 210/304 - loss 0.09260037 - time (sec): 11.17 - samples/sec: 1937.91 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:55:18,006 epoch 4 - iter 240/304 - loss 0.08909074 - time (sec): 12.57 - samples/sec: 1956.66 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:55:19,439 epoch 4 - iter 270/304 - loss 0.08554726 - time (sec): 14.00 - samples/sec: 1961.12 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:55:20,866 epoch 4 - iter 300/304 - loss 0.08568790 - time (sec): 15.43 - samples/sec: 1983.06 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:55:21,051 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:21,051 EPOCH 4 done: loss 0.0880 - lr: 0.000033
2023-10-17 09:55:21,991 DEV : loss 0.1595626175403595 - f1-score (micro avg) 0.8343
2023-10-17 09:55:21,998 saving best model
2023-10-17 09:55:22,546 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:24,058 epoch 5 - iter 30/304 - loss 0.03697545 - time (sec): 1.51 - samples/sec: 1901.02 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:55:25,463 epoch 5 - iter 60/304 - loss 0.03777380 - time (sec): 2.92 - samples/sec: 2122.32 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:55:26,930 epoch 5 - iter 90/304 - loss 0.04110514 - time (sec): 4.38 - samples/sec: 2116.43 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:55:28,524 epoch 5 - iter 120/304 - loss 0.03961826 - time (sec): 5.98 - samples/sec: 2026.55 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:55:30,130 epoch 5 - iter 150/304 - loss 0.03751316 - time (sec): 7.58 - samples/sec: 2000.79 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:55:31,722 epoch 5 - iter 180/304 - loss 0.03548953 - time (sec): 9.17 - samples/sec: 1999.72 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:55:33,330 epoch 5 - iter 210/304 - loss 0.03902609 - time (sec): 10.78 - samples/sec: 1993.15 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:55:34,924 epoch 5 - iter 240/304 - loss 0.04171248 - time (sec): 12.38 - samples/sec: 1983.58 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:55:36,293 epoch 5 - iter 270/304 - loss 0.04498475 - time (sec): 13.75 - samples/sec: 2006.22 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:55:37,674 epoch 5 - iter 300/304 - loss 0.04539674 - time (sec): 15.13 - samples/sec: 2029.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:55:37,866 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:37,867 EPOCH 5 done: loss 0.0451 - lr: 0.000028
2023-10-17 09:55:38,795 DEV : loss 0.20325231552124023 - f1-score (micro avg) 0.8329
2023-10-17 09:55:38,802 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:40,200 epoch 6 - iter 30/304 - loss 0.02360420 - time (sec): 1.40 - samples/sec: 2411.55 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:55:41,733 epoch 6 - iter 60/304 - loss 0.05298460 - time (sec): 2.93 - samples/sec: 2193.31 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:55:43,186 epoch 6 - iter 90/304 - loss 0.04447507 - time (sec): 4.38 - samples/sec: 2108.37 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:55:44,803 epoch 6 - iter 120/304 - loss 0.04234440 - time (sec): 6.00 - samples/sec: 1974.90 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:55:46,379 epoch 6 - iter 150/304 - loss 0.04134043 - time (sec): 7.58 - samples/sec: 1975.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:55:47,935 epoch 6 - iter 180/304 - loss 0.03644266 - time (sec): 9.13 - samples/sec: 1981.53 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:55:49,333 epoch 6 - iter 210/304 - loss 0.03483795 - time (sec): 10.53 - samples/sec: 2026.41 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:55:50,701 epoch 6 - iter 240/304 - loss 0.03394206 - time (sec): 11.90 - samples/sec: 2018.44 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:55:52,129 epoch 6 - iter 270/304 - loss 0.03112237 - time (sec): 13.33 - samples/sec: 2054.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:55:53,532 epoch 6 - iter 300/304 - loss 0.03255731 - time (sec): 14.73 - samples/sec: 2074.43 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:55:53,717 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:53,717 EPOCH 6 done: loss 0.0331 - lr: 0.000022
2023-10-17 09:55:54,653 DEV : loss 0.19660867750644684 - f1-score (micro avg) 0.8392
2023-10-17 09:55:54,661 saving best model
2023-10-17 09:55:55,147 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:56,599 epoch 7 - iter 30/304 - loss 0.00971525 - time (sec): 1.45 - samples/sec: 2108.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:55:58,035 epoch 7 - iter 60/304 - loss 0.01823157 - time (sec): 2.88 - samples/sec: 2169.76 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:55:59,480 epoch 7 - iter 90/304 - loss 0.01503166 - time (sec): 4.33 - samples/sec: 2160.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:56:00,863 epoch 7 - iter 120/304 - loss 0.01826492 - time (sec): 5.71 - samples/sec: 2216.00 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:56:02,265 epoch 7 - iter 150/304 - loss 0.01729233 - time (sec): 7.11 - samples/sec: 2199.37 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:56:03,632 epoch 7 - iter 180/304 - loss 0.01866288 - time (sec): 8.48 - samples/sec: 2196.68 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:56:05,010 epoch 7 - iter 210/304 - loss 0.02159675 - time (sec): 9.86 - samples/sec: 2192.19 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:56:06,402 epoch 7 - iter 240/304 - loss 0.02325090 - time (sec): 11.25 - samples/sec: 2186.48 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:56:07,770 epoch 7 - iter 270/304 - loss 0.02334865 - time (sec): 12.62 - samples/sec: 2180.37 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:56:09,171 epoch 7 - iter 300/304 - loss 0.02307807 - time (sec): 14.02 - samples/sec: 2188.50 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:56:09,355 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:09,355 EPOCH 7 done: loss 0.0234 - lr: 0.000017
2023-10-17 09:56:10,274 DEV : loss 0.2072305530309677 - f1-score (micro avg) 0.84
2023-10-17 09:56:10,281 saving best model
2023-10-17 09:56:10,790 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:12,174 epoch 8 - iter 30/304 - loss 0.00679041 - time (sec): 1.38 - samples/sec: 2031.26 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:56:13,554 epoch 8 - iter 60/304 - loss 0.00994703 - time (sec): 2.76 - samples/sec: 2114.09 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:56:14,957 epoch 8 - iter 90/304 - loss 0.02010303 - time (sec): 4.17 - samples/sec: 2187.26 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:56:16,378 epoch 8 - iter 120/304 - loss 0.02061087 - time (sec): 5.59 - samples/sec: 2120.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:56:17,788 epoch 8 - iter 150/304 - loss 0.02246354 - time (sec): 7.00 - samples/sec: 2136.88 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:56:19,187 epoch 8 - iter 180/304 - loss 0.02369727 - time (sec): 8.40 - samples/sec: 2144.79 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:56:20,510 epoch 8 - iter 210/304 - loss 0.02258358 - time (sec): 9.72 - samples/sec: 2187.46 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:56:21,845 epoch 8 - iter 240/304 - loss 0.02108637 - time (sec): 11.05 - samples/sec: 2202.68 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:56:23,173 epoch 8 - iter 270/304 - loss 0.02132594 - time (sec): 12.38 - samples/sec: 2201.90 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:56:24,603 epoch 8 - iter 300/304 - loss 0.02121622 - time (sec): 13.81 - samples/sec: 2216.72 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:56:24,780 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:24,780 EPOCH 8 done: loss 0.0210 - lr: 0.000011
2023-10-17 09:56:25,723 DEV : loss 0.21292492747306824 - f1-score (micro avg) 0.848
2023-10-17 09:56:25,730 saving best model
2023-10-17 09:56:26,197 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:27,668 epoch 9 - iter 30/304 - loss 0.01587622 - time (sec): 1.47 - samples/sec: 2101.64 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:56:29,102 epoch 9 - iter 60/304 - loss 0.01070887 - time (sec): 2.90 - samples/sec: 2050.10 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:56:30,478 epoch 9 - iter 90/304 - loss 0.01496039 - time (sec): 4.28 - samples/sec: 2073.06 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:56:31,887 epoch 9 - iter 120/304 - loss 0.01124426 - time (sec): 5.69 - samples/sec: 2077.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:56:33,312 epoch 9 - iter 150/304 - loss 0.01410474 - time (sec): 7.11 - samples/sec: 2086.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:56:34,795 epoch 9 - iter 180/304 - loss 0.01244996 - time (sec): 8.60 - samples/sec: 2092.32 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:56:36,264 epoch 9 - iter 210/304 - loss 0.01248262 - time (sec): 10.06 - samples/sec: 2091.37 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:56:37,754 epoch 9 - iter 240/304 - loss 0.01213209 - time (sec): 11.55 - samples/sec: 2102.23 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:56:39,170 epoch 9 - iter 270/304 - loss 0.01353100 - time (sec): 12.97 - samples/sec: 2131.95 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:56:40,608 epoch 9 - iter 300/304 - loss 0.01378414 - time (sec): 14.41 - samples/sec: 2124.72 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:56:40,812 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:40,812 EPOCH 9 done: loss 0.0136 - lr: 0.000006
2023-10-17 09:56:41,731 DEV : loss 0.2044668346643448 - f1-score (micro avg) 0.8541
2023-10-17 09:56:41,738 saving best model
2023-10-17 09:56:42,227 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:43,715 epoch 10 - iter 30/304 - loss 0.01354305 - time (sec): 1.48 - samples/sec: 2072.22 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:56:45,124 epoch 10 - iter 60/304 - loss 0.00751199 - time (sec): 2.89 - samples/sec: 2071.21 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:56:46,546 epoch 10 - iter 90/304 - loss 0.00928881 - time (sec): 4.31 - samples/sec: 2161.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:56:47,844 epoch 10 - iter 120/304 - loss 0.01290944 - time (sec): 5.61 - samples/sec: 2182.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:56:49,142 epoch 10 - iter 150/304 - loss 0.01173024 - time (sec): 6.91 - samples/sec: 2205.14 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:56:50,445 epoch 10 - iter 180/304 - loss 0.01028826 - time (sec): 8.21 - samples/sec: 2220.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:56:51,743 epoch 10 - iter 210/304 - loss 0.00959073 - time (sec): 9.51 - samples/sec: 2213.19 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:56:53,047 epoch 10 - iter 240/304 - loss 0.00963868 - time (sec): 10.82 - samples/sec: 2205.85 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:56:54,380 epoch 10 - iter 270/304 - loss 0.00918610 - time (sec): 12.15 - samples/sec: 2248.93 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:56:55,708 epoch 10 - iter 300/304 - loss 0.01027470 - time (sec): 13.48 - samples/sec: 2269.04 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:56:55,887 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:55,887 EPOCH 10 done: loss 0.0101 - lr: 0.000000
2023-10-17 09:56:56,858 DEV : loss 0.21448352932929993 - f1-score (micro avg) 0.8595
2023-10-17 09:56:56,871 saving best model
2023-10-17 09:56:57,685 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:57,687 Loading model from best epoch ...
2023-10-17 09:56:59,606 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-17 09:57:00,722
Results:
- F-score (micro) 0.8425
- F-score (macro) 0.7312
- Accuracy 0.7332

By class:
              precision    recall  f1-score   support

       scope     0.7812    0.8278    0.8039       151
        work     0.7944    0.8947    0.8416        95
        pers     0.8932    0.9583    0.9246        96
        date     0.2500    0.3333    0.2857         3
         loc     1.0000    0.6667    0.8000         3

   micro avg     0.8112    0.8764    0.8425       348
   macro avg     0.7438    0.7362    0.7312       348
weighted avg     0.8130    0.8764    0.8430       348

2023-10-17 09:57:00,722 ----------------------------------------------------------------------------------------------------