2023-10-17 15:58:59,059 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,061 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:58:59,061 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,062 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 15:58:59,062 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,062 Train: 3575 sentences
2023-10-17 15:58:59,062 (train_with_dev=False, train_with_test=False)
2023-10-17 15:58:59,062 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,062 Training Params:
2023-10-17 15:58:59,062 - learning_rate: "3e-05"
2023-10-17 15:58:59,062 - mini_batch_size: "8"
2023-10-17 15:58:59,062 - max_epochs: "10"
2023-10-17 15:58:59,062 - shuffle: "True"
2023-10-17 15:58:59,062 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,062 Plugins:
2023-10-17 15:58:59,063 - TensorboardLogger
2023-10-17 15:58:59,063 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:58:59,063 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,063 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:58:59,063 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:58:59,063 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,063 Computation:
2023-10-17 15:58:59,063 - compute on device: cuda:0
2023-10-17 15:58:59,063 - embedding storage: none
2023-10-17 15:58:59,063 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,063 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 15:58:59,063 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,063 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:59,064 Logging anything other than scalars to TensorBoard
is currently not supported. 2023-10-17 15:59:03,268 epoch 1 - iter 44/447 - loss 3.75328637 - time (sec): 4.20 - samples/sec: 2112.58 - lr: 0.000003 - momentum: 0.000000 2023-10-17 15:59:07,512 epoch 1 - iter 88/447 - loss 2.79174559 - time (sec): 8.45 - samples/sec: 2030.17 - lr: 0.000006 - momentum: 0.000000 2023-10-17 15:59:11,524 epoch 1 - iter 132/447 - loss 2.08551841 - time (sec): 12.46 - samples/sec: 1987.11 - lr: 0.000009 - momentum: 0.000000 2023-10-17 15:59:16,031 epoch 1 - iter 176/447 - loss 1.63894890 - time (sec): 16.97 - samples/sec: 2016.62 - lr: 0.000012 - momentum: 0.000000 2023-10-17 15:59:20,141 epoch 1 - iter 220/447 - loss 1.39670438 - time (sec): 21.08 - samples/sec: 2016.11 - lr: 0.000015 - momentum: 0.000000 2023-10-17 15:59:24,433 epoch 1 - iter 264/447 - loss 1.22305966 - time (sec): 25.37 - samples/sec: 2016.49 - lr: 0.000018 - momentum: 0.000000 2023-10-17 15:59:28,479 epoch 1 - iter 308/447 - loss 1.10231203 - time (sec): 29.41 - samples/sec: 2005.73 - lr: 0.000021 - momentum: 0.000000 2023-10-17 15:59:32,764 epoch 1 - iter 352/447 - loss 1.00085629 - time (sec): 33.70 - samples/sec: 2007.65 - lr: 0.000024 - momentum: 0.000000 2023-10-17 15:59:37,492 epoch 1 - iter 396/447 - loss 0.91299761 - time (sec): 38.43 - samples/sec: 2004.60 - lr: 0.000027 - momentum: 0.000000 2023-10-17 15:59:41,758 epoch 1 - iter 440/447 - loss 0.84847471 - time (sec): 42.69 - samples/sec: 2001.55 - lr: 0.000029 - momentum: 0.000000 2023-10-17 15:59:42,398 ---------------------------------------------------------------------------------------------------- 2023-10-17 15:59:42,398 EPOCH 1 done: loss 0.8401 - lr: 0.000029 2023-10-17 15:59:49,299 DEV : loss 0.1743467152118683 - f1-score (micro avg) 0.5904 2023-10-17 15:59:49,362 saving best model 2023-10-17 15:59:50,000 ---------------------------------------------------------------------------------------------------- 2023-10-17 15:59:54,169 epoch 2 - iter 44/447 - loss 0.20202135 - time (sec): 4.17 - 
samples/sec: 2020.85 - lr: 0.000030 - momentum: 0.000000 2023-10-17 15:59:58,785 epoch 2 - iter 88/447 - loss 0.19013888 - time (sec): 8.78 - samples/sec: 2021.47 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:00:03,225 epoch 2 - iter 132/447 - loss 0.19608616 - time (sec): 13.22 - samples/sec: 1930.61 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:00:08,005 epoch 2 - iter 176/447 - loss 0.18596049 - time (sec): 18.00 - samples/sec: 1937.03 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:00:12,138 epoch 2 - iter 220/447 - loss 0.18035707 - time (sec): 22.13 - samples/sec: 1977.69 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:00:16,236 epoch 2 - iter 264/447 - loss 0.17461322 - time (sec): 26.23 - samples/sec: 1965.57 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:00:20,493 epoch 2 - iter 308/447 - loss 0.16847653 - time (sec): 30.49 - samples/sec: 1986.10 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:00:24,739 epoch 2 - iter 352/447 - loss 0.16607108 - time (sec): 34.74 - samples/sec: 1993.31 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:00:28,945 epoch 2 - iter 396/447 - loss 0.16349240 - time (sec): 38.94 - samples/sec: 1988.06 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:00:33,205 epoch 2 - iter 440/447 - loss 0.16613347 - time (sec): 43.20 - samples/sec: 1976.45 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:00:33,854 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:00:33,855 EPOCH 2 done: loss 0.1648 - lr: 0.000027 2023-10-17 16:00:45,420 DEV : loss 0.11944183707237244 - f1-score (micro avg) 0.7175 2023-10-17 16:00:45,481 saving best model 2023-10-17 16:00:46,934 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:00:51,110 epoch 3 - iter 44/447 - loss 0.08131650 - time (sec): 4.17 - samples/sec: 1891.39 - lr: 0.000026 - momentum: 0.000000 2023-10-17 16:00:55,405 epoch 3 - iter 88/447 - loss 0.08702455 
- time (sec): 8.47 - samples/sec: 1967.19 - lr: 0.000026 - momentum: 0.000000 2023-10-17 16:00:59,963 epoch 3 - iter 132/447 - loss 0.07972396 - time (sec): 13.02 - samples/sec: 1964.70 - lr: 0.000026 - momentum: 0.000000 2023-10-17 16:01:04,142 epoch 3 - iter 176/447 - loss 0.07975901 - time (sec): 17.20 - samples/sec: 1939.38 - lr: 0.000025 - momentum: 0.000000 2023-10-17 16:01:08,611 epoch 3 - iter 220/447 - loss 0.08247967 - time (sec): 21.67 - samples/sec: 1967.29 - lr: 0.000025 - momentum: 0.000000 2023-10-17 16:01:12,697 epoch 3 - iter 264/447 - loss 0.08537389 - time (sec): 25.76 - samples/sec: 1978.18 - lr: 0.000025 - momentum: 0.000000 2023-10-17 16:01:17,006 epoch 3 - iter 308/447 - loss 0.08411404 - time (sec): 30.07 - samples/sec: 1987.82 - lr: 0.000024 - momentum: 0.000000 2023-10-17 16:01:21,139 epoch 3 - iter 352/447 - loss 0.08280911 - time (sec): 34.20 - samples/sec: 1990.44 - lr: 0.000024 - momentum: 0.000000 2023-10-17 16:01:25,354 epoch 3 - iter 396/447 - loss 0.08456497 - time (sec): 38.42 - samples/sec: 1993.20 - lr: 0.000024 - momentum: 0.000000 2023-10-17 16:01:29,534 epoch 3 - iter 440/447 - loss 0.08638185 - time (sec): 42.60 - samples/sec: 1984.93 - lr: 0.000023 - momentum: 0.000000 2023-10-17 16:01:30,441 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:01:30,442 EPOCH 3 done: loss 0.0860 - lr: 0.000023 2023-10-17 16:01:41,799 DEV : loss 0.13438037037849426 - f1-score (micro avg) 0.7434 2023-10-17 16:01:41,855 saving best model 2023-10-17 16:01:43,314 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:01:47,660 epoch 4 - iter 44/447 - loss 0.05444230 - time (sec): 4.34 - samples/sec: 2164.26 - lr: 0.000023 - momentum: 0.000000 2023-10-17 16:01:52,647 epoch 4 - iter 88/447 - loss 0.05193996 - time (sec): 9.33 - samples/sec: 2018.23 - lr: 0.000023 - momentum: 0.000000 2023-10-17 16:01:57,154 epoch 4 - iter 
132/447 - loss 0.05306498 - time (sec): 13.83 - samples/sec: 1941.12 - lr: 0.000022 - momentum: 0.000000 2023-10-17 16:02:01,286 epoch 4 - iter 176/447 - loss 0.05252827 - time (sec): 17.97 - samples/sec: 1934.72 - lr: 0.000022 - momentum: 0.000000 2023-10-17 16:02:05,472 epoch 4 - iter 220/447 - loss 0.05261466 - time (sec): 22.15 - samples/sec: 1950.76 - lr: 0.000022 - momentum: 0.000000 2023-10-17 16:02:09,829 epoch 4 - iter 264/447 - loss 0.05182422 - time (sec): 26.51 - samples/sec: 1953.80 - lr: 0.000021 - momentum: 0.000000 2023-10-17 16:02:13,882 epoch 4 - iter 308/447 - loss 0.05086636 - time (sec): 30.56 - samples/sec: 1955.14 - lr: 0.000021 - momentum: 0.000000 2023-10-17 16:02:18,196 epoch 4 - iter 352/447 - loss 0.05167051 - time (sec): 34.88 - samples/sec: 1968.66 - lr: 0.000021 - momentum: 0.000000 2023-10-17 16:02:22,250 epoch 4 - iter 396/447 - loss 0.05200802 - time (sec): 38.93 - samples/sec: 1971.49 - lr: 0.000020 - momentum: 0.000000 2023-10-17 16:02:26,437 epoch 4 - iter 440/447 - loss 0.05402605 - time (sec): 43.12 - samples/sec: 1980.43 - lr: 0.000020 - momentum: 0.000000 2023-10-17 16:02:27,080 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:02:27,080 EPOCH 4 done: loss 0.0547 - lr: 0.000020 2023-10-17 16:02:38,092 DEV : loss 0.15765978395938873 - f1-score (micro avg) 0.7564 2023-10-17 16:02:38,149 saving best model 2023-10-17 16:02:39,959 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:02:44,066 epoch 5 - iter 44/447 - loss 0.02163230 - time (sec): 4.10 - samples/sec: 2014.16 - lr: 0.000020 - momentum: 0.000000 2023-10-17 16:02:48,624 epoch 5 - iter 88/447 - loss 0.02328428 - time (sec): 8.66 - samples/sec: 2063.37 - lr: 0.000019 - momentum: 0.000000 2023-10-17 16:02:52,824 epoch 5 - iter 132/447 - loss 0.02815843 - time (sec): 12.86 - samples/sec: 2055.05 - lr: 0.000019 - momentum: 0.000000 2023-10-17 
16:02:57,091 epoch 5 - iter 176/447 - loss 0.03162369 - time (sec): 17.13 - samples/sec: 2030.52 - lr: 0.000019 - momentum: 0.000000 2023-10-17 16:03:01,212 epoch 5 - iter 220/447 - loss 0.03390401 - time (sec): 21.25 - samples/sec: 2016.40 - lr: 0.000018 - momentum: 0.000000 2023-10-17 16:03:05,489 epoch 5 - iter 264/447 - loss 0.03510618 - time (sec): 25.53 - samples/sec: 1996.63 - lr: 0.000018 - momentum: 0.000000 2023-10-17 16:03:09,836 epoch 5 - iter 308/447 - loss 0.03570858 - time (sec): 29.87 - samples/sec: 2011.06 - lr: 0.000018 - momentum: 0.000000 2023-10-17 16:03:14,158 epoch 5 - iter 352/447 - loss 0.03496530 - time (sec): 34.19 - samples/sec: 2017.95 - lr: 0.000017 - momentum: 0.000000 2023-10-17 16:03:18,233 epoch 5 - iter 396/447 - loss 0.03442000 - time (sec): 38.27 - samples/sec: 2012.70 - lr: 0.000017 - momentum: 0.000000 2023-10-17 16:03:22,296 epoch 5 - iter 440/447 - loss 0.03343545 - time (sec): 42.33 - samples/sec: 2012.39 - lr: 0.000017 - momentum: 0.000000 2023-10-17 16:03:22,912 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:03:22,912 EPOCH 5 done: loss 0.0336 - lr: 0.000017 2023-10-17 16:03:34,146 DEV : loss 0.18739798665046692 - f1-score (micro avg) 0.7602 2023-10-17 16:03:34,211 saving best model 2023-10-17 16:03:35,654 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:03:40,239 epoch 6 - iter 44/447 - loss 0.02567182 - time (sec): 4.58 - samples/sec: 1915.30 - lr: 0.000016 - momentum: 0.000000 2023-10-17 16:03:44,559 epoch 6 - iter 88/447 - loss 0.02386413 - time (sec): 8.90 - samples/sec: 1963.68 - lr: 0.000016 - momentum: 0.000000 2023-10-17 16:03:48,830 epoch 6 - iter 132/447 - loss 0.02665298 - time (sec): 13.17 - samples/sec: 1938.11 - lr: 0.000016 - momentum: 0.000000 2023-10-17 16:03:52,872 epoch 6 - iter 176/447 - loss 0.02638426 - time (sec): 17.21 - samples/sec: 1966.56 - lr: 0.000015 - 
momentum: 0.000000 2023-10-17 16:03:56,972 epoch 6 - iter 220/447 - loss 0.02455531 - time (sec): 21.31 - samples/sec: 2002.06 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:04:01,089 epoch 6 - iter 264/447 - loss 0.02439035 - time (sec): 25.43 - samples/sec: 1989.32 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:04:05,048 epoch 6 - iter 308/447 - loss 0.02384438 - time (sec): 29.39 - samples/sec: 1990.74 - lr: 0.000014 - momentum: 0.000000 2023-10-17 16:04:09,461 epoch 6 - iter 352/447 - loss 0.02340501 - time (sec): 33.80 - samples/sec: 1998.99 - lr: 0.000014 - momentum: 0.000000 2023-10-17 16:04:13,567 epoch 6 - iter 396/447 - loss 0.02300984 - time (sec): 37.91 - samples/sec: 1999.25 - lr: 0.000014 - momentum: 0.000000 2023-10-17 16:04:18,493 epoch 6 - iter 440/447 - loss 0.02335350 - time (sec): 42.83 - samples/sec: 1992.49 - lr: 0.000013 - momentum: 0.000000 2023-10-17 16:04:19,112 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:04:19,113 EPOCH 6 done: loss 0.0233 - lr: 0.000013 2023-10-17 16:04:30,550 DEV : loss 0.1992470622062683 - f1-score (micro avg) 0.7886 2023-10-17 16:04:30,611 saving best model 2023-10-17 16:04:32,023 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:04:36,035 epoch 7 - iter 44/447 - loss 0.01165349 - time (sec): 4.01 - samples/sec: 2154.32 - lr: 0.000013 - momentum: 0.000000 2023-10-17 16:04:39,982 epoch 7 - iter 88/447 - loss 0.01076700 - time (sec): 7.95 - samples/sec: 2043.10 - lr: 0.000013 - momentum: 0.000000 2023-10-17 16:04:43,960 epoch 7 - iter 132/447 - loss 0.01479762 - time (sec): 11.93 - samples/sec: 2072.77 - lr: 0.000012 - momentum: 0.000000 2023-10-17 16:04:48,204 epoch 7 - iter 176/447 - loss 0.01368042 - time (sec): 16.18 - samples/sec: 2089.53 - lr: 0.000012 - momentum: 0.000000 2023-10-17 16:04:52,249 epoch 7 - iter 220/447 - loss 0.01535559 - time (sec): 20.22 - samples/sec: 
2089.04 - lr: 0.000012 - momentum: 0.000000 2023-10-17 16:04:56,520 epoch 7 - iter 264/447 - loss 0.01497302 - time (sec): 24.49 - samples/sec: 2078.72 - lr: 0.000011 - momentum: 0.000000 2023-10-17 16:05:00,828 epoch 7 - iter 308/447 - loss 0.01480877 - time (sec): 28.80 - samples/sec: 2071.01 - lr: 0.000011 - momentum: 0.000000 2023-10-17 16:05:05,246 epoch 7 - iter 352/447 - loss 0.01534251 - time (sec): 33.22 - samples/sec: 2063.40 - lr: 0.000011 - momentum: 0.000000 2023-10-17 16:05:09,457 epoch 7 - iter 396/447 - loss 0.01577918 - time (sec): 37.43 - samples/sec: 2062.23 - lr: 0.000010 - momentum: 0.000000 2023-10-17 16:05:13,493 epoch 7 - iter 440/447 - loss 0.01506408 - time (sec): 41.47 - samples/sec: 2055.56 - lr: 0.000010 - momentum: 0.000000 2023-10-17 16:05:14,130 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:05:14,130 EPOCH 7 done: loss 0.0151 - lr: 0.000010 2023-10-17 16:05:25,738 DEV : loss 0.19981977343559265 - f1-score (micro avg) 0.7962 2023-10-17 16:05:25,795 saving best model 2023-10-17 16:05:27,205 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:05:31,392 epoch 8 - iter 44/447 - loss 0.00747848 - time (sec): 4.18 - samples/sec: 2091.78 - lr: 0.000010 - momentum: 0.000000 2023-10-17 16:05:35,357 epoch 8 - iter 88/447 - loss 0.00942650 - time (sec): 8.15 - samples/sec: 2073.55 - lr: 0.000009 - momentum: 0.000000 2023-10-17 16:05:39,452 epoch 8 - iter 132/447 - loss 0.01019952 - time (sec): 12.24 - samples/sec: 2051.86 - lr: 0.000009 - momentum: 0.000000 2023-10-17 16:05:43,674 epoch 8 - iter 176/447 - loss 0.01051999 - time (sec): 16.46 - samples/sec: 2018.14 - lr: 0.000009 - momentum: 0.000000 2023-10-17 16:05:48,447 epoch 8 - iter 220/447 - loss 0.00988506 - time (sec): 21.24 - samples/sec: 2029.40 - lr: 0.000008 - momentum: 0.000000 2023-10-17 16:05:52,743 epoch 8 - iter 264/447 - loss 0.00923766 - time 
(sec): 25.53 - samples/sec: 2028.90 - lr: 0.000008 - momentum: 0.000000 2023-10-17 16:05:56,918 epoch 8 - iter 308/447 - loss 0.00896950 - time (sec): 29.71 - samples/sec: 2045.61 - lr: 0.000008 - momentum: 0.000000 2023-10-17 16:06:01,147 epoch 8 - iter 352/447 - loss 0.00877691 - time (sec): 33.94 - samples/sec: 2033.23 - lr: 0.000007 - momentum: 0.000000 2023-10-17 16:06:05,239 epoch 8 - iter 396/447 - loss 0.00839354 - time (sec): 38.03 - samples/sec: 2027.39 - lr: 0.000007 - momentum: 0.000000 2023-10-17 16:06:09,389 epoch 8 - iter 440/447 - loss 0.00837721 - time (sec): 42.18 - samples/sec: 2020.17 - lr: 0.000007 - momentum: 0.000000 2023-10-17 16:06:10,039 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:06:10,039 EPOCH 8 done: loss 0.0085 - lr: 0.000007 2023-10-17 16:06:21,780 DEV : loss 0.21822908520698547 - f1-score (micro avg) 0.7915 2023-10-17 16:06:21,841 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:06:25,974 epoch 9 - iter 44/447 - loss 0.00971530 - time (sec): 4.13 - samples/sec: 2046.00 - lr: 0.000006 - momentum: 0.000000 2023-10-17 16:06:30,128 epoch 9 - iter 88/447 - loss 0.00589797 - time (sec): 8.28 - samples/sec: 2051.38 - lr: 0.000006 - momentum: 0.000000 2023-10-17 16:06:34,084 epoch 9 - iter 132/447 - loss 0.00489136 - time (sec): 12.24 - samples/sec: 2019.89 - lr: 0.000006 - momentum: 0.000000 2023-10-17 16:06:38,278 epoch 9 - iter 176/447 - loss 0.00468170 - time (sec): 16.43 - samples/sec: 2041.42 - lr: 0.000005 - momentum: 0.000000 2023-10-17 16:06:42,419 epoch 9 - iter 220/447 - loss 0.00474446 - time (sec): 20.58 - samples/sec: 2047.85 - lr: 0.000005 - momentum: 0.000000 2023-10-17 16:06:46,728 epoch 9 - iter 264/447 - loss 0.00529921 - time (sec): 24.88 - samples/sec: 2054.82 - lr: 0.000005 - momentum: 0.000000 2023-10-17 16:06:50,744 epoch 9 - iter 308/447 - loss 0.00534902 - time (sec): 28.90 - 
samples/sec: 2056.82 - lr: 0.000004 - momentum: 0.000000 2023-10-17 16:06:54,817 epoch 9 - iter 352/447 - loss 0.00532626 - time (sec): 32.97 - samples/sec: 2060.60 - lr: 0.000004 - momentum: 0.000000 2023-10-17 16:06:59,232 epoch 9 - iter 396/447 - loss 0.00517003 - time (sec): 37.39 - samples/sec: 2059.50 - lr: 0.000004 - momentum: 0.000000 2023-10-17 16:07:03,570 epoch 9 - iter 440/447 - loss 0.00496894 - time (sec): 41.73 - samples/sec: 2052.54 - lr: 0.000003 - momentum: 0.000000 2023-10-17 16:07:04,188 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:07:04,188 EPOCH 9 done: loss 0.0049 - lr: 0.000003 2023-10-17 16:07:15,489 DEV : loss 0.2285258173942566 - f1-score (micro avg) 0.7966 2023-10-17 16:07:15,545 saving best model 2023-10-17 16:07:16,964 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:07:21,376 epoch 10 - iter 44/447 - loss 0.00131630 - time (sec): 4.41 - samples/sec: 2067.92 - lr: 0.000003 - momentum: 0.000000 2023-10-17 16:07:25,410 epoch 10 - iter 88/447 - loss 0.00319788 - time (sec): 8.44 - samples/sec: 2035.90 - lr: 0.000003 - momentum: 0.000000 2023-10-17 16:07:29,655 epoch 10 - iter 132/447 - loss 0.00353846 - time (sec): 12.69 - samples/sec: 1978.90 - lr: 0.000002 - momentum: 0.000000 2023-10-17 16:07:34,051 epoch 10 - iter 176/447 - loss 0.00330267 - time (sec): 17.08 - samples/sec: 1977.05 - lr: 0.000002 - momentum: 0.000000 2023-10-17 16:07:38,665 epoch 10 - iter 220/447 - loss 0.00339446 - time (sec): 21.70 - samples/sec: 1945.15 - lr: 0.000002 - momentum: 0.000000 2023-10-17 16:07:43,234 epoch 10 - iter 264/447 - loss 0.00404962 - time (sec): 26.27 - samples/sec: 1971.40 - lr: 0.000001 - momentum: 0.000000 2023-10-17 16:07:47,304 epoch 10 - iter 308/447 - loss 0.00401999 - time (sec): 30.34 - samples/sec: 1956.79 - lr: 0.000001 - momentum: 0.000000 2023-10-17 16:07:51,685 epoch 10 - iter 352/447 - loss 
0.00372843 - time (sec): 34.72 - samples/sec: 1959.83 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:07:55,746 epoch 10 - iter 396/447 - loss 0.00351848 - time (sec): 38.78 - samples/sec: 1963.55 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:07:59,987 epoch 10 - iter 440/447 - loss 0.00359443 - time (sec): 43.02 - samples/sec: 1975.68 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:08:00,668 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:00,669 EPOCH 10 done: loss 0.0035 - lr: 0.000000
2023-10-17 16:08:11,622 DEV : loss 0.22872760891914368 - f1-score (micro avg) 0.7952
2023-10-17 16:08:12,224 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:12,226 Loading model from best epoch ...
2023-10-17 16:08:14,951 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 16:08:21,279 
Results:
- F-score (micro) 0.764
- F-score (macro) 0.6761
- Accuracy 0.6388

By class:
              precision    recall  f1-score   support

         loc     0.8315    0.8775    0.8539       596
        pers     0.7314    0.7688    0.7496       333
         org     0.4892    0.5152    0.5018       132
        prod     0.6154    0.4848    0.5424        66
        time     0.7115    0.7551    0.7327        49

   micro avg     0.7496    0.7789    0.7640      1176
   macro avg     0.6758    0.6803    0.6761      1176
weighted avg     0.7476    0.7789    0.7623      1176

2023-10-17 16:08:21,280 ----------------------------------------------------------------------------------------------------