2023-10-17 22:29:49,087 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 22:29:49,089 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 22:29:49,089 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 Train: 20847 sentences
2023-10-17 22:29:49,089 (train_with_dev=False, train_with_test=False)
2023-10-17 22:29:49,089 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 Training Params:
2023-10-17 22:29:49,089 - learning_rate: "5e-05"
2023-10-17 22:29:49,089 - mini_batch_size: "8"
2023-10-17 22:29:49,090 - max_epochs: "10"
2023-10-17 22:29:49,090 - shuffle: "True"
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,090 Plugins:
2023-10-17 22:29:49,090 - TensorboardLogger
2023-10-17 22:29:49,090 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,090 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 22:29:49,090 - metric: "('micro avg', 'f1-score')"
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,090 Computation:
2023-10-17 22:29:49,090 - compute on device: cuda:0
2023-10-17 22:29:49,090 - embedding storage: none
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,090 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 22:29:49,091 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,091 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,091 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 22:30:17,028 epoch 1 - iter 260/2606 - loss 2.02848927 - time (sec): 27.94 - samples/sec: 1371.19 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:30:44,429 epoch 1 - iter 520/2606 - loss 1.21844947 - time (sec): 55.34 - samples/sec: 1346.96 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:31:12,228 epoch 1 - iter 780/2606 - loss 0.91990652 - time (sec): 83.14 - samples/sec: 1318.02 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:31:40,051 epoch 1 - iter 1040/2606 - loss 0.74718680 - time (sec): 110.96 - samples/sec: 1326.72 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:32:06,348 epoch 1 - iter 1300/2606 - loss 0.64893783 - time (sec): 137.26 - samples/sec: 1328.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:32:33,330 epoch 1 - iter 1560/2606 - loss 0.57578103 - time (sec): 164.24 - samples/sec: 1328.85 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:32:59,917 epoch 1 - iter 1820/2606 - loss 0.52157619 - time (sec): 190.82 - samples/sec: 1348.16 - lr: 0.000035 - momentum: 0.000000
2023-10-17 22:33:28,070 epoch 1 - iter 2080/2606 - loss 0.48435644 - time (sec): 218.98 - samples/sec: 1347.86 - lr: 0.000040 - momentum: 0.000000
2023-10-17 22:33:56,168 epoch 1 - iter 2340/2606 - loss 0.45266046 - time (sec): 247.07 - samples/sec: 1341.87 - lr: 0.000045 - momentum: 0.000000
2023-10-17 22:34:23,642 epoch 1 - iter 2600/2606 - loss 0.42860811 - time (sec): 274.55 - samples/sec: 1334.20 - lr: 0.000050 - momentum: 0.000000
2023-10-17 22:34:24,375 ----------------------------------------------------------------------------------------------------
2023-10-17 22:34:24,375 EPOCH 1 done: loss 0.4278 - lr: 0.000050
2023-10-17 22:34:31,962 DEV : loss 0.19458557665348053 - f1-score (micro avg) 0.3003
2023-10-17 22:34:32,021 saving best model
2023-10-17 22:34:32,611 ----------------------------------------------------------------------------------------------------
2023-10-17 22:35:00,780 epoch 2 - iter 260/2606 - loss 0.18573805 - time (sec): 28.17 - samples/sec: 1284.73 - lr: 0.000049 - momentum: 0.000000
2023-10-17 22:35:28,795 epoch 2 - iter 520/2606 - loss 0.17536530 - time (sec): 56.18 - samples/sec: 1294.10 - lr: 0.000049 - momentum: 0.000000
2023-10-17 22:35:57,211 epoch 2 - iter 780/2606 - loss 0.18464606 - time (sec): 84.60 - samples/sec: 1284.88 - lr: 0.000048 - momentum: 0.000000
2023-10-17 22:36:24,191 epoch 2 - iter 1040/2606 - loss 0.18572133 - time (sec): 111.58 - samples/sec: 1288.97 - lr: 0.000048 - momentum: 0.000000
2023-10-17 22:36:52,771 epoch 2 - iter 1300/2606 - loss 0.17573034 - time (sec): 140.16 - samples/sec: 1295.79 - lr: 0.000047 - momentum: 0.000000
2023-10-17 22:37:20,317 epoch 2 - iter 1560/2606 - loss 0.17431997 - time (sec): 167.70 - samples/sec: 1299.62 - lr: 0.000047 - momentum: 0.000000
2023-10-17 22:37:46,652 epoch 2 - iter 1820/2606 - loss 0.17276740 - time (sec): 194.04 - samples/sec: 1300.78 - lr: 0.000046 - momentum: 0.000000
2023-10-17 22:38:14,196 epoch 2 - iter 2080/2606 - loss 0.16921817 - time (sec): 221.58 - samples/sec: 1320.04 - lr: 0.000046 - momentum: 0.000000
2023-10-17 22:38:43,447 epoch 2 - iter 2340/2606 - loss 0.16736292 - time (sec): 250.83 - samples/sec: 1323.21 - lr: 0.000045 - momentum: 0.000000
2023-10-17 22:39:09,856 epoch 2 - iter 2600/2606 - loss 0.16436618 - time (sec): 277.24 - samples/sec: 1322.57 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:39:10,492 ----------------------------------------------------------------------------------------------------
2023-10-17 22:39:10,492 EPOCH 2 done: loss 0.1643 - lr: 0.000044
2023-10-17 22:39:22,270 DEV : loss 0.17822161316871643 - f1-score (micro avg) 0.3177
2023-10-17 22:39:22,322 saving best model
2023-10-17 22:39:23,752 ----------------------------------------------------------------------------------------------------
2023-10-17 22:39:51,896 epoch 3 - iter 260/2606 - loss 0.12920206 - time (sec): 28.14 - samples/sec: 1282.36 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:40:20,405 epoch 3 - iter 520/2606 - loss 0.11453609 - time (sec): 56.64 - samples/sec: 1296.29 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:40:48,913 epoch 3 - iter 780/2606 - loss 0.11650277 - time (sec): 85.15 - samples/sec: 1267.04 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:41:17,273 epoch 3 - iter 1040/2606 - loss 0.12299209 - time (sec): 113.51 - samples/sec: 1291.79 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:41:44,017 epoch 3 - iter 1300/2606 - loss 0.12055576 - time (sec): 140.26 - samples/sec: 1302.55 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:42:10,968 epoch 3 - iter 1560/2606 - loss 0.11843686 - time (sec): 167.21 - samples/sec: 1311.30 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:42:39,419 epoch 3 - iter 1820/2606 - loss 0.11721896 - time (sec): 195.66 - samples/sec: 1315.84 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:43:07,827 epoch 3 - iter 2080/2606 - loss 0.11733893 - time (sec): 224.07 - samples/sec: 1312.43 - lr: 0.000040 - momentum: 0.000000
2023-10-17 22:43:34,880 epoch 3 - iter 2340/2606 - loss 0.11675506 - time (sec): 251.12 - samples/sec: 1309.40 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:44:03,157 epoch 3 - iter 2600/2606 - loss 0.11643558 - time (sec): 279.40 - samples/sec: 1311.26 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:44:03,732 ----------------------------------------------------------------------------------------------------
2023-10-17 22:44:03,733 EPOCH 3 done: loss 0.1164 - lr: 0.000039
2023-10-17 22:44:16,079 DEV : loss 0.18669994175434113 - f1-score (micro avg) 0.3881
2023-10-17 22:44:16,134 saving best model
2023-10-17 22:44:17,540 ----------------------------------------------------------------------------------------------------
2023-10-17 22:44:44,788 epoch 4 - iter 260/2606 - loss 0.08494930 - time (sec): 27.24 - samples/sec: 1367.76 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:45:12,077 epoch 4 - iter 520/2606 - loss 0.07898452 - time (sec): 54.53 - samples/sec: 1348.42 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:45:39,790 epoch 4 - iter 780/2606 - loss 0.07898936 - time (sec): 82.25 - samples/sec: 1351.84 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:46:06,642 epoch 4 - iter 1040/2606 - loss 0.08216188 - time (sec): 109.10 - samples/sec: 1340.12 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:46:34,413 epoch 4 - iter 1300/2606 - loss 0.08359594 - time (sec): 136.87 - samples/sec: 1330.39 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:47:01,446 epoch 4 - iter 1560/2606 - loss 0.08486629 - time (sec): 163.90 - samples/sec: 1332.80 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:47:28,235 epoch 4 - iter 1820/2606 - loss 0.08429319 - time (sec): 190.69 - samples/sec: 1339.58 - lr: 0.000035 - momentum: 0.000000
2023-10-17 22:47:56,351 epoch 4 - iter 2080/2606 - loss 0.08455278 - time (sec): 218.81 - samples/sec: 1342.81 - lr: 0.000034 - momentum: 0.000000
2023-10-17 22:48:23,794 epoch 4 - iter 2340/2606 - loss 0.08431644 - time (sec): 246.25 - samples/sec: 1339.45 - lr: 0.000034 - momentum: 0.000000
2023-10-17 22:48:52,450 epoch 4 - iter 2600/2606 - loss 0.08411450 - time (sec): 274.91 - samples/sec: 1332.31 - lr: 0.000033 - momentum: 0.000000
2023-10-17 22:48:53,209 ----------------------------------------------------------------------------------------------------
2023-10-17 22:48:53,210 EPOCH 4 done: loss 0.0840 - lr: 0.000033
2023-10-17 22:49:04,552 DEV : loss 0.2566596269607544 - f1-score (micro avg) 0.3956
2023-10-17 22:49:04,607 saving best model
2023-10-17 22:49:06,082 ----------------------------------------------------------------------------------------------------
2023-10-17 22:49:35,940 epoch 5 - iter 260/2606 - loss 0.04669936 - time (sec): 29.85 - samples/sec: 1205.57 - lr: 0.000033 - momentum: 0.000000
2023-10-17 22:50:03,138 epoch 5 - iter 520/2606 - loss 0.05345590 - time (sec): 57.05 - samples/sec: 1309.26 - lr: 0.000032 - momentum: 0.000000
2023-10-17 22:50:32,012 epoch 5 - iter 780/2606 - loss 0.05303634 - time (sec): 85.93 - samples/sec: 1325.22 - lr: 0.000032 - momentum: 0.000000
2023-10-17 22:50:59,501 epoch 5 - iter 1040/2606 - loss 0.05440153 - time (sec): 113.41 - samples/sec: 1314.75 - lr: 0.000031 - momentum: 0.000000
2023-10-17 22:51:27,926 epoch 5 - iter 1300/2606 - loss 0.05722889 - time (sec): 141.84 - samples/sec: 1310.04 - lr: 0.000031 - momentum: 0.000000
2023-10-17 22:51:55,765 epoch 5 - iter 1560/2606 - loss 0.05872981 - time (sec): 169.68 - samples/sec: 1319.28 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:52:23,788 epoch 5 - iter 1820/2606 - loss 0.05737820 - time (sec): 197.70 - samples/sec: 1316.82 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:52:51,093 epoch 5 - iter 2080/2606 - loss 0.05740010 - time (sec): 225.01 - samples/sec: 1321.20 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:53:17,507 epoch 5 - iter 2340/2606 - loss 0.05836892 - time (sec): 251.42 - samples/sec: 1323.52 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:53:43,210 epoch 5 - iter 2600/2606 - loss 0.05838649 - time (sec): 277.12 - samples/sec: 1323.20 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:53:43,761 ----------------------------------------------------------------------------------------------------
2023-10-17 22:53:43,761 EPOCH 5 done: loss 0.0585 - lr: 0.000028
2023-10-17 22:53:55,674 DEV : loss 0.27858346700668335 - f1-score (micro avg) 0.4073
2023-10-17 22:53:55,737 saving best model
2023-10-17 22:53:57,184 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:24,007 epoch 6 - iter 260/2606 - loss 0.05343449 - time (sec): 26.82 - samples/sec: 1428.06 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:54:50,782 epoch 6 - iter 520/2606 - loss 0.05204942 - time (sec): 53.59 - samples/sec: 1383.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:55:17,857 epoch 6 - iter 780/2606 - loss 0.05156502 - time (sec): 80.67 - samples/sec: 1394.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:55:45,833 epoch 6 - iter 1040/2606 - loss 0.04980644 - time (sec): 108.65 - samples/sec: 1385.34 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:56:14,864 epoch 6 - iter 1300/2606 - loss 0.04934279 - time (sec): 137.68 - samples/sec: 1352.64 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:56:42,014 epoch 6 - iter 1560/2606 - loss 0.04831799 - time (sec): 164.83 - samples/sec: 1330.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:57:08,026 epoch 6 - iter 1820/2606 - loss 0.04711258 - time (sec): 190.84 - samples/sec: 1333.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:57:36,502 epoch 6 - iter 2080/2606 - loss 0.04700619 - time (sec): 219.31 - samples/sec: 1325.55 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:58:04,030 epoch 6 - iter 2340/2606 - loss 0.04631173 - time (sec): 246.84 - samples/sec: 1329.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:58:32,499 epoch 6 - iter 2600/2606 - loss 0.04553521 - time (sec): 275.31 - samples/sec: 1331.16 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:58:33,201 ----------------------------------------------------------------------------------------------------
2023-10-17 22:58:33,202 EPOCH 6 done: loss 0.0455 - lr: 0.000022
2023-10-17 22:58:44,947 DEV : loss 0.31613317131996155 - f1-score (micro avg) 0.3619
2023-10-17 22:58:45,011 ----------------------------------------------------------------------------------------------------
2023-10-17 22:59:13,685 epoch 7 - iter 260/2606 - loss 0.02491344 - time (sec): 28.67 - samples/sec: 1318.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:59:40,721 epoch 7 - iter 520/2606 - loss 0.02684154 - time (sec): 55.71 - samples/sec: 1329.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:00:09,079 epoch 7 - iter 780/2606 - loss 0.02766037 - time (sec): 84.07 - samples/sec: 1290.18 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:00:37,772 epoch 7 - iter 1040/2606 - loss 0.02974319 - time (sec): 112.76 - samples/sec: 1279.62 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:01:06,063 epoch 7 - iter 1300/2606 - loss 0.03074650 - time (sec): 141.05 - samples/sec: 1284.12 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:01:32,926 epoch 7 - iter 1560/2606 - loss 0.03182663 - time (sec): 167.91 - samples/sec: 1287.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:02:00,232 epoch 7 - iter 1820/2606 - loss 0.03187523 - time (sec): 195.22 - samples/sec: 1297.48 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:02:29,230 epoch 7 - iter 2080/2606 - loss 0.03088266 - time (sec): 224.22 - samples/sec: 1318.64 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:02:55,694 epoch 7 - iter 2340/2606 - loss 0.03076656 - time (sec): 250.68 - samples/sec: 1316.90 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:03:23,449 epoch 7 - iter 2600/2606 - loss 0.02992379 - time (sec): 278.44 - samples/sec: 1317.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:03:23,970 ----------------------------------------------------------------------------------------------------
2023-10-17 23:03:23,970 EPOCH 7 done: loss 0.0299 - lr: 0.000017
2023-10-17 23:03:35,173 DEV : loss 0.4403761923313141 - f1-score (micro avg) 0.3735
2023-10-17 23:03:35,238 ----------------------------------------------------------------------------------------------------
2023-10-17 23:04:03,459 epoch 8 - iter 260/2606 - loss 0.02142000 - time (sec): 28.22 - samples/sec: 1283.04 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:04:31,714 epoch 8 - iter 520/2606 - loss 0.01991354 - time (sec): 56.47 - samples/sec: 1294.22 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:04:58,892 epoch 8 - iter 780/2606 - loss 0.01931704 - time (sec): 83.65 - samples/sec: 1279.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:05:27,331 epoch 8 - iter 1040/2606 - loss 0.02006574 - time (sec): 112.09 - samples/sec: 1275.23 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:05:55,670 epoch 8 - iter 1300/2606 - loss 0.02078348 - time (sec): 140.43 - samples/sec: 1274.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:06:24,906 epoch 8 - iter 1560/2606 - loss 0.02167079 - time (sec): 169.67 - samples/sec: 1277.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:06:53,016 epoch 8 - iter 1820/2606 - loss 0.02190993 - time (sec): 197.78 - samples/sec: 1301.34 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:07:20,238 epoch 8 - iter 2080/2606 - loss 0.02313931 - time (sec): 225.00 - samples/sec: 1310.25 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:07:45,885 epoch 8 - iter 2340/2606 - loss 0.02250970 - time (sec): 250.64 - samples/sec: 1315.03 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:08:14,890 epoch 8 - iter 2600/2606 - loss 0.02273813 - time (sec): 279.65 - samples/sec: 1311.02 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:08:15,460 ----------------------------------------------------------------------------------------------------
2023-10-17 23:08:15,460 EPOCH 8 done: loss 0.0227 - lr: 0.000011
2023-10-17 23:08:26,347 DEV : loss 0.44188764691352844 - f1-score (micro avg) 0.3802
2023-10-17 23:08:26,402 ----------------------------------------------------------------------------------------------------
2023-10-17 23:08:55,086 epoch 9 - iter 260/2606 - loss 0.01290092 - time (sec): 28.68 - samples/sec: 1357.88 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:09:22,049 epoch 9 - iter 520/2606 - loss 0.01379589 - time (sec): 55.64 - samples/sec: 1348.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:09:50,080 epoch 9 - iter 780/2606 - loss 0.01434365 - time (sec): 83.68 - samples/sec: 1312.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:10:18,063 epoch 9 - iter 1040/2606 - loss 0.01448784 - time (sec): 111.66 - samples/sec: 1292.94 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:10:45,062 epoch 9 - iter 1300/2606 - loss 0.01451714 - time (sec): 138.66 - samples/sec: 1293.96 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:11:13,160 epoch 9 - iter 1560/2606 - loss 0.01441414 - time (sec): 166.75 - samples/sec: 1300.35 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:11:40,264 epoch 9 - iter 1820/2606 - loss 0.01475688 - time (sec): 193.86 - samples/sec: 1310.23 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:12:08,004 epoch 9 - iter 2080/2606 - loss 0.01490181 - time (sec): 221.60 - samples/sec: 1320.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:12:36,041 epoch 9 - iter 2340/2606 - loss 0.01479776 - time (sec): 249.64 - samples/sec: 1321.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:13:05,286 epoch 9 - iter 2600/2606 - loss 0.01511359 - time (sec): 278.88 - samples/sec: 1314.75 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:13:05,943 ----------------------------------------------------------------------------------------------------
2023-10-17 23:13:05,944 EPOCH 9 done: loss 0.0151 - lr: 0.000006
2023-10-17 23:13:18,078 DEV : loss 0.4638948440551758 - f1-score (micro avg) 0.4106
2023-10-17 23:13:18,144 saving best model
2023-10-17 23:13:19,694 ----------------------------------------------------------------------------------------------------
2023-10-17 23:13:50,439 epoch 10 - iter 260/2606 - loss 0.00890757 - time (sec): 30.74 - samples/sec: 1219.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:14:18,832 epoch 10 - iter 520/2606 - loss 0.00864088 - time (sec): 59.14 - samples/sec: 1260.06 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:14:47,585 epoch 10 - iter 780/2606 - loss 0.01049821 - time (sec): 87.89 - samples/sec: 1243.94 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:15:14,969 epoch 10 - iter 1040/2606 - loss 0.01013399 - time (sec): 115.27 - samples/sec: 1247.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:15:42,175 epoch 10 - iter 1300/2606 - loss 0.00906366 - time (sec): 142.48 - samples/sec: 1286.90 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:16:09,614 epoch 10 - iter 1560/2606 - loss 0.00909100 - time (sec): 169.92 - samples/sec: 1297.96 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:16:39,396 epoch 10 - iter 1820/2606 - loss 0.00925267 - time (sec): 199.70 - samples/sec: 1299.91 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:17:07,503 epoch 10 - iter 2080/2606 - loss 0.00919351 - time (sec): 227.81 - samples/sec: 1297.14 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:17:33,875 epoch 10 - iter 2340/2606 - loss 0.00928314 - time (sec): 254.18 - samples/sec: 1298.02 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:17:59,956 epoch 10 - iter 2600/2606 - loss 0.00932223 - time (sec): 280.26 - samples/sec: 1308.39 - lr: 0.000000 - momentum: 0.000000
2023-10-17 23:18:00,484 ----------------------------------------------------------------------------------------------------
2023-10-17 23:18:00,484 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 23:18:12,656 DEV : loss 0.48719704151153564 - f1-score (micro avg) 0.3936
2023-10-17 23:18:13,293 ----------------------------------------------------------------------------------------------------
2023-10-17 23:18:13,295 Loading model from best epoch ...
2023-10-17 23:18:15,665 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 23:18:34,724 Results:
- F-score (micro) 0.4853
- F-score (macro) 0.3233
- Accuracy 0.3255

By class:
              precision    recall  f1-score   support

         LOC     0.5263    0.6343    0.5753      1214
         PER     0.3903    0.5087    0.4417       808
         ORG     0.2745    0.2776    0.2761       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4439    0.5351    0.4853      2390
   macro avg     0.2978    0.3551    0.3233      2390
weighted avg     0.4398    0.5351    0.4823      2390

2023-10-17 23:18:34,724 ----------------------------------------------------------------------------------------------------