2023-10-17 22:29:49,087 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 22:29:49,089 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 22:29:49,089 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 Train: 20847 sentences
2023-10-17 22:29:49,089 (train_with_dev=False, train_with_test=False)
2023-10-17 22:29:49,089 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,089 Training Params:
2023-10-17 22:29:49,089 - learning_rate: "5e-05"
2023-10-17 22:29:49,089 - mini_batch_size: "8"
2023-10-17 22:29:49,090 - max_epochs: "10"
2023-10-17 22:29:49,090 - shuffle: "True"
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,090 Plugins:
2023-10-17 22:29:49,090 - TensorboardLogger
2023-10-17 22:29:49,090 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
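The `lr` column in the iteration lines of this log is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch under that assumption, using the logged values (2606 iterations per epoch, 10 epochs, peak learning rate 5e-05). The function `linear_schedule_lr` is a hypothetical helper written for illustration; it is not Flair's implementation.

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    """Illustrative linear warmup then linear decay (assumed shape, not Flair's code)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: learning rate grows linearly from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: learning rate falls linearly from peak_lr to 0.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


iters_per_epoch = 2606            # from the "iter 260/2606" lines
total_steps = iters_per_epoch * 10  # max_epochs is 10

# Compare a few (epoch, iteration) pairs against the lr column in the log.
for epoch, iteration in [(1, 260), (1, 2600), (2, 260), (5, 1300), (10, 2600)]:
    step = (epoch - 1) * iters_per_epoch + iteration
    lr = linear_schedule_lr(step, total_steps)
    print(f"epoch {epoch} - iter {iteration}/{iters_per_epoch} - lr: {lr:.6f}")
```

Under this assumption the warmup ends exactly at the end of epoch 1, which matches the logged `lr` reaching 0.000050 at iteration 2600 of epoch 1 and decreasing from epoch 2 onward.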
2023-10-17 22:29:49,090 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 22:29:49,090 - metric: "('micro avg', 'f1-score')"
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,090 Computation:
2023-10-17 22:29:49,090 - compute on device: cuda:0
2023-10-17 22:29:49,090 - embedding storage: none
2023-10-17 22:29:49,090 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,090 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 22:29:49,091 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,091 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:49,091 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 22:30:17,028 epoch 1 - iter 260/2606 - loss 2.02848927 - time (sec): 27.94 - samples/sec: 1371.19 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:30:44,429 epoch 1 - iter 520/2606 - loss 1.21844947 - time (sec): 55.34 - samples/sec: 1346.96 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:31:12,228 epoch 1 - iter 780/2606 - loss 0.91990652 - time (sec): 83.14 - samples/sec: 1318.02 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:31:40,051 epoch 1 - iter 1040/2606 - loss 0.74718680 - time (sec): 110.96 - samples/sec: 1326.72 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:32:06,348 epoch 1 - iter 1300/2606 - loss 0.64893783 - time (sec): 137.26 - samples/sec: 1328.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:32:33,330 epoch 1 - iter 1560/2606 - loss 0.57578103 - time (sec): 164.24 - samples/sec: 1328.85 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:32:59,917 epoch 1 - iter 1820/2606 - loss 0.52157619 - time (sec): 190.82 - samples/sec: 1348.16 - lr: 0.000035 - momentum: 0.000000
2023-10-17 22:33:28,070 epoch 1 - iter 2080/2606 - loss 0.48435644 - time (sec): 218.98 - samples/sec: 1347.86 - lr: 0.000040 - momentum: 0.000000
2023-10-17 22:33:56,168 epoch 1 - iter 2340/2606 - loss 0.45266046 - time (sec): 247.07 - samples/sec: 1341.87 - lr: 0.000045 - momentum: 0.000000
2023-10-17 22:34:23,642 epoch 1 - iter 2600/2606 - loss 0.42860811 - time (sec): 274.55 - samples/sec: 1334.20 - lr: 0.000050 - momentum: 0.000000
2023-10-17 22:34:24,375 ----------------------------------------------------------------------------------------------------
2023-10-17 22:34:24,375 EPOCH 1 done: loss 0.4278 - lr: 0.000050
2023-10-17 22:34:31,962 DEV : loss 0.19458557665348053 - f1-score (micro avg) 0.3003
2023-10-17 22:34:32,021 saving best model
2023-10-17 22:34:32,611 ----------------------------------------------------------------------------------------------------
2023-10-17 22:35:00,780 epoch 2 - iter 260/2606 - loss 0.18573805 - time (sec): 28.17 - samples/sec: 1284.73 - lr: 0.000049 - momentum: 0.000000
2023-10-17 22:35:28,795 epoch 2 - iter 520/2606 - loss 0.17536530 - time (sec): 56.18 - samples/sec: 1294.10 - lr: 0.000049 - momentum: 0.000000
2023-10-17 22:35:57,211 epoch 2 - iter 780/2606 - loss 0.18464606 - time (sec): 84.60 - samples/sec: 1284.88 - lr: 0.000048 - momentum: 0.000000
2023-10-17 22:36:24,191 epoch 2 - iter 1040/2606 - loss 0.18572133 - time (sec): 111.58 - samples/sec: 1288.97 - lr: 0.000048 - momentum: 0.000000
2023-10-17 22:36:52,771 epoch 2 - iter 1300/2606 - loss 0.17573034 - time (sec): 140.16 - samples/sec: 1295.79 - lr: 0.000047 - momentum: 0.000000
2023-10-17 22:37:20,317 epoch 2 - iter 1560/2606 - loss 0.17431997 - time (sec): 167.70 - samples/sec: 1299.62 - lr: 0.000047 - momentum: 0.000000
2023-10-17 22:37:46,652 epoch 2 - iter 1820/2606 - loss 0.17276740 - time (sec): 194.04 - samples/sec: 1300.78 - lr: 0.000046 - momentum: 0.000000
2023-10-17 22:38:14,196 epoch 2 - iter 2080/2606 - loss 0.16921817 - time (sec): 221.58 - samples/sec: 1320.04 - lr: 0.000046 - momentum: 0.000000
2023-10-17 22:38:43,447 epoch 2 - iter 2340/2606 - loss 0.16736292 - time (sec): 250.83 - samples/sec: 1323.21 - lr: 0.000045 - momentum: 0.000000
2023-10-17 22:39:09,856 epoch 2 - iter 2600/2606 - loss 0.16436618 - time (sec): 277.24 - samples/sec: 1322.57 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:39:10,492 ----------------------------------------------------------------------------------------------------
2023-10-17 22:39:10,492 EPOCH 2 done: loss 0.1643 - lr: 0.000044
2023-10-17 22:39:22,270 DEV : loss 0.17822161316871643 - f1-score (micro avg) 0.3177
2023-10-17 22:39:22,322 saving best model
2023-10-17 22:39:23,752 ----------------------------------------------------------------------------------------------------
2023-10-17 22:39:51,896 epoch 3 - iter 260/2606 - loss 0.12920206 - time (sec): 28.14 - samples/sec: 1282.36 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:40:20,405 epoch 3 - iter 520/2606 - loss 0.11453609 - time (sec): 56.64 - samples/sec: 1296.29 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:40:48,913 epoch 3 - iter 780/2606 - loss 0.11650277 - time (sec): 85.15 - samples/sec: 1267.04 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:41:17,273 epoch 3 - iter 1040/2606 - loss 0.12299209 - time (sec): 113.51 - samples/sec: 1291.79 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:41:44,017 epoch 3 - iter 1300/2606 - loss 0.12055576 - time (sec): 140.26 - samples/sec: 1302.55 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:42:10,968 epoch 3 - iter 1560/2606 - loss 0.11843686 - time (sec): 167.21 - samples/sec: 1311.30 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:42:39,419 epoch 3 - iter 1820/2606 - loss 0.11721896 - time (sec): 195.66 - samples/sec: 1315.84 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:43:07,827 epoch 3 - iter 2080/2606 - loss 0.11733893 - time (sec): 224.07 - samples/sec: 1312.43 - lr: 0.000040 - momentum: 0.000000
2023-10-17 22:43:34,880 epoch 3 - iter 2340/2606 - loss 0.11675506 - time (sec): 251.12 - samples/sec: 1309.40 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:44:03,157 epoch 3 - iter 2600/2606 - loss 0.11643558 - time (sec): 279.40 - samples/sec: 1311.26 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:44:03,732 ----------------------------------------------------------------------------------------------------
2023-10-17 22:44:03,733 EPOCH 3 done: loss 0.1164 - lr: 0.000039
2023-10-17 22:44:16,079 DEV : loss 0.18669994175434113 - f1-score (micro avg) 0.3881
2023-10-17 22:44:16,134 saving best model
2023-10-17 22:44:17,540 ----------------------------------------------------------------------------------------------------
2023-10-17 22:44:44,788 epoch 4 - iter 260/2606 - loss 0.08494930 - time (sec): 27.24 - samples/sec: 1367.76 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:45:12,077 epoch 4 - iter 520/2606 - loss 0.07898452 - time (sec): 54.53 - samples/sec: 1348.42 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:45:39,790 epoch 4 - iter 780/2606 - loss 0.07898936 - time (sec): 82.25 - samples/sec: 1351.84 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:46:06,642 epoch 4 - iter 1040/2606 - loss 0.08216188 - time (sec): 109.10 - samples/sec: 1340.12 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:46:34,413 epoch 4 - iter 1300/2606 - loss 0.08359594 - time (sec): 136.87 - samples/sec: 1330.39 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:47:01,446 epoch 4 - iter 1560/2606 - loss 0.08486629 - time (sec): 163.90 - samples/sec: 1332.80 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:47:28,235 epoch 4 - iter 1820/2606 - loss 0.08429319 - time (sec): 190.69 - samples/sec: 1339.58 - lr: 0.000035 - momentum: 0.000000
2023-10-17 22:47:56,351 epoch 4 - iter 2080/2606 - loss 0.08455278 - time (sec): 218.81 - samples/sec: 1342.81 - lr: 0.000034 - momentum: 0.000000
2023-10-17 22:48:23,794 epoch 4 - iter 2340/2606 - loss 0.08431644 - time (sec): 246.25 - samples/sec: 1339.45 - lr: 0.000034 - momentum: 0.000000
2023-10-17 22:48:52,450 epoch 4 - iter 2600/2606 - loss 0.08411450 - time (sec): 274.91 - samples/sec: 1332.31 - lr: 0.000033 - momentum: 0.000000
2023-10-17 22:48:53,209 ----------------------------------------------------------------------------------------------------
2023-10-17 22:48:53,210 EPOCH 4 done: loss 0.0840 - lr: 0.000033
2023-10-17 22:49:04,552 DEV : loss 0.2566596269607544 - f1-score (micro avg) 0.3956
2023-10-17 22:49:04,607 saving best model
2023-10-17 22:49:06,082 ----------------------------------------------------------------------------------------------------
2023-10-17 22:49:35,940 epoch 5 - iter 260/2606 - loss 0.04669936 - time (sec): 29.85 - samples/sec: 1205.57 - lr: 0.000033 - momentum: 0.000000
2023-10-17 22:50:03,138 epoch 5 - iter 520/2606 - loss 0.05345590 - time (sec): 57.05 - samples/sec: 1309.26 - lr: 0.000032 - momentum: 0.000000
2023-10-17 22:50:32,012 epoch 5 - iter 780/2606 - loss 0.05303634 - time (sec): 85.93 - samples/sec: 1325.22 - lr: 0.000032 - momentum: 0.000000
2023-10-17 22:50:59,501 epoch 5 - iter 1040/2606 - loss 0.05440153 - time (sec): 113.41 - samples/sec: 1314.75 - lr: 0.000031 - momentum: 0.000000
2023-10-17 22:51:27,926 epoch 5 - iter 1300/2606 - loss 0.05722889 - time (sec): 141.84 - samples/sec: 1310.04 - lr: 0.000031 - momentum: 0.000000
2023-10-17 22:51:55,765 epoch 5 - iter 1560/2606 - loss 0.05872981 - time (sec): 169.68 - samples/sec: 1319.28 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:52:23,788 epoch 5 - iter 1820/2606 - loss 0.05737820 - time (sec): 197.70 - samples/sec: 1316.82 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:52:51,093 epoch 5 - iter 2080/2606 - loss 0.05740010 - time (sec): 225.01 - samples/sec: 1321.20 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:53:17,507 epoch 5 - iter 2340/2606 - loss 0.05836892 - time (sec): 251.42 - samples/sec: 1323.52 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:53:43,210 epoch 5 - iter 2600/2606 - loss 0.05838649 - time (sec): 277.12 - samples/sec: 1323.20 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:53:43,761 ----------------------------------------------------------------------------------------------------
2023-10-17 22:53:43,761 EPOCH 5 done: loss 0.0585 - lr: 0.000028
2023-10-17 22:53:55,674 DEV : loss 0.27858346700668335 - f1-score (micro avg) 0.4073
2023-10-17 22:53:55,737 saving best model
2023-10-17 22:53:57,184 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:24,007 epoch 6 - iter 260/2606 - loss 0.05343449 - time (sec): 26.82 - samples/sec: 1428.06 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:54:50,782 epoch 6 - iter 520/2606 - loss 0.05204942 - time (sec): 53.59 - samples/sec: 1383.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:55:17,857 epoch 6 - iter 780/2606 - loss 0.05156502 - time (sec): 80.67 - samples/sec: 1394.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:55:45,833 epoch 6 - iter 1040/2606 - loss 0.04980644 - time (sec): 108.65 - samples/sec: 1385.34 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:56:14,864 epoch 6 - iter 1300/2606 - loss 0.04934279 - time (sec): 137.68 - samples/sec: 1352.64 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:56:42,014 epoch 6 - iter 1560/2606 - loss 0.04831799 - time (sec): 164.83 - samples/sec: 1330.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:57:08,026 epoch 6 - iter 1820/2606 - loss 0.04711258 - time (sec): 190.84 - samples/sec: 1333.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:57:36,502 epoch 6 - iter 2080/2606 - loss 0.04700619 - time (sec): 219.31 - samples/sec: 1325.55 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:58:04,030 epoch 6 - iter 2340/2606 - loss 0.04631173 - time (sec): 246.84 - samples/sec: 1329.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:58:32,499 epoch 6 - iter 2600/2606 - loss 0.04553521 - time (sec): 275.31 - samples/sec: 1331.16 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:58:33,201 ----------------------------------------------------------------------------------------------------
2023-10-17 22:58:33,202 EPOCH 6 done: loss 0.0455 - lr: 0.000022
2023-10-17 22:58:44,947 DEV : loss 0.31613317131996155 - f1-score (micro avg) 0.3619
2023-10-17 22:58:45,011 ----------------------------------------------------------------------------------------------------
2023-10-17 22:59:13,685 epoch 7 - iter 260/2606 - loss 0.02491344 - time (sec): 28.67 - samples/sec: 1318.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:59:40,721 epoch 7 - iter 520/2606 - loss 0.02684154 - time (sec): 55.71 - samples/sec: 1329.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:00:09,079 epoch 7 - iter 780/2606 - loss 0.02766037 - time (sec): 84.07 - samples/sec: 1290.18 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:00:37,772 epoch 7 - iter 1040/2606 - loss 0.02974319 - time (sec): 112.76 - samples/sec: 1279.62 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:01:06,063 epoch 7 - iter 1300/2606 - loss 0.03074650 - time (sec): 141.05 - samples/sec: 1284.12 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:01:32,926 epoch 7 - iter 1560/2606 - loss 0.03182663 - time (sec): 167.91 - samples/sec: 1287.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:02:00,232 epoch 7 - iter 1820/2606 - loss 0.03187523 - time (sec): 195.22 - samples/sec: 1297.48 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:02:29,230 epoch 7 - iter 2080/2606 - loss 0.03088266 - time (sec): 224.22 - samples/sec: 1318.64 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:02:55,694 epoch 7 - iter 2340/2606 - loss 0.03076656 - time (sec): 250.68 - samples/sec: 1316.90 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:03:23,449 epoch 7 - iter 2600/2606 - loss 0.02992379 - time (sec): 278.44 - samples/sec: 1317.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:03:23,970 ----------------------------------------------------------------------------------------------------
2023-10-17 23:03:23,970 EPOCH 7 done: loss 0.0299 - lr: 0.000017
2023-10-17 23:03:35,173 DEV : loss 0.4403761923313141 - f1-score (micro avg) 0.3735
2023-10-17 23:03:35,238 ----------------------------------------------------------------------------------------------------
2023-10-17 23:04:03,459 epoch 8 - iter 260/2606 - loss 0.02142000 - time (sec): 28.22 - samples/sec: 1283.04 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:04:31,714 epoch 8 - iter 520/2606 - loss 0.01991354 - time (sec): 56.47 - samples/sec: 1294.22 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:04:58,892 epoch 8 - iter 780/2606 - loss 0.01931704 - time (sec): 83.65 - samples/sec: 1279.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:05:27,331 epoch 8 - iter 1040/2606 - loss 0.02006574 - time (sec): 112.09 - samples/sec: 1275.23 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:05:55,670 epoch 8 - iter 1300/2606 - loss 0.02078348 - time (sec): 140.43 - samples/sec: 1274.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:06:24,906 epoch 8 - iter 1560/2606 - loss 0.02167079 - time (sec): 169.67 - samples/sec: 1277.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:06:53,016 epoch 8 - iter 1820/2606 - loss 0.02190993 - time (sec): 197.78 - samples/sec: 1301.34 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:07:20,238 epoch 8 - iter 2080/2606 - loss 0.02313931 - time (sec): 225.00 - samples/sec: 1310.25 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:07:45,885 epoch 8 - iter 2340/2606 - loss 0.02250970 - time (sec): 250.64 - samples/sec: 1315.03 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:08:14,890 epoch 8 - iter 2600/2606 - loss 0.02273813 - time (sec): 279.65 - samples/sec: 1311.02 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:08:15,460 ----------------------------------------------------------------------------------------------------
2023-10-17 23:08:15,460 EPOCH 8 done: loss 0.0227 - lr: 0.000011
2023-10-17 23:08:26,347 DEV : loss 0.44188764691352844 - f1-score (micro avg) 0.3802
2023-10-17 23:08:26,402 ----------------------------------------------------------------------------------------------------
2023-10-17 23:08:55,086 epoch 9 - iter 260/2606 - loss 0.01290092 - time (sec): 28.68 - samples/sec: 1357.88 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:09:22,049 epoch 9 - iter 520/2606 - loss 0.01379589 - time (sec): 55.64 - samples/sec: 1348.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:09:50,080 epoch 9 - iter 780/2606 - loss 0.01434365 - time (sec): 83.68 - samples/sec: 1312.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:10:18,063 epoch 9 - iter 1040/2606 - loss 0.01448784 - time (sec): 111.66 - samples/sec: 1292.94 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:10:45,062 epoch 9 - iter 1300/2606 - loss 0.01451714 - time (sec): 138.66 - samples/sec: 1293.96 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:11:13,160 epoch 9 - iter 1560/2606 - loss 0.01441414 - time (sec): 166.75 - samples/sec: 1300.35 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:11:40,264 epoch 9 - iter 1820/2606 - loss 0.01475688 - time (sec): 193.86 - samples/sec: 1310.23 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:12:08,004 epoch 9 - iter 2080/2606 - loss 0.01490181 - time (sec): 221.60 - samples/sec: 1320.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:12:36,041 epoch 9 - iter 2340/2606 - loss 0.01479776 - time (sec): 249.64 - samples/sec: 1321.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:13:05,286 epoch 9 - iter 2600/2606 - loss 0.01511359 - time (sec): 278.88 - samples/sec: 1314.75 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:13:05,943 ----------------------------------------------------------------------------------------------------
2023-10-17 23:13:05,944 EPOCH 9 done: loss 0.0151 - lr: 0.000006
2023-10-17 23:13:18,078 DEV : loss 0.4638948440551758 - f1-score (micro avg) 0.4106
2023-10-17 23:13:18,144 saving best model
2023-10-17 23:13:19,694 ----------------------------------------------------------------------------------------------------
2023-10-17 23:13:50,439 epoch 10 - iter 260/2606 - loss 0.00890757 - time (sec): 30.74 - samples/sec: 1219.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:14:18,832 epoch 10 - iter 520/2606 - loss 0.00864088 - time (sec): 59.14 - samples/sec: 1260.06 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:14:47,585 epoch 10 - iter 780/2606 - loss 0.01049821 - time (sec): 87.89 - samples/sec: 1243.94 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:15:14,969 epoch 10 - iter 1040/2606 - loss 0.01013399 - time (sec): 115.27 - samples/sec: 1247.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:15:42,175 epoch 10 - iter 1300/2606 - loss 0.00906366 - time (sec): 142.48 - samples/sec: 1286.90 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:16:09,614 epoch 10 - iter 1560/2606 - loss 0.00909100 - time (sec): 169.92 - samples/sec: 1297.96 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:16:39,396 epoch 10 - iter 1820/2606 - loss 0.00925267 - time (sec): 199.70 - samples/sec: 1299.91 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:17:07,503 epoch 10 - iter 2080/2606 - loss 0.00919351 - time (sec): 227.81 - samples/sec: 1297.14 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:17:33,875 epoch 10 - iter 2340/2606 - loss 0.00928314 - time (sec): 254.18 - samples/sec: 1298.02 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:17:59,956 epoch 10 - iter 2600/2606 - loss 0.00932223 - time (sec): 280.26 - samples/sec: 1308.39 - lr: 0.000000 - momentum: 0.000000
2023-10-17 23:18:00,484 ----------------------------------------------------------------------------------------------------
2023-10-17 23:18:00,484 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 23:18:12,656 DEV : loss 0.48719704151153564 - f1-score (micro avg) 0.3936
2023-10-17 23:18:13,293 ----------------------------------------------------------------------------------------------------
2023-10-17 23:18:13,295 Loading model from best epoch ...
2023-10-17 23:18:15,665 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 23:18:34,724
Results:
- F-score (micro) 0.4853
- F-score (macro) 0.3233
- Accuracy 0.3255
By class:
              precision    recall  f1-score   support

         LOC     0.5263    0.6343    0.5753      1214
         PER     0.3903    0.5087    0.4417       808
         ORG     0.2745    0.2776    0.2761       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4439    0.5351    0.4853      2390
   macro avg     0.2978    0.3551    0.3233      2390
weighted avg     0.4398    0.5351    0.4823      2390
2023-10-17 23:18:34,724 ----------------------------------------------------------------------------------------------------
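The summary scores in the final results can be cross-checked against the per-class rows. The following is a minimal sketch that uses only values printed in this log; the variable and function names are illustrative and not part of Flair. It assumes the usual definitions: micro F1 is the harmonic mean of micro-averaged precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores.

```python
# Per-class F1 scores copied from the "By class" table in this log.
per_class_f1 = {"LOC": 0.5753, "PER": 0.4417, "ORG": 0.2761, "HumanProd": 0.0000}

# Micro-averaged precision and recall from the "micro avg" row.
micro_precision = 0.4439
micro_recall = 0.5351


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


micro_f1 = f1(micro_precision, micro_recall)
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"micro F1: {micro_f1:.4f}")
print(f"macro F1: {macro_f1:.4f}")
```

Both values agree with the reported 0.4853 and 0.3233 up to rounding of the four-decimal inputs. The macro score is pulled down by HumanProd, which has only 15 test entities and an F1 of 0.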