2023-10-17 10:02:06,493 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,495 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:02:06,495 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,495 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-17 10:02:06,495 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,495 Train: 1214 sentences
2023-10-17 10:02:06,495 (train_with_dev=False, train_with_test=False)
2023-10-17 10:02:06,495 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,495 Training Params:
2023-10-17 10:02:06,495 - learning_rate: "3e-05"
2023-10-17 10:02:06,495 - mini_batch_size: "4"
2023-10-17 10:02:06,495 - max_epochs: "10"
2023-10-17 10:02:06,495 - shuffle: "True"
2023-10-17 10:02:06,495 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,495 Plugins:
2023-10-17 10:02:06,496 - TensorboardLogger
2023-10-17 10:02:06,496 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:02:06,496 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,496 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:02:06,496 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:02:06,496 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,496 Computation:
2023-10-17 10:02:06,496 - compute on device: cuda:0
2023-10-17 10:02:06,496 - embedding storage: none
2023-10-17 10:02:06,496 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,496 Model training base path: "hmbench-ajmc/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 10:02:06,496 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,496 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,496 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:02:07,999 epoch 1 - iter 30/304 - loss 3.91471408 - time (sec): 1.50 - samples/sec: 1893.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:02:09,519 epoch 1 - iter 60/304 - loss 3.15356945 - time (sec): 3.02 - samples/sec: 1988.92 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:02:11,060 epoch 1 - iter 90/304 - loss 2.42737450 - time (sec): 4.56 - samples/sec: 1973.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:02:12,632 epoch 1 - iter 120/304 - loss 1.92457254 - time (sec): 6.13 - samples/sec: 1974.94 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:02:14,193 epoch 1 - iter 150/304 - loss 1.63369367 - time (sec): 7.70 - samples/sec: 1953.71 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:02:15,749 epoch 1 - iter 180/304 - loss 1.41758456 - time (sec): 9.25 - samples/sec: 1965.58 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:02:17,275 epoch 1 - iter 210/304 - loss 1.24794432 - time (sec): 10.78 - samples/sec: 2005.91 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:02:18,649 epoch 1 - iter 240/304 - loss 1.12096737 - time (sec): 12.15 - samples/sec: 2033.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:02:20,022 epoch 1 - iter 270/304 - loss 1.02357630 - time (sec): 13.52 - samples/sec: 2061.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:02:21,387 epoch 1 - iter 300/304 - loss 0.94687326 - time (sec): 14.89 - samples/sec: 2061.06 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:02:21,564 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:21,564 EPOCH 1 done: loss 0.9389 - lr: 0.000030
2023-10-17 10:02:22,726 DEV : loss 0.21260005235671997 - f1-score (micro avg) 0.5972
2023-10-17 10:02:22,733 saving best model
2023-10-17 10:02:23,111 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:24,494 epoch 2 - iter 30/304 - loss 0.17504565 - time (sec): 1.38 - samples/sec: 2386.47 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:02:25,873 epoch 2 - iter 60/304 - loss 0.19569853 - time (sec): 2.76 - samples/sec: 2275.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:02:27,289 epoch 2 - iter 90/304 - loss 0.17193454 - time (sec): 4.18 - samples/sec: 2233.12 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:02:28,670 epoch 2 - iter 120/304 - loss 0.16934135 - time (sec): 5.56 - samples/sec: 2241.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:02:30,052 epoch 2 - iter 150/304 - loss 0.17176923 - time (sec): 6.94 - samples/sec: 2241.61 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:02:31,440 epoch 2 - iter 180/304 - loss 0.16894244 - time (sec): 8.33 - samples/sec: 2262.59 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:02:32,802 epoch 2 - iter 210/304 - loss 0.15988629 - time (sec): 9.69 - samples/sec: 2220.87 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:02:34,148 epoch 2 - iter 240/304 - loss 0.15375049 - time (sec): 11.03 - samples/sec: 2254.22 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:02:35,519 epoch 2 - iter 270/304 - loss 0.14916921 - time (sec): 12.41 - samples/sec: 2258.76 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:02:36,884 epoch 2 - iter 300/304 - loss 0.14765257 - time (sec): 13.77 - samples/sec: 2227.32 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:02:37,069 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:37,069 EPOCH 2 done: loss 0.1465 - lr: 0.000027
2023-10-17 10:02:37,993 DEV : loss 0.16393417119979858 - f1-score (micro avg) 0.7701
2023-10-17 10:02:37,999 saving best model
2023-10-17 10:02:38,446 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:39,816 epoch 3 - iter 30/304 - loss 0.10374105 - time (sec): 1.37 - samples/sec: 2245.87 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:02:41,184 epoch 3 - iter 60/304 - loss 0.08006234 - time (sec): 2.74 - samples/sec: 2234.77 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:02:42,568 epoch 3 - iter 90/304 - loss 0.08767063 - time (sec): 4.12 - samples/sec: 2256.37 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:02:43,944 epoch 3 - iter 120/304 - loss 0.09068705 - time (sec): 5.50 - samples/sec: 2235.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:02:45,302 epoch 3 - iter 150/304 - loss 0.09135349 - time (sec): 6.85 - samples/sec: 2205.45 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:02:46,662 epoch 3 - iter 180/304 - loss 0.08696189 - time (sec): 8.21 - samples/sec: 2184.25 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:02:48,049 epoch 3 - iter 210/304 - loss 0.08741350 - time (sec): 9.60 - samples/sec: 2183.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:02:49,439 epoch 3 - iter 240/304 - loss 0.08724010 - time (sec): 10.99 - samples/sec: 2213.31 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:02:50,823 epoch 3 - iter 270/304 - loss 0.09120280 - time (sec): 12.37 - samples/sec: 2226.75 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:02:52,184 epoch 3 - iter 300/304 - loss 0.09523846 - time (sec): 13.74 - samples/sec: 2225.39 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:02:52,362 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:52,362 EPOCH 3 done: loss 0.0944 - lr: 0.000023
2023-10-17 10:02:53,289 DEV : loss 0.16942133009433746 - f1-score (micro avg) 0.8286
2023-10-17 10:02:53,296 saving best model
2023-10-17 10:02:53,760 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:55,191 epoch 4 - iter 30/304 - loss 0.04129966 - time (sec): 1.43 - samples/sec: 2476.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:02:56,590 epoch 4 - iter 60/304 - loss 0.06484886 - time (sec): 2.83 - samples/sec: 2319.89 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:02:57,960 epoch 4 - iter 90/304 - loss 0.06438072 - time (sec): 4.20 - samples/sec: 2273.59 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:02:59,338 epoch 4 - iter 120/304 - loss 0.06760369 - time (sec): 5.57 - samples/sec: 2226.87 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:03:00,715 epoch 4 - iter 150/304 - loss 0.06231282 - time (sec): 6.95 - samples/sec: 2234.11 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:03:02,062 epoch 4 - iter 180/304 - loss 0.06648038 - time (sec): 8.30 - samples/sec: 2255.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:03:03,426 epoch 4 - iter 210/304 - loss 0.06648870 - time (sec): 9.66 - samples/sec: 2253.44 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:03:04,800 epoch 4 - iter 240/304 - loss 0.06551934 - time (sec): 11.04 - samples/sec: 2247.17 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:03:06,155 epoch 4 - iter 270/304 - loss 0.06434722 - time (sec): 12.39 - samples/sec: 2241.03 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:03:07,521 epoch 4 - iter 300/304 - loss 0.06561613 - time (sec): 13.76 - samples/sec: 2221.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:03:07,701 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:07,701 EPOCH 4 done: loss 0.0668 - lr: 0.000020
2023-10-17 10:03:08,890 DEV : loss 0.1767215132713318 - f1-score (micro avg) 0.8341
2023-10-17 10:03:08,900 saving best model
2023-10-17 10:03:09,405 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:11,078 epoch 5 - iter 30/304 - loss 0.05590618 - time (sec): 1.67 - samples/sec: 1906.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:03:12,658 epoch 5 - iter 60/304 - loss 0.05771895 - time (sec): 3.25 - samples/sec: 1878.22 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:03:14,217 epoch 5 - iter 90/304 - loss 0.05929665 - time (sec): 4.81 - samples/sec: 1968.46 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:03:15,769 epoch 5 - iter 120/304 - loss 0.05479179 - time (sec): 6.36 - samples/sec: 1974.78 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:03:17,304 epoch 5 - iter 150/304 - loss 0.05431980 - time (sec): 7.89 - samples/sec: 1985.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:03:18,698 epoch 5 - iter 180/304 - loss 0.05638198 - time (sec): 9.29 - samples/sec: 2031.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:03:20,046 epoch 5 - iter 210/304 - loss 0.05557100 - time (sec): 10.63 - samples/sec: 2041.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:03:21,448 epoch 5 - iter 240/304 - loss 0.05729684 - time (sec): 12.04 - samples/sec: 2075.15 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:03:22,833 epoch 5 - iter 270/304 - loss 0.05637455 - time (sec): 13.42 - samples/sec: 2074.00 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:03:24,197 epoch 5 - iter 300/304 - loss 0.05370632 - time (sec): 14.79 - samples/sec: 2077.13 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:03:24,375 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:24,375 EPOCH 5 done: loss 0.0533 - lr: 0.000017
2023-10-17 10:03:25,363 DEV : loss 0.17217454314231873 - f1-score (micro avg) 0.849
2023-10-17 10:03:25,369 saving best model
2023-10-17 10:03:25,859 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:27,287 epoch 6 - iter 30/304 - loss 0.04404318 - time (sec): 1.43 - samples/sec: 2140.42 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:03:28,747 epoch 6 - iter 60/304 - loss 0.03709370 - time (sec): 2.89 - samples/sec: 2158.44 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:03:30,146 epoch 6 - iter 90/304 - loss 0.03226865 - time (sec): 4.28 - samples/sec: 2100.40 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:03:31,480 epoch 6 - iter 120/304 - loss 0.03019694 - time (sec): 5.62 - samples/sec: 2202.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:03:32,801 epoch 6 - iter 150/304 - loss 0.02705743 - time (sec): 6.94 - samples/sec: 2191.22 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:03:34,118 epoch 6 - iter 180/304 - loss 0.03498976 - time (sec): 8.26 - samples/sec: 2232.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:03:35,433 epoch 6 - iter 210/304 - loss 0.03557122 - time (sec): 9.57 - samples/sec: 2205.32 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:03:36,741 epoch 6 - iter 240/304 - loss 0.03641619 - time (sec): 10.88 - samples/sec: 2265.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:03:38,065 epoch 6 - iter 270/304 - loss 0.03625595 - time (sec): 12.20 - samples/sec: 2276.50 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:03:39,381 epoch 6 - iter 300/304 - loss 0.03763049 - time (sec): 13.52 - samples/sec: 2275.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:03:39,549 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:39,549 EPOCH 6 done: loss 0.0374 - lr: 0.000013
2023-10-17 10:03:40,537 DEV : loss 0.18530553579330444 - f1-score (micro avg) 0.8527
2023-10-17 10:03:40,544 saving best model
2023-10-17 10:03:41,042 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:42,342 epoch 7 - iter 30/304 - loss 0.03976232 - time (sec): 1.30 - samples/sec: 2304.43 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:03:43,660 epoch 7 - iter 60/304 - loss 0.02405073 - time (sec): 2.62 - samples/sec: 2395.83 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:03:44,965 epoch 7 - iter 90/304 - loss 0.02777625 - time (sec): 3.92 - samples/sec: 2278.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:03:46,339 epoch 7 - iter 120/304 - loss 0.03062262 - time (sec): 5.29 - samples/sec: 2293.00 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:03:47,717 epoch 7 - iter 150/304 - loss 0.03593935 - time (sec): 6.67 - samples/sec: 2291.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:03:49,104 epoch 7 - iter 180/304 - loss 0.03452574 - time (sec): 8.06 - samples/sec: 2282.21 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:03:50,497 epoch 7 - iter 210/304 - loss 0.03181745 - time (sec): 9.45 - samples/sec: 2279.81 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:03:51,923 epoch 7 - iter 240/304 - loss 0.03048670 - time (sec): 10.88 - samples/sec: 2268.64 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:03:53,315 epoch 7 - iter 270/304 - loss 0.03055172 - time (sec): 12.27 - samples/sec: 2249.14 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:03:54,687 epoch 7 - iter 300/304 - loss 0.03050690 - time (sec): 13.64 - samples/sec: 2248.61 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:03:54,885 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:54,885 EPOCH 7 done: loss 0.0302 - lr: 0.000010
2023-10-17 10:03:55,864 DEV : loss 0.19813692569732666 - f1-score (micro avg) 0.8565
2023-10-17 10:03:55,870 saving best model
2023-10-17 10:03:56,356 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:58,046 epoch 8 - iter 30/304 - loss 0.03823214 - time (sec): 1.69 - samples/sec: 1691.31 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:03:59,768 epoch 8 - iter 60/304 - loss 0.02482736 - time (sec): 3.41 - samples/sec: 1704.41 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:04:01,384 epoch 8 - iter 90/304 - loss 0.01791624 - time (sec): 5.03 - samples/sec: 1781.49 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:04:02,988 epoch 8 - iter 120/304 - loss 0.01629216 - time (sec): 6.63 - samples/sec: 1797.58 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:04:04,571 epoch 8 - iter 150/304 - loss 0.01543196 - time (sec): 8.21 - samples/sec: 1841.85 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:04:06,151 epoch 8 - iter 180/304 - loss 0.01356072 - time (sec): 9.79 - samples/sec: 1857.55 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:04:07,705 epoch 8 - iter 210/304 - loss 0.01371063 - time (sec): 11.35 - samples/sec: 1872.20 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:04:09,297 epoch 8 - iter 240/304 - loss 0.01484184 - time (sec): 12.94 - samples/sec: 1895.52 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:04:10,861 epoch 8 - iter 270/304 - loss 0.02034814 - time (sec): 14.50 - samples/sec: 1902.70 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:04:12,436 epoch 8 - iter 300/304 - loss 0.02174712 - time (sec): 16.08 - samples/sec: 1902.59 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:04:12,647 ----------------------------------------------------------------------------------------------------
2023-10-17 10:04:12,647 EPOCH 8 done: loss 0.0215 - lr: 0.000007
2023-10-17 10:04:13,607 DEV : loss 0.19218282401561737 - f1-score (micro avg) 0.849
2023-10-17 10:04:13,616 ----------------------------------------------------------------------------------------------------
2023-10-17 10:04:15,165 epoch 9 - iter 30/304 - loss 0.01048409 - time (sec): 1.55 - samples/sec: 1950.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:04:16,690 epoch 9 - iter 60/304 - loss 0.01542063 - time (sec): 3.07 - samples/sec: 2014.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:04:18,203 epoch 9 - iter 90/304 - loss 0.01289720 - time (sec): 4.59 - samples/sec: 2041.18 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:04:19,591 epoch 9 - iter 120/304 - loss 0.01607044 - time (sec): 5.97 - samples/sec: 2067.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:04:20,997 epoch 9 - iter 150/304 - loss 0.01530081 - time (sec): 7.38 - samples/sec: 2054.90 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:04:22,540 epoch 9 - iter 180/304 - loss 0.01745658 - time (sec): 8.92 - samples/sec: 2057.70 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:04:24,203 epoch 9 - iter 210/304 - loss 0.01717509 - time (sec): 10.59 - samples/sec: 2031.59 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:04:25,826 epoch 9 - iter 240/304 - loss 0.01617675 - time (sec): 12.21 - samples/sec: 2003.32 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:04:27,432 epoch 9 - iter 270/304 - loss 0.01706249 - time (sec): 13.82 - samples/sec: 1993.05 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:04:29,064 epoch 9 - iter 300/304 - loss 0.01619386 - time (sec): 15.45 - samples/sec: 1982.08 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:04:29,272 ----------------------------------------------------------------------------------------------------
2023-10-17 10:04:29,272 EPOCH 9 done: loss 0.0161 - lr: 0.000003
2023-10-17 10:04:30,289 DEV : loss 0.1948387622833252 - f1-score (micro avg) 0.8487
2023-10-17 10:04:30,303 ----------------------------------------------------------------------------------------------------
2023-10-17 10:04:31,889 epoch 10 - iter 30/304 - loss 0.01311954 - time (sec): 1.58 - samples/sec: 2032.56 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:04:33,472 epoch 10 - iter 60/304 - loss 0.01765805 - time (sec): 3.17 - samples/sec: 1948.18 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:04:35,109 epoch 10 - iter 90/304 - loss 0.01320737 - time (sec): 4.80 - samples/sec: 1883.58 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:04:36,741 epoch 10 - iter 120/304 - loss 0.01166367 - time (sec): 6.44 - samples/sec: 1927.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:04:38,344 epoch 10 - iter 150/304 - loss 0.00951810 - time (sec): 8.04 - samples/sec: 1947.82 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:04:39,964 epoch 10 - iter 180/304 - loss 0.01010278 - time (sec): 9.66 - samples/sec: 1936.51 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:04:41,533 epoch 10 - iter 210/304 - loss 0.00992968 - time (sec): 11.23 - samples/sec: 1921.63 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:04:43,132 epoch 10 - iter 240/304 - loss 0.01162759 - time (sec): 12.83 - samples/sec: 1912.66 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:04:44,736 epoch 10 - iter 270/304 - loss 0.01275249 - time (sec): 14.43 - samples/sec: 1908.14 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:04:46,364 epoch 10 - iter 300/304 - loss 0.01301596 - time (sec): 16.06 - samples/sec: 1903.08 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:04:46,578 ----------------------------------------------------------------------------------------------------
2023-10-17 10:04:46,579 EPOCH 10 done: loss 0.0132 - lr: 0.000000
2023-10-17 10:04:47,632 DEV : loss 0.1933870017528534 - f1-score (micro avg) 0.8609
2023-10-17 10:04:47,640 saving best model
2023-10-17 10:04:48,444 ----------------------------------------------------------------------------------------------------
2023-10-17 10:04:48,446 Loading model from best epoch ...
2023-10-17 10:04:49,925 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-17 10:04:50,913
Results:
- F-score (micro) 0.8139
- F-score (macro) 0.6574
- Accuracy 0.691
By class:
              precision    recall  f1-score   support

       scope     0.7548    0.7748    0.7647       151
        pers     0.8182    0.9375    0.8738        96
        work     0.8155    0.8842    0.8485        95
         loc     1.0000    0.6667    0.8000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7876    0.8420    0.8139       348
   macro avg     0.6777    0.6526    0.6574       348
weighted avg     0.7845    0.8420    0.8114       348
2023-10-17 10:04:50,914 ----------------------------------------------------------------------------------------------------