2023-10-17 20:05:58,315 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,316 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
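The 21-way linear head printed above corresponds to the BIOES tag set over the five HIPE-2020 entity types (loc, pers, org, time, prod) plus the O tag: 5 × 4 + 1 = 21. A quick illustrative sketch of that correspondence (not part of the original log; Flair builds the actual dictionary from the corpus):

```python
# Illustrative only: the 21 output classes behind Linear(in_features=768, out_features=21).
entity_types = ["loc", "pers", "org", "time", "prod"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
assert len(tags) == 5 * 4 + 1 == 21
print(tags)  # O, S-loc, B-loc, E-loc, I-loc, S-pers, ...
```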
2023-10-17 20:05:58,316 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,316 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Train: 5901 sentences
2023-10-17 20:05:58,317 (train_with_dev=False, train_with_test=False)
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Training Params:
2023-10-17 20:05:58,317 - learning_rate: "3e-05"
2023-10-17 20:05:58,317 - mini_batch_size: "4"
2023-10-17 20:05:58,317 - max_epochs: "10"
2023-10-17 20:05:58,317 - shuffle: "True"
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Plugins:
2023-10-17 20:05:58,317 - TensorboardLogger
2023-10-17 20:05:58,317 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:05:58,317 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Computation:
2023-10-17 20:05:58,317 - compute on device: cuda:0
2023-10-17 20:05:58,317 - embedding storage: none
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:58,317 Logging anything other than scalars to TensorBoard is currently not supported.
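A minimal sketch (assumed, not the original training script) of a Flair fine-tuning run that would produce a log like this one. The embedding checkpoint name is a placeholder taken from the base path above, and the corpus keyword arguments are assumptions inferred from the dataset path; `fine_tune` applies a linear LR schedule with warmup by default, matching the LinearScheduler plugin listed above.

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2020 French subset of HIPE-2022; keyword arguments assumed from the dataset path above.
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="teams-base-historic-multilingual-discriminator",  # placeholder: substitute the actual hmTEAMS checkpoint
    layers="-1",               # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)
```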
2023-10-17 20:06:05,703 epoch 1 - iter 147/1476 - loss 2.88376740 - time (sec): 7.38 - samples/sec: 2399.29 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:06:12,547 epoch 1 - iter 294/1476 - loss 1.83405663 - time (sec): 14.23 - samples/sec: 2329.66 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:06:20,094 epoch 1 - iter 441/1476 - loss 1.34685714 - time (sec): 21.78 - samples/sec: 2364.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:06:27,578 epoch 1 - iter 588/1476 - loss 1.08468691 - time (sec): 29.26 - samples/sec: 2373.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:06:34,521 epoch 1 - iter 735/1476 - loss 0.93275820 - time (sec): 36.20 - samples/sec: 2362.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:06:41,355 epoch 1 - iter 882/1476 - loss 0.83450367 - time (sec): 43.04 - samples/sec: 2330.69 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:06:48,377 epoch 1 - iter 1029/1476 - loss 0.75458015 - time (sec): 50.06 - samples/sec: 2317.34 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:06:55,693 epoch 1 - iter 1176/1476 - loss 0.68725226 - time (sec): 57.37 - samples/sec: 2303.13 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:07:03,442 epoch 1 - iter 1323/1476 - loss 0.63285062 - time (sec): 65.12 - samples/sec: 2281.72 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:07:10,621 epoch 1 - iter 1470/1476 - loss 0.58387807 - time (sec): 72.30 - samples/sec: 2294.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:07:10,883 ----------------------------------------------------------------------------------------------------
2023-10-17 20:07:10,883 EPOCH 1 done: loss 0.5825 - lr: 0.000030
2023-10-17 20:07:17,234 DEV : loss 0.12818463146686554 - f1-score (micro avg) 0.7213
2023-10-17 20:07:17,263 saving best model
2023-10-17 20:07:17,633 ----------------------------------------------------------------------------------------------------
2023-10-17 20:07:24,643 epoch 2 - iter 147/1476 - loss 0.13820773 - time (sec): 7.01 - samples/sec: 2385.67 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:07:32,026 epoch 2 - iter 294/1476 - loss 0.13981224 - time (sec): 14.39 - samples/sec: 2427.68 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:07:39,398 epoch 2 - iter 441/1476 - loss 0.13892246 - time (sec): 21.76 - samples/sec: 2406.79 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:07:46,926 epoch 2 - iter 588/1476 - loss 0.13506201 - time (sec): 29.29 - samples/sec: 2319.51 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:07:54,318 epoch 2 - iter 735/1476 - loss 0.13369013 - time (sec): 36.68 - samples/sec: 2242.62 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:08:01,477 epoch 2 - iter 882/1476 - loss 0.13371218 - time (sec): 43.84 - samples/sec: 2227.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:08:09,013 epoch 2 - iter 1029/1476 - loss 0.13126252 - time (sec): 51.38 - samples/sec: 2221.33 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:08:16,524 epoch 2 - iter 1176/1476 - loss 0.13158893 - time (sec): 58.89 - samples/sec: 2215.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:08:24,525 epoch 2 - iter 1323/1476 - loss 0.13111484 - time (sec): 66.89 - samples/sec: 2220.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:08:31,884 epoch 2 - iter 1470/1476 - loss 0.13044166 - time (sec): 74.25 - samples/sec: 2233.50 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:08:32,152 ----------------------------------------------------------------------------------------------------
2023-10-17 20:08:32,153 EPOCH 2 done: loss 0.1302 - lr: 0.000027
2023-10-17 20:08:43,623 DEV : loss 0.11906815320253372 - f1-score (micro avg) 0.8161
2023-10-17 20:08:43,656 saving best model
2023-10-17 20:08:44,148 ----------------------------------------------------------------------------------------------------
2023-10-17 20:08:51,602 epoch 3 - iter 147/1476 - loss 0.06487959 - time (sec): 7.45 - samples/sec: 2371.36 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:08:58,751 epoch 3 - iter 294/1476 - loss 0.07490643 - time (sec): 14.60 - samples/sec: 2407.47 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:09:05,781 epoch 3 - iter 441/1476 - loss 0.07115129 - time (sec): 21.63 - samples/sec: 2400.27 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:09:12,619 epoch 3 - iter 588/1476 - loss 0.07499880 - time (sec): 28.47 - samples/sec: 2387.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:09:19,700 epoch 3 - iter 735/1476 - loss 0.08049723 - time (sec): 35.55 - samples/sec: 2374.58 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:09:26,769 epoch 3 - iter 882/1476 - loss 0.08104214 - time (sec): 42.62 - samples/sec: 2339.21 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:09:34,338 epoch 3 - iter 1029/1476 - loss 0.08301973 - time (sec): 50.19 - samples/sec: 2344.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:09:41,499 epoch 3 - iter 1176/1476 - loss 0.08364485 - time (sec): 57.35 - samples/sec: 2332.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:09:48,739 epoch 3 - iter 1323/1476 - loss 0.08295442 - time (sec): 64.59 - samples/sec: 2321.23 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:09:56,409 epoch 3 - iter 1470/1476 - loss 0.08380446 - time (sec): 72.26 - samples/sec: 2296.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:09:56,681 ----------------------------------------------------------------------------------------------------
2023-10-17 20:09:56,681 EPOCH 3 done: loss 0.0838 - lr: 0.000023
2023-10-17 20:10:08,037 DEV : loss 0.1379304975271225 - f1-score (micro avg) 0.8223
2023-10-17 20:10:08,071 saving best model
2023-10-17 20:10:08,547 ----------------------------------------------------------------------------------------------------
2023-10-17 20:10:15,658 epoch 4 - iter 147/1476 - loss 0.05625078 - time (sec): 7.11 - samples/sec: 2241.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:10:23,083 epoch 4 - iter 294/1476 - loss 0.05457959 - time (sec): 14.53 - samples/sec: 2321.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:10:30,071 epoch 4 - iter 441/1476 - loss 0.05927497 - time (sec): 21.52 - samples/sec: 2279.64 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:10:37,495 epoch 4 - iter 588/1476 - loss 0.05943946 - time (sec): 28.94 - samples/sec: 2265.71 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:10:45,002 epoch 4 - iter 735/1476 - loss 0.06174984 - time (sec): 36.45 - samples/sec: 2202.49 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:10:52,447 epoch 4 - iter 882/1476 - loss 0.05984732 - time (sec): 43.90 - samples/sec: 2211.74 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:10:59,329 epoch 4 - iter 1029/1476 - loss 0.05718144 - time (sec): 50.78 - samples/sec: 2222.57 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:11:06,848 epoch 4 - iter 1176/1476 - loss 0.05612735 - time (sec): 58.30 - samples/sec: 2247.18 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:11:13,792 epoch 4 - iter 1323/1476 - loss 0.05566318 - time (sec): 65.24 - samples/sec: 2254.84 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:11:21,672 epoch 4 - iter 1470/1476 - loss 0.05600539 - time (sec): 73.12 - samples/sec: 2266.68 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:11:21,955 ----------------------------------------------------------------------------------------------------
2023-10-17 20:11:21,955 EPOCH 4 done: loss 0.0559 - lr: 0.000020
2023-10-17 20:11:33,291 DEV : loss 0.16167429089546204 - f1-score (micro avg) 0.846
2023-10-17 20:11:33,323 saving best model
2023-10-17 20:11:33,784 ----------------------------------------------------------------------------------------------------
2023-10-17 20:11:41,069 epoch 5 - iter 147/1476 - loss 0.03312893 - time (sec): 7.28 - samples/sec: 2445.55 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:11:47,781 epoch 5 - iter 294/1476 - loss 0.03390059 - time (sec): 13.99 - samples/sec: 2407.91 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:11:54,966 epoch 5 - iter 441/1476 - loss 0.03294051 - time (sec): 21.18 - samples/sec: 2384.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:12:02,147 epoch 5 - iter 588/1476 - loss 0.03900188 - time (sec): 28.36 - samples/sec: 2358.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:12:09,336 epoch 5 - iter 735/1476 - loss 0.03809314 - time (sec): 35.55 - samples/sec: 2359.36 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:12:16,735 epoch 5 - iter 882/1476 - loss 0.03709148 - time (sec): 42.95 - samples/sec: 2337.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:12:23,992 epoch 5 - iter 1029/1476 - loss 0.03763883 - time (sec): 50.21 - samples/sec: 2316.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:12:30,787 epoch 5 - iter 1176/1476 - loss 0.03885216 - time (sec): 57.00 - samples/sec: 2309.32 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:12:38,354 epoch 5 - iter 1323/1476 - loss 0.03762999 - time (sec): 64.57 - samples/sec: 2325.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:12:45,409 epoch 5 - iter 1470/1476 - loss 0.03723655 - time (sec): 71.62 - samples/sec: 2317.05 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:12:45,675 ----------------------------------------------------------------------------------------------------
2023-10-17 20:12:45,675 EPOCH 5 done: loss 0.0377 - lr: 0.000017
2023-10-17 20:12:57,231 DEV : loss 0.1790267527103424 - f1-score (micro avg) 0.8426
2023-10-17 20:12:57,261 ----------------------------------------------------------------------------------------------------
2023-10-17 20:13:04,516 epoch 6 - iter 147/1476 - loss 0.02474197 - time (sec): 7.25 - samples/sec: 2182.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:13:11,743 epoch 6 - iter 294/1476 - loss 0.02176561 - time (sec): 14.48 - samples/sec: 2281.17 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:13:19,002 epoch 6 - iter 441/1476 - loss 0.02055555 - time (sec): 21.74 - samples/sec: 2289.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:13:26,628 epoch 6 - iter 588/1476 - loss 0.02265555 - time (sec): 29.37 - samples/sec: 2240.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:13:33,937 epoch 6 - iter 735/1476 - loss 0.02383975 - time (sec): 36.68 - samples/sec: 2233.14 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:13:41,057 epoch 6 - iter 882/1476 - loss 0.02491593 - time (sec): 43.80 - samples/sec: 2228.19 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:13:48,543 epoch 6 - iter 1029/1476 - loss 0.02373826 - time (sec): 51.28 - samples/sec: 2242.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:13:55,584 epoch 6 - iter 1176/1476 - loss 0.02425560 - time (sec): 58.32 - samples/sec: 2255.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:14:02,711 epoch 6 - iter 1323/1476 - loss 0.02426721 - time (sec): 65.45 - samples/sec: 2257.42 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:14:10,120 epoch 6 - iter 1470/1476 - loss 0.02573153 - time (sec): 72.86 - samples/sec: 2275.89 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:14:10,406 ----------------------------------------------------------------------------------------------------
2023-10-17 20:14:10,406 EPOCH 6 done: loss 0.0256 - lr: 0.000013
2023-10-17 20:14:21,960 DEV : loss 0.1823473572731018 - f1-score (micro avg) 0.8415
2023-10-17 20:14:21,993 ----------------------------------------------------------------------------------------------------
2023-10-17 20:14:29,433 epoch 7 - iter 147/1476 - loss 0.01049509 - time (sec): 7.44 - samples/sec: 2270.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:14:36,187 epoch 7 - iter 294/1476 - loss 0.01510363 - time (sec): 14.19 - samples/sec: 2341.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:14:43,513 epoch 7 - iter 441/1476 - loss 0.01732408 - time (sec): 21.52 - samples/sec: 2376.45 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:14:50,969 epoch 7 - iter 588/1476 - loss 0.01690878 - time (sec): 28.97 - samples/sec: 2369.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:14:58,196 epoch 7 - iter 735/1476 - loss 0.02010782 - time (sec): 36.20 - samples/sec: 2330.85 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:15:05,260 epoch 7 - iter 882/1476 - loss 0.02004143 - time (sec): 43.27 - samples/sec: 2338.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:15:12,552 epoch 7 - iter 1029/1476 - loss 0.01839535 - time (sec): 50.56 - samples/sec: 2313.93 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:15:19,971 epoch 7 - iter 1176/1476 - loss 0.01888196 - time (sec): 57.98 - samples/sec: 2313.87 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:15:27,136 epoch 7 - iter 1323/1476 - loss 0.01831645 - time (sec): 65.14 - samples/sec: 2320.63 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:15:33,941 epoch 7 - iter 1470/1476 - loss 0.01784839 - time (sec): 71.95 - samples/sec: 2305.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:15:34,199 ----------------------------------------------------------------------------------------------------
2023-10-17 20:15:34,200 EPOCH 7 done: loss 0.0180 - lr: 0.000010
2023-10-17 20:15:45,643 DEV : loss 0.19402551651000977 - f1-score (micro avg) 0.8431
2023-10-17 20:15:45,677 ----------------------------------------------------------------------------------------------------
2023-10-17 20:15:52,842 epoch 8 - iter 147/1476 - loss 0.01320226 - time (sec): 7.16 - samples/sec: 2278.37 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:16:00,302 epoch 8 - iter 294/1476 - loss 0.01657591 - time (sec): 14.62 - samples/sec: 2333.78 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:16:07,287 epoch 8 - iter 441/1476 - loss 0.01541718 - time (sec): 21.61 - samples/sec: 2302.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:16:14,178 epoch 8 - iter 588/1476 - loss 0.01364886 - time (sec): 28.50 - samples/sec: 2306.16 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:16:21,411 epoch 8 - iter 735/1476 - loss 0.01449721 - time (sec): 35.73 - samples/sec: 2315.68 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:16:28,411 epoch 8 - iter 882/1476 - loss 0.01320813 - time (sec): 42.73 - samples/sec: 2299.05 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:16:36,244 epoch 8 - iter 1029/1476 - loss 0.01499181 - time (sec): 50.57 - samples/sec: 2327.41 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:16:43,224 epoch 8 - iter 1176/1476 - loss 0.01478780 - time (sec): 57.55 - samples/sec: 2319.55 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:16:50,451 epoch 8 - iter 1323/1476 - loss 0.01429358 - time (sec): 64.77 - samples/sec: 2320.02 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:16:57,506 epoch 8 - iter 1470/1476 - loss 0.01418919 - time (sec): 71.83 - samples/sec: 2303.65 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:16:57,853 ----------------------------------------------------------------------------------------------------
2023-10-17 20:16:57,853 EPOCH 8 done: loss 0.0141 - lr: 0.000007
2023-10-17 20:17:09,346 DEV : loss 0.20007802546024323 - f1-score (micro avg) 0.8496
2023-10-17 20:17:09,379 saving best model
2023-10-17 20:17:09,862 ----------------------------------------------------------------------------------------------------
2023-10-17 20:17:17,475 epoch 9 - iter 147/1476 - loss 0.00436533 - time (sec): 7.61 - samples/sec: 2367.42 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:17:24,847 epoch 9 - iter 294/1476 - loss 0.00510740 - time (sec): 14.98 - samples/sec: 2429.53 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:17:32,428 epoch 9 - iter 441/1476 - loss 0.00763822 - time (sec): 22.56 - samples/sec: 2423.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:17:39,465 epoch 9 - iter 588/1476 - loss 0.00743456 - time (sec): 29.60 - samples/sec: 2361.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:47,117 epoch 9 - iter 735/1476 - loss 0.00689757 - time (sec): 37.25 - samples/sec: 2308.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:17:54,005 epoch 9 - iter 882/1476 - loss 0.00620989 - time (sec): 44.14 - samples/sec: 2316.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:18:00,853 epoch 9 - iter 1029/1476 - loss 0.00730072 - time (sec): 50.99 - samples/sec: 2301.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:07,914 epoch 9 - iter 1176/1476 - loss 0.00714572 - time (sec): 58.05 - samples/sec: 2289.46 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:15,625 epoch 9 - iter 1323/1476 - loss 0.00707158 - time (sec): 65.76 - samples/sec: 2299.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:18:22,423 epoch 9 - iter 1470/1476 - loss 0.00773284 - time (sec): 72.56 - samples/sec: 2283.54 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:18:22,726 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:22,726 EPOCH 9 done: loss 0.0078 - lr: 0.000003
2023-10-17 20:18:34,319 DEV : loss 0.2041151374578476 - f1-score (micro avg) 0.8596
2023-10-17 20:18:34,349 saving best model
2023-10-17 20:18:34,828 ----------------------------------------------------------------------------------------------------
2023-10-17 20:18:42,621 epoch 10 - iter 147/1476 - loss 0.00709689 - time (sec): 7.79 - samples/sec: 2535.00 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:18:50,003 epoch 10 - iter 294/1476 - loss 0.00608595 - time (sec): 15.17 - samples/sec: 2439.34 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:18:57,062 epoch 10 - iter 441/1476 - loss 0.00472843 - time (sec): 22.23 - samples/sec: 2396.12 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:19:04,224 epoch 10 - iter 588/1476 - loss 0.00501675 - time (sec): 29.39 - samples/sec: 2314.37 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:19:11,140 epoch 10 - iter 735/1476 - loss 0.00493818 - time (sec): 36.31 - samples/sec: 2306.06 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:19:18,274 epoch 10 - iter 882/1476 - loss 0.00479554 - time (sec): 43.44 - samples/sec: 2297.04 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:19:25,625 epoch 10 - iter 1029/1476 - loss 0.00501958 - time (sec): 50.79 - samples/sec: 2305.35 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:19:32,682 epoch 10 - iter 1176/1476 - loss 0.00595793 - time (sec): 57.85 - samples/sec: 2291.26 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:19:39,780 epoch 10 - iter 1323/1476 - loss 0.00561353 - time (sec): 64.95 - samples/sec: 2284.04 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:19:47,478 epoch 10 - iter 1470/1476 - loss 0.00566830 - time (sec): 72.65 - samples/sec: 2283.76 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:19:47,747 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:47,748 EPOCH 10 done: loss 0.0057 - lr: 0.000000
2023-10-17 20:19:59,001 DEV : loss 0.2035822868347168 - f1-score (micro avg) 0.8602
2023-10-17 20:19:59,031 saving best model
2023-10-17 20:19:59,889 ----------------------------------------------------------------------------------------------------
2023-10-17 20:19:59,891 Loading model from best epoch ...
2023-10-17 20:20:01,237 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 20:20:07,284
Results:
- F-score (micro) 0.7934
- F-score (macro) 0.7114
- Accuracy 0.6758

By class:
              precision    recall  f1-score   support

         loc     0.8474    0.8671    0.8571       858
        pers     0.7487    0.8045    0.7756       537
         org     0.5329    0.6136    0.5704       132
        prod     0.7500    0.7377    0.7438        61
        time     0.5625    0.6667    0.6102        54

   micro avg     0.7730    0.8149    0.7934      1642
   macro avg     0.6883    0.7379    0.7114      1642
weighted avg     0.7768    0.8149    0.7951      1642

2023-10-17 20:20:07,284 ----------------------------------------------------------------------------------------------------
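A minimal usage sketch (assumed, not part of the log) for loading the saved best model and tagging text with it; the example sentence is purely illustrative.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written during training ("saving best model" above).
tagger = SequenceTagger.load(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Le Conseil fédéral s'est réuni à Berne .")
tagger.predict(sentence)

# Print the predicted entity spans (loc, pers, org, time, prod).
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)
```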