|
2023-10-17 21:12:15,815 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,816 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 21:12:15,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,816 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences |
|
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator |
|
2023-10-17 21:12:15,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,816 Train: 5901 sentences |
|
2023-10-17 21:12:15,816 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 21:12:15,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,816 Training Params: |
|
2023-10-17 21:12:15,816 - learning_rate: "5e-05" |
|
2023-10-17 21:12:15,816 - mini_batch_size: "4" |
|
2023-10-17 21:12:15,816 - max_epochs: "10" |
|
2023-10-17 21:12:15,816 - shuffle: "True" |
|
2023-10-17 21:12:15,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,816 Plugins: |
|
2023-10-17 21:12:15,816 - TensorboardLogger |
|
2023-10-17 21:12:15,816 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 21:12:15,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,817 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 21:12:15,817 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 21:12:15,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,817 Computation: |
|
2023-10-17 21:12:15,817 - compute on device: cuda:0 |
|
2023-10-17 21:12:15,817 - embedding storage: none |
|
2023-10-17 21:12:15,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,817 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-17 21:12:15,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:12:15,817 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 21:12:22,757 epoch 1 - iter 147/1476 - loss 2.71106741 - time (sec): 6.94 - samples/sec: 2321.76 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 21:12:30,110 epoch 1 - iter 294/1476 - loss 1.53519710 - time (sec): 14.29 - samples/sec: 2468.89 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 21:12:37,137 epoch 1 - iter 441/1476 - loss 1.19741914 - time (sec): 21.32 - samples/sec: 2353.25 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 21:12:44,328 epoch 1 - iter 588/1476 - loss 0.98125061 - time (sec): 28.51 - samples/sec: 2350.64 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 21:12:51,162 epoch 1 - iter 735/1476 - loss 0.84406033 - time (sec): 35.34 - samples/sec: 2352.31 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 21:12:58,205 epoch 1 - iter 882/1476 - loss 0.74180713 - time (sec): 42.39 - samples/sec: 2351.66 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 21:13:05,428 epoch 1 - iter 1029/1476 - loss 0.66397153 - time (sec): 49.61 - samples/sec: 2348.85 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 21:13:12,269 epoch 1 - iter 1176/1476 - loss 0.60399966 - time (sec): 56.45 - samples/sec: 2341.54 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 21:13:19,290 epoch 1 - iter 1323/1476 - loss 0.55840279 - time (sec): 63.47 - samples/sec: 2347.10 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 21:13:26,560 epoch 1 - iter 1470/1476 - loss 0.52004827 - time (sec): 70.74 - samples/sec: 2342.55 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-17 21:13:26,880 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:13:26,880 EPOCH 1 done: loss 0.5189 - lr: 0.000050 |
|
2023-10-17 21:13:33,183 DEV : loss 0.15781794488430023 - f1-score (micro avg) 0.7134 |
|
2023-10-17 21:13:33,228 saving best model |
|
2023-10-17 21:13:33,643 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:13:40,926 epoch 2 - iter 147/1476 - loss 0.13618022 - time (sec): 7.28 - samples/sec: 2059.47 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 21:13:48,382 epoch 2 - iter 294/1476 - loss 0.14828668 - time (sec): 14.74 - samples/sec: 2265.73 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 21:13:55,425 epoch 2 - iter 441/1476 - loss 0.15814149 - time (sec): 21.78 - samples/sec: 2319.49 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 21:14:02,701 epoch 2 - iter 588/1476 - loss 0.15346701 - time (sec): 29.06 - samples/sec: 2360.96 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 21:14:09,918 epoch 2 - iter 735/1476 - loss 0.15116044 - time (sec): 36.27 - samples/sec: 2382.74 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 21:14:16,991 epoch 2 - iter 882/1476 - loss 0.14674645 - time (sec): 43.35 - samples/sec: 2377.52 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 21:14:23,920 epoch 2 - iter 1029/1476 - loss 0.14578389 - time (sec): 50.27 - samples/sec: 2352.17 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 21:14:30,697 epoch 2 - iter 1176/1476 - loss 0.14549586 - time (sec): 57.05 - samples/sec: 2339.49 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 21:14:37,676 epoch 2 - iter 1323/1476 - loss 0.14491984 - time (sec): 64.03 - samples/sec: 2336.70 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 21:14:44,608 epoch 2 - iter 1470/1476 - loss 0.14348353 - time (sec): 70.96 - samples/sec: 2336.48 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 21:14:44,869 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:14:44,870 EPOCH 2 done: loss 0.1433 - lr: 0.000044 |
|
2023-10-17 21:14:56,908 DEV : loss 0.13238579034805298 - f1-score (micro avg) 0.8057 |
|
2023-10-17 21:14:56,944 saving best model |
|
2023-10-17 21:14:57,416 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:15:04,234 epoch 3 - iter 147/1476 - loss 0.09099243 - time (sec): 6.81 - samples/sec: 2221.06 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 21:15:10,879 epoch 3 - iter 294/1476 - loss 0.09767210 - time (sec): 13.46 - samples/sec: 2276.72 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 21:15:17,857 epoch 3 - iter 441/1476 - loss 0.09619969 - time (sec): 20.43 - samples/sec: 2337.51 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 21:15:25,059 epoch 3 - iter 588/1476 - loss 0.09884334 - time (sec): 27.64 - samples/sec: 2358.50 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 21:15:32,052 epoch 3 - iter 735/1476 - loss 0.09130138 - time (sec): 34.63 - samples/sec: 2329.09 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 21:15:39,127 epoch 3 - iter 882/1476 - loss 0.09409109 - time (sec): 41.70 - samples/sec: 2324.32 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 21:15:46,434 epoch 3 - iter 1029/1476 - loss 0.09210907 - time (sec): 49.01 - samples/sec: 2359.09 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 21:15:53,396 epoch 3 - iter 1176/1476 - loss 0.09301213 - time (sec): 55.97 - samples/sec: 2370.49 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 21:16:00,199 epoch 3 - iter 1323/1476 - loss 0.09078236 - time (sec): 62.78 - samples/sec: 2380.47 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 21:16:07,142 epoch 3 - iter 1470/1476 - loss 0.09325701 - time (sec): 69.72 - samples/sec: 2378.16 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 21:16:07,408 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:16:07,408 EPOCH 3 done: loss 0.0935 - lr: 0.000039 |
|
2023-10-17 21:16:18,751 DEV : loss 0.17908549308776855 - f1-score (micro avg) 0.8022 |
|
2023-10-17 21:16:18,780 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:16:26,621 epoch 4 - iter 147/1476 - loss 0.05955487 - time (sec): 7.84 - samples/sec: 2174.55 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 21:16:34,138 epoch 4 - iter 294/1476 - loss 0.06464650 - time (sec): 15.36 - samples/sec: 2300.33 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 21:16:41,169 epoch 4 - iter 441/1476 - loss 0.06441669 - time (sec): 22.39 - samples/sec: 2318.35 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 21:16:48,222 epoch 4 - iter 588/1476 - loss 0.06965834 - time (sec): 29.44 - samples/sec: 2329.10 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 21:16:55,135 epoch 4 - iter 735/1476 - loss 0.07027131 - time (sec): 36.35 - samples/sec: 2311.10 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 21:17:02,168 epoch 4 - iter 882/1476 - loss 0.06940446 - time (sec): 43.39 - samples/sec: 2315.69 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 21:17:08,988 epoch 4 - iter 1029/1476 - loss 0.06906629 - time (sec): 50.21 - samples/sec: 2311.07 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 21:17:16,069 epoch 4 - iter 1176/1476 - loss 0.06960608 - time (sec): 57.29 - samples/sec: 2318.23 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 21:17:23,415 epoch 4 - iter 1323/1476 - loss 0.06848630 - time (sec): 64.63 - samples/sec: 2335.06 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 21:17:30,550 epoch 4 - iter 1470/1476 - loss 0.06867505 - time (sec): 71.77 - samples/sec: 2310.89 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 21:17:30,826 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:17:30,827 EPOCH 4 done: loss 0.0686 - lr: 0.000033 |
|
2023-10-17 21:17:42,009 DEV : loss 0.17883314192295074 - f1-score (micro avg) 0.8315 |
|
2023-10-17 21:17:42,038 saving best model |
|
2023-10-17 21:17:42,543 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:17:49,944 epoch 5 - iter 147/1476 - loss 0.04503414 - time (sec): 7.40 - samples/sec: 2375.63 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 21:17:57,061 epoch 5 - iter 294/1476 - loss 0.04077908 - time (sec): 14.51 - samples/sec: 2363.98 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 21:18:04,459 epoch 5 - iter 441/1476 - loss 0.04485968 - time (sec): 21.91 - samples/sec: 2394.21 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 21:18:11,546 epoch 5 - iter 588/1476 - loss 0.04489000 - time (sec): 29.00 - samples/sec: 2379.88 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 21:18:18,929 epoch 5 - iter 735/1476 - loss 0.04378441 - time (sec): 36.38 - samples/sec: 2359.70 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 21:18:26,184 epoch 5 - iter 882/1476 - loss 0.04187352 - time (sec): 43.64 - samples/sec: 2361.29 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 21:18:33,052 epoch 5 - iter 1029/1476 - loss 0.04478438 - time (sec): 50.50 - samples/sec: 2331.91 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 21:18:40,065 epoch 5 - iter 1176/1476 - loss 0.04667311 - time (sec): 57.52 - samples/sec: 2309.59 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 21:18:47,060 epoch 5 - iter 1323/1476 - loss 0.04707730 - time (sec): 64.51 - samples/sec: 2314.55 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 21:18:54,289 epoch 5 - iter 1470/1476 - loss 0.04804846 - time (sec): 71.74 - samples/sec: 2312.31 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 21:18:54,581 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:18:54,581 EPOCH 5 done: loss 0.0479 - lr: 0.000028 |
|
2023-10-17 21:19:05,740 DEV : loss 0.1932452768087387 - f1-score (micro avg) 0.8216 |
|
2023-10-17 21:19:05,770 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:19:13,041 epoch 6 - iter 147/1476 - loss 0.02692959 - time (sec): 7.27 - samples/sec: 2251.36 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 21:19:20,095 epoch 6 - iter 294/1476 - loss 0.03039685 - time (sec): 14.32 - samples/sec: 2258.57 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 21:19:26,924 epoch 6 - iter 441/1476 - loss 0.02969535 - time (sec): 21.15 - samples/sec: 2240.64 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 21:19:34,018 epoch 6 - iter 588/1476 - loss 0.02976128 - time (sec): 28.25 - samples/sec: 2269.04 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 21:19:41,263 epoch 6 - iter 735/1476 - loss 0.03299645 - time (sec): 35.49 - samples/sec: 2272.63 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 21:19:48,189 epoch 6 - iter 882/1476 - loss 0.03080088 - time (sec): 42.42 - samples/sec: 2277.34 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 21:19:55,156 epoch 6 - iter 1029/1476 - loss 0.03111720 - time (sec): 49.39 - samples/sec: 2274.80 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 21:20:02,179 epoch 6 - iter 1176/1476 - loss 0.03136825 - time (sec): 56.41 - samples/sec: 2267.49 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 21:20:09,961 epoch 6 - iter 1323/1476 - loss 0.03283562 - time (sec): 64.19 - samples/sec: 2315.08 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 21:20:17,596 epoch 6 - iter 1470/1476 - loss 0.03154629 - time (sec): 71.82 - samples/sec: 2298.45 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 21:20:18,023 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:20:18,023 EPOCH 6 done: loss 0.0318 - lr: 0.000022 |
|
2023-10-17 21:20:29,225 DEV : loss 0.20342403650283813 - f1-score (micro avg) 0.829 |
|
2023-10-17 21:20:29,256 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:20:36,413 epoch 7 - iter 147/1476 - loss 0.02891200 - time (sec): 7.16 - samples/sec: 2142.34 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 21:20:43,330 epoch 7 - iter 294/1476 - loss 0.02394202 - time (sec): 14.07 - samples/sec: 2229.98 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 21:20:50,316 epoch 7 - iter 441/1476 - loss 0.02015639 - time (sec): 21.06 - samples/sec: 2184.29 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 21:20:57,474 epoch 7 - iter 588/1476 - loss 0.02007236 - time (sec): 28.22 - samples/sec: 2211.82 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 21:21:04,503 epoch 7 - iter 735/1476 - loss 0.02231409 - time (sec): 35.25 - samples/sec: 2232.96 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 21:21:11,792 epoch 7 - iter 882/1476 - loss 0.02414592 - time (sec): 42.54 - samples/sec: 2270.14 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 21:21:19,804 epoch 7 - iter 1029/1476 - loss 0.02619634 - time (sec): 50.55 - samples/sec: 2320.73 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 21:21:27,164 epoch 7 - iter 1176/1476 - loss 0.02513161 - time (sec): 57.91 - samples/sec: 2299.35 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 21:21:34,416 epoch 7 - iter 1323/1476 - loss 0.02524479 - time (sec): 65.16 - samples/sec: 2297.30 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 21:21:41,691 epoch 7 - iter 1470/1476 - loss 0.02525241 - time (sec): 72.43 - samples/sec: 2292.05 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 21:21:41,978 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:21:41,978 EPOCH 7 done: loss 0.0252 - lr: 0.000017 |
|
2023-10-17 21:21:53,349 DEV : loss 0.2042791247367859 - f1-score (micro avg) 0.8458 |
|
2023-10-17 21:21:53,382 saving best model |
|
2023-10-17 21:21:53,867 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:22:00,762 epoch 8 - iter 147/1476 - loss 0.00937919 - time (sec): 6.89 - samples/sec: 2283.57 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 21:22:07,881 epoch 8 - iter 294/1476 - loss 0.00853811 - time (sec): 14.01 - samples/sec: 2258.90 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 21:22:14,881 epoch 8 - iter 441/1476 - loss 0.01070567 - time (sec): 21.01 - samples/sec: 2269.68 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 21:22:22,036 epoch 8 - iter 588/1476 - loss 0.01179930 - time (sec): 28.17 - samples/sec: 2246.93 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 21:22:30,116 epoch 8 - iter 735/1476 - loss 0.01355360 - time (sec): 36.24 - samples/sec: 2315.08 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 21:22:37,569 epoch 8 - iter 882/1476 - loss 0.01288325 - time (sec): 43.70 - samples/sec: 2340.72 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 21:22:44,608 epoch 8 - iter 1029/1476 - loss 0.01292629 - time (sec): 50.74 - samples/sec: 2330.05 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 21:22:51,814 epoch 8 - iter 1176/1476 - loss 0.01385110 - time (sec): 57.94 - samples/sec: 2329.28 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 21:22:58,993 epoch 8 - iter 1323/1476 - loss 0.01415473 - time (sec): 65.12 - samples/sec: 2309.57 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 21:23:05,874 epoch 8 - iter 1470/1476 - loss 0.01408906 - time (sec): 72.00 - samples/sec: 2299.77 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 21:23:06,189 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:23:06,190 EPOCH 8 done: loss 0.0140 - lr: 0.000011 |
|
2023-10-17 21:23:17,357 DEV : loss 0.23180881142616272 - f1-score (micro avg) 0.8352 |
|
2023-10-17 21:23:17,387 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:23:24,663 epoch 9 - iter 147/1476 - loss 0.01659959 - time (sec): 7.27 - samples/sec: 2475.68 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 21:23:31,742 epoch 9 - iter 294/1476 - loss 0.01094948 - time (sec): 14.35 - samples/sec: 2413.12 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 21:23:38,527 epoch 9 - iter 441/1476 - loss 0.00931768 - time (sec): 21.14 - samples/sec: 2376.05 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 21:23:45,851 epoch 9 - iter 588/1476 - loss 0.00991294 - time (sec): 28.46 - samples/sec: 2373.51 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 21:23:52,800 epoch 9 - iter 735/1476 - loss 0.01019933 - time (sec): 35.41 - samples/sec: 2348.46 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 21:24:00,113 epoch 9 - iter 882/1476 - loss 0.01047522 - time (sec): 42.72 - samples/sec: 2354.51 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 21:24:07,361 epoch 9 - iter 1029/1476 - loss 0.01030206 - time (sec): 49.97 - samples/sec: 2347.60 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 21:24:14,651 epoch 9 - iter 1176/1476 - loss 0.01068040 - time (sec): 57.26 - samples/sec: 2353.18 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 21:24:21,700 epoch 9 - iter 1323/1476 - loss 0.01029783 - time (sec): 64.31 - samples/sec: 2344.88 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 21:24:28,634 epoch 9 - iter 1470/1476 - loss 0.00979050 - time (sec): 71.25 - samples/sec: 2328.95 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 21:24:28,905 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:24:28,905 EPOCH 9 done: loss 0.0098 - lr: 0.000006 |
|
2023-10-17 21:24:40,143 DEV : loss 0.2325468510389328 - f1-score (micro avg) 0.8472 |
|
2023-10-17 21:24:40,174 saving best model |
|
2023-10-17 21:24:40,670 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:24:47,921 epoch 10 - iter 147/1476 - loss 0.00447119 - time (sec): 7.25 - samples/sec: 2284.31 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 21:24:55,320 epoch 10 - iter 294/1476 - loss 0.00607937 - time (sec): 14.64 - samples/sec: 2399.71 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 21:25:02,482 epoch 10 - iter 441/1476 - loss 0.00594049 - time (sec): 21.81 - samples/sec: 2332.13 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 21:25:09,895 epoch 10 - iter 588/1476 - loss 0.00571948 - time (sec): 29.22 - samples/sec: 2316.57 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 21:25:17,081 epoch 10 - iter 735/1476 - loss 0.00519373 - time (sec): 36.41 - samples/sec: 2296.15 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 21:25:23,922 epoch 10 - iter 882/1476 - loss 0.00678593 - time (sec): 43.25 - samples/sec: 2294.83 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 21:25:31,170 epoch 10 - iter 1029/1476 - loss 0.00637890 - time (sec): 50.49 - samples/sec: 2285.07 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 21:25:38,140 epoch 10 - iter 1176/1476 - loss 0.00583236 - time (sec): 57.47 - samples/sec: 2296.51 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 21:25:45,142 epoch 10 - iter 1323/1476 - loss 0.00563599 - time (sec): 64.47 - samples/sec: 2301.07 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 21:25:52,319 epoch 10 - iter 1470/1476 - loss 0.00529100 - time (sec): 71.64 - samples/sec: 2315.70 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 21:25:52,584 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:25:52,584 EPOCH 10 done: loss 0.0053 - lr: 0.000000 |
|
2023-10-17 21:26:03,720 DEV : loss 0.2309638112783432 - f1-score (micro avg) 0.8473 |
|
2023-10-17 21:26:03,750 saving best model |
|
2023-10-17 21:26:04,626 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 21:26:04,627 Loading model from best epoch ... |
|
2023-10-17 21:26:05,984 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod |
|
2023-10-17 21:26:12,598 |
|
Results: |
|
- F-score (micro) 0.805 |
|
- F-score (macro) 0.7103 |
|
- Accuracy 0.6948 |
|
|
|
By class:
              precision    recall  f1-score   support

         loc     0.8445    0.8800    0.8619       858
        pers     0.7666    0.8194    0.7921       537
         org     0.6532    0.6136    0.6328       132
        prod     0.6885    0.6885    0.6885        61
        time     0.5312    0.6296    0.5763        54

   micro avg     0.7874    0.8234    0.8050      1642
   macro avg     0.6968    0.7262    0.7103      1642
weighted avg     0.7875    0.8234    0.8048      1642
|
|
|
2023-10-17 21:26:12,598 ---------------------------------------------------------------------------------------------------- |
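Note: the averaged scores in the final report can be cross-checked from the per-class rows. The sketch below is not part of the training run. It recomputes the F1 scores from the rounded precision, recall, and support values printed in the log, so agreement is only expected to about three decimal places. The helper name `f1` and the variable names are illustrative assumptions.

```python
# Per-class (precision, recall, support), copied from the "By class" report in the log.
per_class = {
    "loc": (0.8445, 0.8800, 858),
    "pers": (0.7666, 0.8194, 537),
    "org": (0.6532, 0.6136, 132),
    "prod": (0.6885, 0.6885, 61),
    "time": (0.5312, 0.6296, 54),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


# F1 per class, recomputed from the rounded precision/recall values.
class_f1 = {name: f1(p, r) for name, (p, r, _) in per_class.items()}

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted average: per-class F1 weighted by support.
total_support = sum(s for _, _, s in per_class.values())
weighted_f1 = sum(class_f1[n] * s for n, (_, _, s) in per_class.items()) / total_support

# Micro F1, from the micro-averaged precision and recall reported in the log.
micro_f1 = f1(0.7874, 0.8234)

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
print(f"support     {total_support}")
```

The recomputed values should line up with the `micro avg`, `macro avg`, and `weighted avg` rows of the report. Small differences in the last digit come from the inputs already being rounded to four decimals.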
|
|