2023-10-17 11:08:01,482 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,483 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 11:08:01,483 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,483 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 11:08:01,483 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,483 Train: 966 sentences
2023-10-17 11:08:01,483 (train_with_dev=False, train_with_test=False)
2023-10-17 11:08:01,483 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,483 Training Params:
2023-10-17 11:08:01,483 - learning_rate: "5e-05"
2023-10-17 11:08:01,483 - mini_batch_size: "8"
2023-10-17 11:08:01,483 - max_epochs: "10"
2023-10-17 11:08:01,483 - shuffle: "True"
2023-10-17 11:08:01,483 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,483 Plugins:
2023-10-17 11:08:01,483 - TensorboardLogger
2023-10-17 11:08:01,483 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:08:01,483 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,483 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:08:01,483 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:08:01,483 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,483 Computation:
2023-10-17 11:08:01,483 - compute on device: cuda:0
2023-10-17 11:08:01,484 - embedding storage: none
2023-10-17 11:08:01,484 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,484 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 11:08:01,484 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,484 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:01,484 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 11:08:02,318 epoch 1 - iter 12/121 - loss 4.64640638 - time (sec): 0.83 - samples/sec: 3140.70 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:08:03,081 epoch 1 - iter 24/121 - loss 3.99522762 - time (sec): 1.60 - samples/sec: 3122.73 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:08:03,830 epoch 1 - iter 36/121 - loss 3.16441520 - time (sec): 2.35 - samples/sec: 3155.90 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:08:04,553 epoch 1 - iter 48/121 - loss 2.57702603 - time (sec): 3.07 - samples/sec: 3184.33 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:08:05,318 epoch 1 - iter 60/121 - loss 2.14240888 - time (sec): 3.83 - samples/sec: 3232.40 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:08:06,077 epoch 1 - iter 72/121 - loss 1.85994774 - time (sec): 4.59 - samples/sec: 3241.62 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:08:06,769 epoch 1 - iter 84/121 - loss 1.66472101 - time (sec): 5.28 - samples/sec: 3253.63 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:08:07,534 epoch 1 - iter 96/121 - loss 1.50200708 - time (sec): 6.05 - samples/sec: 3245.44 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:08:08,293 epoch 1 - iter 108/121 - loss 1.35604792 - time (sec): 6.81 - samples/sec: 3266.64 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:08:09,024 epoch 1 - iter 120/121 - loss 1.25714367 - time (sec): 7.54 - samples/sec: 3268.13 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:08:09,079 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:09,079 EPOCH 1 done: loss 1.2531 - lr: 0.000049
2023-10-17 11:08:09,987 DEV : loss 0.2125282734632492 - f1-score (micro avg) 0.5735
2023-10-17 11:08:09,994 saving best model
2023-10-17 11:08:10,387 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:11,148 epoch 2 - iter 12/121 - loss 0.21764231 - time (sec): 0.76 - samples/sec: 3515.23 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:08:11,940 epoch 2 - iter 24/121 - loss 0.21403451 - time (sec): 1.55 - samples/sec: 3305.25 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:08:12,715 epoch 2 - iter 36/121 - loss 0.20849589 - time (sec): 2.33 - samples/sec: 3163.91 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:08:13,518 epoch 2 - iter 48/121 - loss 0.21445341 - time (sec): 3.13 - samples/sec: 3181.97 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:08:14,201 epoch 2 - iter 60/121 - loss 0.21111530 - time (sec): 3.81 - samples/sec: 3192.03 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:08:14,970 epoch 2 - iter 72/121 - loss 0.20856125 - time (sec): 4.58 - samples/sec: 3197.45 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:08:15,747 epoch 2 - iter 84/121 - loss 0.20056727 - time (sec): 5.36 - samples/sec: 3229.91 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:08:16,510 epoch 2 - iter 96/121 - loss 0.19028230 - time (sec): 6.12 - samples/sec: 3241.23 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:08:17,257 epoch 2 - iter 108/121 - loss 0.18897016 - time (sec): 6.87 - samples/sec: 3246.99 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:08:18,016 epoch 2 - iter 120/121 - loss 0.18464719 - time (sec): 7.63 - samples/sec: 3230.93 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:08:18,062 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:18,063 EPOCH 2 done: loss 0.1849 - lr: 0.000045
2023-10-17 11:08:18,872 DEV : loss 0.1422046422958374 - f1-score (micro avg) 0.7617
2023-10-17 11:08:18,878 saving best model
2023-10-17 11:08:19,429 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:20,275 epoch 3 - iter 12/121 - loss 0.10190167 - time (sec): 0.84 - samples/sec: 2692.70 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:08:21,108 epoch 3 - iter 24/121 - loss 0.10884942 - time (sec): 1.68 - samples/sec: 2845.97 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:08:21,986 epoch 3 - iter 36/121 - loss 0.10662259 - time (sec): 2.55 - samples/sec: 2876.54 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:08:22,893 epoch 3 - iter 48/121 - loss 0.10825608 - time (sec): 3.46 - samples/sec: 2835.33 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:08:23,661 epoch 3 - iter 60/121 - loss 0.11156383 - time (sec): 4.23 - samples/sec: 2893.76 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:08:24,420 epoch 3 - iter 72/121 - loss 0.11438574 - time (sec): 4.99 - samples/sec: 2921.92 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:08:25,109 epoch 3 - iter 84/121 - loss 0.11338348 - time (sec): 5.68 - samples/sec: 2963.65 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:08:25,833 epoch 3 - iter 96/121 - loss 0.10924170 - time (sec): 6.40 - samples/sec: 3006.45 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:08:26,557 epoch 3 - iter 108/121 - loss 0.10890897 - time (sec): 7.13 - samples/sec: 3068.47 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:08:27,364 epoch 3 - iter 120/121 - loss 0.10887373 - time (sec): 7.93 - samples/sec: 3095.94 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:08:27,416 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:27,416 EPOCH 3 done: loss 0.1081 - lr: 0.000039
2023-10-17 11:08:28,182 DEV : loss 0.1380065679550171 - f1-score (micro avg) 0.8312
2023-10-17 11:08:28,187 saving best model
2023-10-17 11:08:28,750 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:29,522 epoch 4 - iter 12/121 - loss 0.07019196 - time (sec): 0.77 - samples/sec: 3201.37 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:08:30,263 epoch 4 - iter 24/121 - loss 0.08708994 - time (sec): 1.51 - samples/sec: 3150.58 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:08:31,016 epoch 4 - iter 36/121 - loss 0.07707486 - time (sec): 2.26 - samples/sec: 3236.51 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:08:31,780 epoch 4 - iter 48/121 - loss 0.08053888 - time (sec): 3.03 - samples/sec: 3197.85 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:08:32,554 epoch 4 - iter 60/121 - loss 0.07902996 - time (sec): 3.80 - samples/sec: 3187.67 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:08:33,392 epoch 4 - iter 72/121 - loss 0.07645033 - time (sec): 4.64 - samples/sec: 3185.72 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:08:34,119 epoch 4 - iter 84/121 - loss 0.07914731 - time (sec): 5.37 - samples/sec: 3177.40 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:08:34,864 epoch 4 - iter 96/121 - loss 0.08010322 - time (sec): 6.11 - samples/sec: 3219.41 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:08:35,606 epoch 4 - iter 108/121 - loss 0.07826876 - time (sec): 6.85 - samples/sec: 3227.24 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:08:36,361 epoch 4 - iter 120/121 - loss 0.07554766 - time (sec): 7.61 - samples/sec: 3226.33 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:08:36,413 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:36,413 EPOCH 4 done: loss 0.0750 - lr: 0.000034
2023-10-17 11:08:37,162 DEV : loss 0.15543414652347565 - f1-score (micro avg) 0.8178
2023-10-17 11:08:37,167 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:37,826 epoch 5 - iter 12/121 - loss 0.04766896 - time (sec): 0.66 - samples/sec: 3563.90 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:08:38,573 epoch 5 - iter 24/121 - loss 0.04508118 - time (sec): 1.40 - samples/sec: 3377.50 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:08:39,321 epoch 5 - iter 36/121 - loss 0.06730774 - time (sec): 2.15 - samples/sec: 3350.74 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:08:40,116 epoch 5 - iter 48/121 - loss 0.07034756 - time (sec): 2.95 - samples/sec: 3343.74 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:08:40,891 epoch 5 - iter 60/121 - loss 0.06704156 - time (sec): 3.72 - samples/sec: 3284.32 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:08:41,677 epoch 5 - iter 72/121 - loss 0.06279205 - time (sec): 4.51 - samples/sec: 3306.89 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:08:42,381 epoch 5 - iter 84/121 - loss 0.06617666 - time (sec): 5.21 - samples/sec: 3279.85 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:08:43,202 epoch 5 - iter 96/121 - loss 0.06529293 - time (sec): 6.03 - samples/sec: 3252.32 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:08:43,941 epoch 5 - iter 108/121 - loss 0.06344772 - time (sec): 6.77 - samples/sec: 3268.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:08:44,647 epoch 5 - iter 120/121 - loss 0.06037539 - time (sec): 7.48 - samples/sec: 3285.74 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:08:44,701 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:44,701 EPOCH 5 done: loss 0.0600 - lr: 0.000028
2023-10-17 11:08:45,476 DEV : loss 0.16548091173171997 - f1-score (micro avg) 0.8331
2023-10-17 11:08:45,481 saving best model
2023-10-17 11:08:46,020 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:46,755 epoch 6 - iter 12/121 - loss 0.03259283 - time (sec): 0.73 - samples/sec: 3292.80 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:08:47,480 epoch 6 - iter 24/121 - loss 0.03577944 - time (sec): 1.46 - samples/sec: 3333.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:08:48,318 epoch 6 - iter 36/121 - loss 0.03375939 - time (sec): 2.29 - samples/sec: 3286.49 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:08:49,095 epoch 6 - iter 48/121 - loss 0.03529970 - time (sec): 3.07 - samples/sec: 3239.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:08:49,861 epoch 6 - iter 60/121 - loss 0.03304757 - time (sec): 3.84 - samples/sec: 3201.69 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:08:50,619 epoch 6 - iter 72/121 - loss 0.03748447 - time (sec): 4.60 - samples/sec: 3193.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:08:51,383 epoch 6 - iter 84/121 - loss 0.03924613 - time (sec): 5.36 - samples/sec: 3226.65 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:08:52,169 epoch 6 - iter 96/121 - loss 0.03747713 - time (sec): 6.14 - samples/sec: 3243.99 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:08:52,902 epoch 6 - iter 108/121 - loss 0.03732658 - time (sec): 6.88 - samples/sec: 3240.78 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:08:53,620 epoch 6 - iter 120/121 - loss 0.03850823 - time (sec): 7.60 - samples/sec: 3240.53 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:08:53,667 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:53,667 EPOCH 6 done: loss 0.0383 - lr: 0.000022
2023-10-17 11:08:54,432 DEV : loss 0.19360701739788055 - f1-score (micro avg) 0.8263
2023-10-17 11:08:54,437 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:55,126 epoch 7 - iter 12/121 - loss 0.05617147 - time (sec): 0.69 - samples/sec: 3192.31 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:08:55,900 epoch 7 - iter 24/121 - loss 0.04013345 - time (sec): 1.46 - samples/sec: 3077.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:08:56,654 epoch 7 - iter 36/121 - loss 0.03935435 - time (sec): 2.22 - samples/sec: 3213.72 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:08:57,417 epoch 7 - iter 48/121 - loss 0.03899974 - time (sec): 2.98 - samples/sec: 3237.49 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:08:58,156 epoch 7 - iter 60/121 - loss 0.03729358 - time (sec): 3.72 - samples/sec: 3253.20 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:08:58,955 epoch 7 - iter 72/121 - loss 0.03709520 - time (sec): 4.52 - samples/sec: 3304.23 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:08:59,758 epoch 7 - iter 84/121 - loss 0.03509845 - time (sec): 5.32 - samples/sec: 3294.34 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:09:00,540 epoch 7 - iter 96/121 - loss 0.03302967 - time (sec): 6.10 - samples/sec: 3281.68 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:09:01,317 epoch 7 - iter 108/121 - loss 0.03142229 - time (sec): 6.88 - samples/sec: 3253.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:09:02,084 epoch 7 - iter 120/121 - loss 0.03049685 - time (sec): 7.65 - samples/sec: 3219.99 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:09:02,130 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:02,130 EPOCH 7 done: loss 0.0306 - lr: 0.000017
2023-10-17 11:09:02,900 DEV : loss 0.1948603093624115 - f1-score (micro avg) 0.8331
2023-10-17 11:09:02,905 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:03,630 epoch 8 - iter 12/121 - loss 0.01798295 - time (sec): 0.72 - samples/sec: 3173.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:09:04,365 epoch 8 - iter 24/121 - loss 0.01653155 - time (sec): 1.46 - samples/sec: 3229.96 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:09:05,147 epoch 8 - iter 36/121 - loss 0.02239749 - time (sec): 2.24 - samples/sec: 3246.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:09:05,835 epoch 8 - iter 48/121 - loss 0.02224355 - time (sec): 2.93 - samples/sec: 3179.90 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:09:06,568 epoch 8 - iter 60/121 - loss 0.02209139 - time (sec): 3.66 - samples/sec: 3271.64 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:09:07,344 epoch 8 - iter 72/121 - loss 0.02267916 - time (sec): 4.44 - samples/sec: 3268.05 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:09:08,155 epoch 8 - iter 84/121 - loss 0.02026633 - time (sec): 5.25 - samples/sec: 3227.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:09:08,924 epoch 8 - iter 96/121 - loss 0.02139598 - time (sec): 6.02 - samples/sec: 3241.90 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:09:09,703 epoch 8 - iter 108/121 - loss 0.02001586 - time (sec): 6.80 - samples/sec: 3249.84 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:09:10,428 epoch 8 - iter 120/121 - loss 0.02118325 - time (sec): 7.52 - samples/sec: 3270.15 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:09:10,476 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:10,477 EPOCH 8 done: loss 0.0215 - lr: 0.000011
2023-10-17 11:09:11,407 DEV : loss 0.21927376091480255 - f1-score (micro avg) 0.8323
2023-10-17 11:09:11,412 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:12,122 epoch 9 - iter 12/121 - loss 0.02132408 - time (sec): 0.71 - samples/sec: 3315.28 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:09:12,847 epoch 9 - iter 24/121 - loss 0.01773470 - time (sec): 1.43 - samples/sec: 3278.97 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:09:13,566 epoch 9 - iter 36/121 - loss 0.02313221 - time (sec): 2.15 - samples/sec: 3188.50 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:09:14,319 epoch 9 - iter 48/121 - loss 0.01886954 - time (sec): 2.91 - samples/sec: 3251.85 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:09:15,023 epoch 9 - iter 60/121 - loss 0.01968467 - time (sec): 3.61 - samples/sec: 3228.26 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:09:15,732 epoch 9 - iter 72/121 - loss 0.02178143 - time (sec): 4.32 - samples/sec: 3249.54 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:09:16,520 epoch 9 - iter 84/121 - loss 0.02212080 - time (sec): 5.11 - samples/sec: 3265.04 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:09:17,344 epoch 9 - iter 96/121 - loss 0.02053703 - time (sec): 5.93 - samples/sec: 3266.72 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:09:18,120 epoch 9 - iter 108/121 - loss 0.01868999 - time (sec): 6.71 - samples/sec: 3277.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:09:18,877 epoch 9 - iter 120/121 - loss 0.01827614 - time (sec): 7.46 - samples/sec: 3284.29 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:09:18,939 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:18,939 EPOCH 9 done: loss 0.0183 - lr: 0.000006
2023-10-17 11:09:19,740 DEV : loss 0.22246934473514557 - f1-score (micro avg) 0.8319
2023-10-17 11:09:19,747 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:20,492 epoch 10 - iter 12/121 - loss 0.00591043 - time (sec): 0.74 - samples/sec: 3213.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:09:21,241 epoch 10 - iter 24/121 - loss 0.01457823 - time (sec): 1.49 - samples/sec: 3338.82 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:09:22,020 epoch 10 - iter 36/121 - loss 0.01630667 - time (sec): 2.27 - samples/sec: 3205.55 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:09:22,738 epoch 10 - iter 48/121 - loss 0.02028258 - time (sec): 2.99 - samples/sec: 3259.08 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:09:23,503 epoch 10 - iter 60/121 - loss 0.01793374 - time (sec): 3.76 - samples/sec: 3264.73 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:09:24,276 epoch 10 - iter 72/121 - loss 0.01632167 - time (sec): 4.53 - samples/sec: 3244.65 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:09:25,096 epoch 10 - iter 84/121 - loss 0.01470658 - time (sec): 5.35 - samples/sec: 3185.43 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:09:25,921 epoch 10 - iter 96/121 - loss 0.01418145 - time (sec): 6.17 - samples/sec: 3171.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:09:26,658 epoch 10 - iter 108/121 - loss 0.01327139 - time (sec): 6.91 - samples/sec: 3204.74 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:09:27,373 epoch 10 - iter 120/121 - loss 0.01330474 - time (sec): 7.62 - samples/sec: 3225.61 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:09:27,427 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:27,427 EPOCH 10 done: loss 0.0132 - lr: 0.000000
2023-10-17 11:09:28,181 DEV : loss 0.22670365869998932 - f1-score (micro avg) 0.8413
2023-10-17 11:09:28,186 saving best model
2023-10-17 11:09:29,078 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:29,079 Loading model from best epoch ...
2023-10-17 11:09:30,511 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 11:09:31,188
Results:
- F-score (micro) 0.8092
- F-score (macro) 0.5501
- Accuracy 0.6986
By class:
              precision    recall  f1-score   support

        pers     0.8652    0.8777    0.8714       139
       scope     0.8085    0.8837    0.8444       129
        work     0.6489    0.7625    0.7011        80
         loc     0.6667    0.2222    0.3333         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7889    0.8306    0.8092       360
   macro avg     0.5979    0.5492    0.5501       360
weighted avg     0.7847    0.8306    0.8032       360
2023-10-17 11:09:31,188 ----------------------------------------------------------------------------------------------------
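The averaged scores in the final report can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the original run. It assumes the four-decimal values printed in the table above, and it assumes the macro average is taken over the five classes that have a row (the `object` label appears in the tag dictionary but not in the table). The names `per_class`, `f1`, `macro_f1`, `weighted_f1`, and `micro_f1` are illustrative only.

```python
# Per-class scores copied from the "By class" table in the log (rounded to 4 decimals).
per_class = {
    "pers":  {"precision": 0.8652, "recall": 0.8777, "f1": 0.8714, "support": 139},
    "scope": {"precision": 0.8085, "recall": 0.8837, "f1": 0.8444, "support": 129},
    "work":  {"precision": 0.6489, "recall": 0.7625, "f1": 0.7011, "support": 80},
    "loc":   {"precision": 0.6667, "recall": 0.2222, "f1": 0.3333, "support": 9},
    "date":  {"precision": 0.0000, "recall": 0.0000, "f1": 0.0000, "support": 3},
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(c["support"] for c in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by the number of gold spans (support).
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.7889, 0.8306)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

Because the inputs are already rounded, the recomputed values can differ from the logged ones in the last decimal place. The gap between micro F1 (0.8092) and macro F1 (0.5501) comes mainly from the `loc` and `date` classes, which have only 9 and 3 test spans.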