2023-10-17 10:59:11,153 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,154 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 10:59:11,154 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 Train: 966 sentences
2023-10-17 10:59:11,155 (train_with_dev=False, train_with_test=False)
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 Training Params:
2023-10-17 10:59:11,155 - learning_rate: "5e-05"
2023-10-17 10:59:11,155 - mini_batch_size: "8"
2023-10-17 10:59:11,155 - max_epochs: "10"
2023-10-17 10:59:11,155 - shuffle: "True"
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 Plugins:
2023-10-17 10:59:11,155 - TensorboardLogger
2023-10-17 10:59:11,155 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:59:11,155 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 Computation:
2023-10-17 10:59:11,155 - compute on device: cuda:0
2023-10-17 10:59:11,155 - embedding storage: none
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:11,156 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:59:11,922 epoch 1 - iter 12/121 - loss 4.03984930 - time (sec): 0.77 - samples/sec: 3147.08 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:59:12,693 epoch 1 - iter 24/121 - loss 3.48446989 - time (sec): 1.54 - samples/sec: 3092.14 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:59:13,455 epoch 1 - iter 36/121 - loss 2.67791785 - time (sec): 2.30 - samples/sec: 3076.89 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:59:14,250 epoch 1 - iter 48/121 - loss 2.12343318 - time (sec): 3.09 - samples/sec: 3155.57 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:59:14,963 epoch 1 - iter 60/121 - loss 1.83639393 - time (sec): 3.81 - samples/sec: 3164.40 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:59:15,774 epoch 1 - iter 72/121 - loss 1.59197735 - time (sec): 4.62 - samples/sec: 3166.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:59:16,559 epoch 1 - iter 84/121 - loss 1.38890453 - time (sec): 5.40 - samples/sec: 3203.74 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:59:17,384 epoch 1 - iter 96/121 - loss 1.23617388 - time (sec): 6.23 - samples/sec: 3230.26 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:59:18,103 epoch 1 - iter 108/121 - loss 1.14774163 - time (sec): 6.95 - samples/sec: 3218.01 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:59:18,839 epoch 1 - iter 120/121 - loss 1.06464850 - time (sec): 7.68 - samples/sec: 3203.69 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:59:18,889 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:18,889 EPOCH 1 done: loss 1.0617 - lr: 0.000049
2023-10-17 10:59:19,489 DEV : loss 0.1961672157049179 - f1-score (micro avg) 0.6743
2023-10-17 10:59:19,496 saving best model
2023-10-17 10:59:19,852 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:20,572 epoch 2 - iter 12/121 - loss 0.15827254 - time (sec): 0.72 - samples/sec: 3192.69 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:59:21,293 epoch 2 - iter 24/121 - loss 0.17582748 - time (sec): 1.44 - samples/sec: 3244.43 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:59:22,031 epoch 2 - iter 36/121 - loss 0.17655963 - time (sec): 2.18 - samples/sec: 3262.17 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:59:22,756 epoch 2 - iter 48/121 - loss 0.17537386 - time (sec): 2.90 - samples/sec: 3279.04 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:59:23,488 epoch 2 - iter 60/121 - loss 0.17240823 - time (sec): 3.63 - samples/sec: 3367.13 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:59:24,179 epoch 2 - iter 72/121 - loss 0.16952158 - time (sec): 4.33 - samples/sec: 3361.95 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:59:24,925 epoch 2 - iter 84/121 - loss 0.17046832 - time (sec): 5.07 - samples/sec: 3379.50 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:59:25,682 epoch 2 - iter 96/121 - loss 0.16917879 - time (sec): 5.83 - samples/sec: 3376.28 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:59:26,409 epoch 2 - iter 108/121 - loss 0.17347387 - time (sec): 6.56 - samples/sec: 3375.46 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:59:27,143 epoch 2 - iter 120/121 - loss 0.16878218 - time (sec): 7.29 - samples/sec: 3369.89 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:59:27,212 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:27,212 EPOCH 2 done: loss 0.1677 - lr: 0.000045
2023-10-17 10:59:27,952 DEV : loss 0.1310185343027115 - f1-score (micro avg) 0.7902
2023-10-17 10:59:27,957 saving best model
2023-10-17 10:59:28,419 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:29,178 epoch 3 - iter 12/121 - loss 0.10826741 - time (sec): 0.76 - samples/sec: 3195.99 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:59:29,975 epoch 3 - iter 24/121 - loss 0.10014281 - time (sec): 1.55 - samples/sec: 3198.69 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:59:30,789 epoch 3 - iter 36/121 - loss 0.08852980 - time (sec): 2.37 - samples/sec: 3253.47 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:59:31,523 epoch 3 - iter 48/121 - loss 0.09054614 - time (sec): 3.10 - samples/sec: 3269.07 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:59:32,308 epoch 3 - iter 60/121 - loss 0.09464846 - time (sec): 3.89 - samples/sec: 3230.95 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:59:33,063 epoch 3 - iter 72/121 - loss 0.09297356 - time (sec): 4.64 - samples/sec: 3236.86 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:59:33,835 epoch 3 - iter 84/121 - loss 0.09444515 - time (sec): 5.41 - samples/sec: 3252.57 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:59:34,496 epoch 3 - iter 96/121 - loss 0.09242019 - time (sec): 6.07 - samples/sec: 3222.44 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:59:35,321 epoch 3 - iter 108/121 - loss 0.09432936 - time (sec): 6.90 - samples/sec: 3234.89 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:59:36,015 epoch 3 - iter 120/121 - loss 0.09658269 - time (sec): 7.59 - samples/sec: 3233.86 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:59:36,063 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:36,063 EPOCH 3 done: loss 0.0974 - lr: 0.000039
2023-10-17 10:59:36,968 DEV : loss 0.1393728405237198 - f1-score (micro avg) 0.7876
2023-10-17 10:59:36,973 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:37,680 epoch 4 - iter 12/121 - loss 0.08865884 - time (sec): 0.71 - samples/sec: 2936.04 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:59:38,402 epoch 4 - iter 24/121 - loss 0.08219893 - time (sec): 1.43 - samples/sec: 3193.06 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:59:39,132 epoch 4 - iter 36/121 - loss 0.07077621 - time (sec): 2.16 - samples/sec: 3224.72 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:59:39,924 epoch 4 - iter 48/121 - loss 0.07382872 - time (sec): 2.95 - samples/sec: 3239.62 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:59:40,673 epoch 4 - iter 60/121 - loss 0.07391026 - time (sec): 3.70 - samples/sec: 3303.16 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:59:41,351 epoch 4 - iter 72/121 - loss 0.07389219 - time (sec): 4.38 - samples/sec: 3292.30 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:59:42,120 epoch 4 - iter 84/121 - loss 0.07656404 - time (sec): 5.15 - samples/sec: 3267.91 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:59:42,924 epoch 4 - iter 96/121 - loss 0.07249895 - time (sec): 5.95 - samples/sec: 3252.09 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:59:43,744 epoch 4 - iter 108/121 - loss 0.07277711 - time (sec): 6.77 - samples/sec: 3235.49 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:59:44,483 epoch 4 - iter 120/121 - loss 0.07000959 - time (sec): 7.51 - samples/sec: 3279.76 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:59:44,536 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:44,537 EPOCH 4 done: loss 0.0697 - lr: 0.000034
2023-10-17 10:59:45,299 DEV : loss 0.14459875226020813 - f1-score (micro avg) 0.8271
2023-10-17 10:59:45,304 saving best model
2023-10-17 10:59:45,866 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:46,629 epoch 5 - iter 12/121 - loss 0.02887476 - time (sec): 0.76 - samples/sec: 2855.60 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:59:47,382 epoch 5 - iter 24/121 - loss 0.04306695 - time (sec): 1.51 - samples/sec: 2936.78 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:59:48,156 epoch 5 - iter 36/121 - loss 0.04116509 - time (sec): 2.29 - samples/sec: 3078.29 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:59:48,932 epoch 5 - iter 48/121 - loss 0.04444591 - time (sec): 3.06 - samples/sec: 3059.29 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:59:49,642 epoch 5 - iter 60/121 - loss 0.04296924 - time (sec): 3.77 - samples/sec: 3119.81 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:59:50,439 epoch 5 - iter 72/121 - loss 0.04428424 - time (sec): 4.57 - samples/sec: 3190.03 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:59:51,138 epoch 5 - iter 84/121 - loss 0.04849569 - time (sec): 5.27 - samples/sec: 3198.35 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:59:51,883 epoch 5 - iter 96/121 - loss 0.04770451 - time (sec): 6.01 - samples/sec: 3216.01 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:59:52,654 epoch 5 - iter 108/121 - loss 0.04792742 - time (sec): 6.78 - samples/sec: 3223.52 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:59:53,460 epoch 5 - iter 120/121 - loss 0.04620569 - time (sec): 7.59 - samples/sec: 3247.60 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:59:53,506 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:53,506 EPOCH 5 done: loss 0.0463 - lr: 0.000028
2023-10-17 10:59:54,274 DEV : loss 0.17430146038532257 - f1-score (micro avg) 0.8213
2023-10-17 10:59:54,279 ----------------------------------------------------------------------------------------------------
2023-10-17 10:59:55,044 epoch 6 - iter 12/121 - loss 0.01847800 - time (sec): 0.76 - samples/sec: 3208.68 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:59:55,824 epoch 6 - iter 24/121 - loss 0.03217283 - time (sec): 1.54 - samples/sec: 3204.69 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:59:56,586 epoch 6 - iter 36/121 - loss 0.02784672 - time (sec): 2.31 - samples/sec: 3258.56 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:59:57,301 epoch 6 - iter 48/121 - loss 0.03009212 - time (sec): 3.02 - samples/sec: 3219.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:59:58,064 epoch 6 - iter 60/121 - loss 0.02992772 - time (sec): 3.78 - samples/sec: 3228.02 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:59:58,730 epoch 6 - iter 72/121 - loss 0.03460899 - time (sec): 4.45 - samples/sec: 3206.50 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:59:59,551 epoch 6 - iter 84/121 - loss 0.03594070 - time (sec): 5.27 - samples/sec: 3236.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:00:00,296 epoch 6 - iter 96/121 - loss 0.03447924 - time (sec): 6.02 - samples/sec: 3248.21 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:00:01,069 epoch 6 - iter 108/121 - loss 0.03462584 - time (sec): 6.79 - samples/sec: 3230.49 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:00:01,843 epoch 6 - iter 120/121 - loss 0.03552862 - time (sec): 7.56 - samples/sec: 3248.07 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:00:01,896 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:01,896 EPOCH 6 done: loss 0.0353 - lr: 0.000022
2023-10-17 11:00:02,659 DEV : loss 0.18103647232055664 - f1-score (micro avg) 0.8313
2023-10-17 11:00:02,665 saving best model
2023-10-17 11:00:03,141 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:03,919 epoch 7 - iter 12/121 - loss 0.00538440 - time (sec): 0.78 - samples/sec: 3084.81 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:00:04,636 epoch 7 - iter 24/121 - loss 0.01679636 - time (sec): 1.49 - samples/sec: 3120.00 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:00:05,357 epoch 7 - iter 36/121 - loss 0.01730532 - time (sec): 2.21 - samples/sec: 3266.30 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:00:06,130 epoch 7 - iter 48/121 - loss 0.01620835 - time (sec): 2.99 - samples/sec: 3235.73 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:00:06,936 epoch 7 - iter 60/121 - loss 0.01692553 - time (sec): 3.79 - samples/sec: 3232.05 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:00:07,684 epoch 7 - iter 72/121 - loss 0.01923634 - time (sec): 4.54 - samples/sec: 3210.04 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:00:08,439 epoch 7 - iter 84/121 - loss 0.01934951 - time (sec): 5.30 - samples/sec: 3195.12 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:00:09,176 epoch 7 - iter 96/121 - loss 0.01857928 - time (sec): 6.03 - samples/sec: 3222.90 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:00:09,963 epoch 7 - iter 108/121 - loss 0.02199051 - time (sec): 6.82 - samples/sec: 3232.32 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:00:10,726 epoch 7 - iter 120/121 - loss 0.02166981 - time (sec): 7.58 - samples/sec: 3249.58 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:00:10,778 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:10,779 EPOCH 7 done: loss 0.0216 - lr: 0.000017
2023-10-17 11:00:11,528 DEV : loss 0.21778880059719086 - f1-score (micro avg) 0.8219
2023-10-17 11:00:11,533 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:12,264 epoch 8 - iter 12/121 - loss 0.02496822 - time (sec): 0.73 - samples/sec: 2937.07 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:00:13,009 epoch 8 - iter 24/121 - loss 0.02109802 - time (sec): 1.48 - samples/sec: 3251.41 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:00:13,747 epoch 8 - iter 36/121 - loss 0.01936223 - time (sec): 2.21 - samples/sec: 3235.36 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:00:14,553 epoch 8 - iter 48/121 - loss 0.01994877 - time (sec): 3.02 - samples/sec: 3318.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:00:15,315 epoch 8 - iter 60/121 - loss 0.01807436 - time (sec): 3.78 - samples/sec: 3314.06 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:00:16,041 epoch 8 - iter 72/121 - loss 0.01570418 - time (sec): 4.51 - samples/sec: 3324.75 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:00:16,740 epoch 8 - iter 84/121 - loss 0.01662327 - time (sec): 5.21 - samples/sec: 3297.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:00:17,484 epoch 8 - iter 96/121 - loss 0.01642530 - time (sec): 5.95 - samples/sec: 3299.09 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:00:18,233 epoch 8 - iter 108/121 - loss 0.01566260 - time (sec): 6.70 - samples/sec: 3321.79 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:00:19,000 epoch 8 - iter 120/121 - loss 0.01682453 - time (sec): 7.47 - samples/sec: 3296.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:00:19,049 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:19,049 EPOCH 8 done: loss 0.0168 - lr: 0.000011
2023-10-17 11:00:19,793 DEV : loss 0.20616522431373596 - f1-score (micro avg) 0.8471
2023-10-17 11:00:19,798 saving best model
2023-10-17 11:00:20,275 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:20,996 epoch 9 - iter 12/121 - loss 0.00858267 - time (sec): 0.72 - samples/sec: 3485.09 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:00:21,740 epoch 9 - iter 24/121 - loss 0.00850359 - time (sec): 1.46 - samples/sec: 3284.45 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:00:22,497 epoch 9 - iter 36/121 - loss 0.00987893 - time (sec): 2.22 - samples/sec: 3161.64 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:00:23,238 epoch 9 - iter 48/121 - loss 0.01135571 - time (sec): 2.96 - samples/sec: 3149.06 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:00:23,944 epoch 9 - iter 60/121 - loss 0.01031270 - time (sec): 3.66 - samples/sec: 3157.12 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:00:24,698 epoch 9 - iter 72/121 - loss 0.01012589 - time (sec): 4.42 - samples/sec: 3221.68 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:00:25,457 epoch 9 - iter 84/121 - loss 0.01006348 - time (sec): 5.18 - samples/sec: 3249.12 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:00:26,209 epoch 9 - iter 96/121 - loss 0.00959124 - time (sec): 5.93 - samples/sec: 3289.50 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:00:27,012 epoch 9 - iter 108/121 - loss 0.01110604 - time (sec): 6.73 - samples/sec: 3297.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:00:27,767 epoch 9 - iter 120/121 - loss 0.01135045 - time (sec): 7.49 - samples/sec: 3285.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:00:27,818 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:27,818 EPOCH 9 done: loss 0.0113 - lr: 0.000006
2023-10-17 11:00:28,558 DEV : loss 0.22430478036403656 - f1-score (micro avg) 0.8344
2023-10-17 11:00:28,563 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:29,271 epoch 10 - iter 12/121 - loss 0.00114275 - time (sec): 0.71 - samples/sec: 3349.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:00:30,014 epoch 10 - iter 24/121 - loss 0.00352745 - time (sec): 1.45 - samples/sec: 3148.72 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:00:30,762 epoch 10 - iter 36/121 - loss 0.00508865 - time (sec): 2.20 - samples/sec: 3258.73 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:00:31,488 epoch 10 - iter 48/121 - loss 0.00574434 - time (sec): 2.92 - samples/sec: 3315.20 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:00:32,294 epoch 10 - iter 60/121 - loss 0.01005490 - time (sec): 3.73 - samples/sec: 3408.82 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:00:33,000 epoch 10 - iter 72/121 - loss 0.00878139 - time (sec): 4.44 - samples/sec: 3351.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:00:33,751 epoch 10 - iter 84/121 - loss 0.00826920 - time (sec): 5.19 - samples/sec: 3290.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:00:34,538 epoch 10 - iter 96/121 - loss 0.00749286 - time (sec): 5.97 - samples/sec: 3298.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:00:35,317 epoch 10 - iter 108/121 - loss 0.00742557 - time (sec): 6.75 - samples/sec: 3321.00 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:00:36,004 epoch 10 - iter 120/121 - loss 0.00724469 - time (sec): 7.44 - samples/sec: 3303.38 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:00:36,061 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:36,062 EPOCH 10 done: loss 0.0072 - lr: 0.000000
2023-10-17 11:00:36,815 DEV : loss 0.2359585165977478 - f1-score (micro avg) 0.8375
2023-10-17 11:00:37,205 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:37,206 Loading model from best epoch ...
2023-10-17 11:00:38,619 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 11:00:39,465
Results:
- F-score (micro) 0.8174
- F-score (macro) 0.5799
- Accuracy 0.7075
By class:
              precision    recall  f1-score   support

        pers     0.8652    0.8777    0.8714       139
       scope     0.8321    0.8837    0.8571       129
        work     0.6630    0.7625    0.7093        80
         loc     0.7500    0.3333    0.4615         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.8021    0.8333    0.8174       360
   macro avg     0.6221    0.5715    0.5799       360
weighted avg     0.7984    0.8333    0.8128       360
2023-10-17 11:00:39,466 ----------------------------------------------------------------------------------------------------