2023-10-17 10:30:46,480 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:30:46,481 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:30:46,481 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 Train: 966 sentences
2023-10-17 10:30:46,481 (train_with_dev=False, train_with_test=False)
2023-10-17 10:30:46,481 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,481 Training Params:
2023-10-17 10:30:46,481 - learning_rate: "3e-05"
2023-10-17 10:30:46,481 - mini_batch_size: "8"
2023-10-17 10:30:46,482 - max_epochs: "10"
2023-10-17 10:30:46,482 - shuffle: "True"
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Plugins:
2023-10-17 10:30:46,482 - TensorboardLogger
2023-10-17 10:30:46,482 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:30:46,482 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Computation:
2023-10-17 10:30:46,482 - compute on device: cuda:0
2023-10-17 10:30:46,482 - embedding storage: none
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:46,482 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:30:47,180 epoch 1 - iter 12/121 - loss 3.32014436 - time (sec): 0.70 - samples/sec: 3341.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:30:47,895 epoch 1 - iter 24/121 - loss 3.03698276 - time (sec): 1.41 - samples/sec: 3164.77 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:30:48,614 epoch 1 - iter 36/121 - loss 2.60289546 - time (sec): 2.13 - samples/sec: 3210.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:30:49,403 epoch 1 - iter 48/121 - loss 2.12575385 - time (sec): 2.92 - samples/sec: 3218.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:30:50,203 epoch 1 - iter 60/121 - loss 1.79218853 - time (sec): 3.72 - samples/sec: 3229.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:30:50,952 epoch 1 - iter 72/121 - loss 1.58367041 - time (sec): 4.47 - samples/sec: 3227.29 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:30:51,711 epoch 1 - iter 84/121 - loss 1.41113578 - time (sec): 5.23 - samples/sec: 3230.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:30:52,492 epoch 1 - iter 96/121 - loss 1.26321860 - time (sec): 6.01 - samples/sec: 3242.76 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:30:53,219 epoch 1 - iter 108/121 - loss 1.16515662 - time (sec): 6.74 - samples/sec: 3258.23 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:30:54,001 epoch 1 - iter 120/121 - loss 1.07050963 - time (sec): 7.52 - samples/sec: 3269.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:30:54,065 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:54,065 EPOCH 1 done: loss 1.0649 - lr: 0.000030
2023-10-17 10:30:54,663 DEV : loss 0.2681349813938141 - f1-score (micro avg) 0.4935
2023-10-17 10:30:54,684 saving best model
2023-10-17 10:30:55,080 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:55,820 epoch 2 - iter 12/121 - loss 0.25245481 - time (sec): 0.74 - samples/sec: 3337.64 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:30:56,535 epoch 2 - iter 24/121 - loss 0.26119176 - time (sec): 1.45 - samples/sec: 3133.98 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:30:57,237 epoch 2 - iter 36/121 - loss 0.24934766 - time (sec): 2.16 - samples/sec: 3221.62 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:30:58,019 epoch 2 - iter 48/121 - loss 0.23822482 - time (sec): 2.94 - samples/sec: 3309.63 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:30:58,819 epoch 2 - iter 60/121 - loss 0.22963477 - time (sec): 3.74 - samples/sec: 3281.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:30:59,537 epoch 2 - iter 72/121 - loss 0.22640817 - time (sec): 4.46 - samples/sec: 3325.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:31:00,284 epoch 2 - iter 84/121 - loss 0.21803280 - time (sec): 5.20 - samples/sec: 3297.21 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:31:01,073 epoch 2 - iter 96/121 - loss 0.21012624 - time (sec): 5.99 - samples/sec: 3292.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:31:01,889 epoch 2 - iter 108/121 - loss 0.20287535 - time (sec): 6.81 - samples/sec: 3265.45 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:31:02,631 epoch 2 - iter 120/121 - loss 0.19816081 - time (sec): 7.55 - samples/sec: 3253.85 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:31:02,683 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:02,684 EPOCH 2 done: loss 0.1970 - lr: 0.000027
2023-10-17 10:31:03,620 DEV : loss 0.14390070736408234 - f1-score (micro avg) 0.7865
2023-10-17 10:31:03,626 saving best model
2023-10-17 10:31:04,142 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:04,908 epoch 3 - iter 12/121 - loss 0.12851785 - time (sec): 0.76 - samples/sec: 3266.94 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:31:05,676 epoch 3 - iter 24/121 - loss 0.11801002 - time (sec): 1.53 - samples/sec: 3172.63 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:31:06,544 epoch 3 - iter 36/121 - loss 0.11883823 - time (sec): 2.40 - samples/sec: 3152.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:31:07,247 epoch 3 - iter 48/121 - loss 0.11584504 - time (sec): 3.10 - samples/sec: 3236.72 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:31:07,977 epoch 3 - iter 60/121 - loss 0.11598349 - time (sec): 3.83 - samples/sec: 3231.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:31:08,729 epoch 3 - iter 72/121 - loss 0.11195592 - time (sec): 4.59 - samples/sec: 3307.42 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:31:09,437 epoch 3 - iter 84/121 - loss 0.11244840 - time (sec): 5.29 - samples/sec: 3263.56 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:31:10,139 epoch 3 - iter 96/121 - loss 0.11154599 - time (sec): 6.00 - samples/sec: 3284.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:31:10,902 epoch 3 - iter 108/121 - loss 0.11047003 - time (sec): 6.76 - samples/sec: 3316.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:31:11,650 epoch 3 - iter 120/121 - loss 0.11039208 - time (sec): 7.51 - samples/sec: 3286.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:31:11,702 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:11,702 EPOCH 3 done: loss 0.1103 - lr: 0.000023
2023-10-17 10:31:12,489 DEV : loss 0.12160782516002655 - f1-score (micro avg) 0.8291
2023-10-17 10:31:12,496 saving best model
2023-10-17 10:31:13,042 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:13,802 epoch 4 - iter 12/121 - loss 0.11078327 - time (sec): 0.76 - samples/sec: 3345.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:31:14,541 epoch 4 - iter 24/121 - loss 0.08753515 - time (sec): 1.49 - samples/sec: 3287.89 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:31:15,310 epoch 4 - iter 36/121 - loss 0.07917905 - time (sec): 2.26 - samples/sec: 3283.48 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:31:16,076 epoch 4 - iter 48/121 - loss 0.08826692 - time (sec): 3.03 - samples/sec: 3202.60 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:31:16,818 epoch 4 - iter 60/121 - loss 0.08527749 - time (sec): 3.77 - samples/sec: 3296.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:31:17,630 epoch 4 - iter 72/121 - loss 0.07703033 - time (sec): 4.58 - samples/sec: 3254.92 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:31:18,439 epoch 4 - iter 84/121 - loss 0.07353993 - time (sec): 5.39 - samples/sec: 3231.75 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:31:19,176 epoch 4 - iter 96/121 - loss 0.08026535 - time (sec): 6.13 - samples/sec: 3235.43 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:31:19,945 epoch 4 - iter 108/121 - loss 0.07860544 - time (sec): 6.90 - samples/sec: 3210.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:31:20,747 epoch 4 - iter 120/121 - loss 0.07637764 - time (sec): 7.70 - samples/sec: 3188.34 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:31:20,801 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:20,801 EPOCH 4 done: loss 0.0759 - lr: 0.000020
2023-10-17 10:31:21,561 DEV : loss 0.15159755945205688 - f1-score (micro avg) 0.8256
2023-10-17 10:31:21,567 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:22,406 epoch 5 - iter 12/121 - loss 0.04424314 - time (sec): 0.84 - samples/sec: 3332.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:31:23,111 epoch 5 - iter 24/121 - loss 0.04608236 - time (sec): 1.54 - samples/sec: 3209.53 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:31:23,879 epoch 5 - iter 36/121 - loss 0.05514976 - time (sec): 2.31 - samples/sec: 3254.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:31:24,663 epoch 5 - iter 48/121 - loss 0.05155823 - time (sec): 3.10 - samples/sec: 3216.27 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:31:25,390 epoch 5 - iter 60/121 - loss 0.04998764 - time (sec): 3.82 - samples/sec: 3231.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:31:26,176 epoch 5 - iter 72/121 - loss 0.04908976 - time (sec): 4.61 - samples/sec: 3215.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:31:26,888 epoch 5 - iter 84/121 - loss 0.05123180 - time (sec): 5.32 - samples/sec: 3284.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:31:27,640 epoch 5 - iter 96/121 - loss 0.05213783 - time (sec): 6.07 - samples/sec: 3239.37 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:31:28,342 epoch 5 - iter 108/121 - loss 0.05143224 - time (sec): 6.77 - samples/sec: 3261.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:31:29,078 epoch 5 - iter 120/121 - loss 0.05380168 - time (sec): 7.51 - samples/sec: 3269.60 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:31:29,148 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:29,148 EPOCH 5 done: loss 0.0535 - lr: 0.000017
2023-10-17 10:31:29,934 DEV : loss 0.1807931661605835 - f1-score (micro avg) 0.8184
2023-10-17 10:31:29,939 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:30,664 epoch 6 - iter 12/121 - loss 0.02907978 - time (sec): 0.72 - samples/sec: 3666.62 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:31:31,448 epoch 6 - iter 24/121 - loss 0.03751196 - time (sec): 1.51 - samples/sec: 3403.32 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:31:32,212 epoch 6 - iter 36/121 - loss 0.03357465 - time (sec): 2.27 - samples/sec: 3379.66 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:31:32,943 epoch 6 - iter 48/121 - loss 0.03308335 - time (sec): 3.00 - samples/sec: 3306.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:31:33,743 epoch 6 - iter 60/121 - loss 0.03111144 - time (sec): 3.80 - samples/sec: 3262.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:31:34,478 epoch 6 - iter 72/121 - loss 0.03577265 - time (sec): 4.54 - samples/sec: 3274.54 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:31:35,208 epoch 6 - iter 84/121 - loss 0.03581138 - time (sec): 5.27 - samples/sec: 3276.39 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:31:35,984 epoch 6 - iter 96/121 - loss 0.03882524 - time (sec): 6.04 - samples/sec: 3271.90 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:31:36,683 epoch 6 - iter 108/121 - loss 0.04161598 - time (sec): 6.74 - samples/sec: 3285.34 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:31:37,436 epoch 6 - iter 120/121 - loss 0.04249901 - time (sec): 7.50 - samples/sec: 3284.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:31:37,483 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:37,483 EPOCH 6 done: loss 0.0428 - lr: 0.000013
2023-10-17 10:31:38,252 DEV : loss 0.1748889535665512 - f1-score (micro avg) 0.8253
2023-10-17 10:31:38,257 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:39,002 epoch 7 - iter 12/121 - loss 0.03329241 - time (sec): 0.74 - samples/sec: 3548.32 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:31:39,762 epoch 7 - iter 24/121 - loss 0.03701914 - time (sec): 1.50 - samples/sec: 3354.87 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:31:40,548 epoch 7 - iter 36/121 - loss 0.03906978 - time (sec): 2.29 - samples/sec: 3286.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:31:41,283 epoch 7 - iter 48/121 - loss 0.03915569 - time (sec): 3.02 - samples/sec: 3288.91 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:31:42,096 epoch 7 - iter 60/121 - loss 0.03641657 - time (sec): 3.84 - samples/sec: 3299.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:31:42,798 epoch 7 - iter 72/121 - loss 0.03452311 - time (sec): 4.54 - samples/sec: 3329.70 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:31:43,580 epoch 7 - iter 84/121 - loss 0.03223815 - time (sec): 5.32 - samples/sec: 3301.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:31:44,284 epoch 7 - iter 96/121 - loss 0.03201468 - time (sec): 6.03 - samples/sec: 3266.72 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:31:45,041 epoch 7 - iter 108/121 - loss 0.03162600 - time (sec): 6.78 - samples/sec: 3269.40 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:31:45,765 epoch 7 - iter 120/121 - loss 0.03052726 - time (sec): 7.51 - samples/sec: 3270.34 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:31:45,821 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:45,821 EPOCH 7 done: loss 0.0307 - lr: 0.000010
2023-10-17 10:31:46,595 DEV : loss 0.19148583710193634 - f1-score (micro avg) 0.8285
2023-10-17 10:31:46,602 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:47,350 epoch 8 - iter 12/121 - loss 0.02759122 - time (sec): 0.75 - samples/sec: 3163.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:31:48,073 epoch 8 - iter 24/121 - loss 0.01779458 - time (sec): 1.47 - samples/sec: 3283.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:31:48,882 epoch 8 - iter 36/121 - loss 0.01975104 - time (sec): 2.28 - samples/sec: 3250.84 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:31:49,674 epoch 8 - iter 48/121 - loss 0.02004782 - time (sec): 3.07 - samples/sec: 3265.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:31:50,435 epoch 8 - iter 60/121 - loss 0.02196078 - time (sec): 3.83 - samples/sec: 3274.88 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:51,149 epoch 8 - iter 72/121 - loss 0.02076302 - time (sec): 4.55 - samples/sec: 3326.58 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:51,896 epoch 8 - iter 84/121 - loss 0.02046546 - time (sec): 5.29 - samples/sec: 3305.81 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:52,592 epoch 8 - iter 96/121 - loss 0.02295288 - time (sec): 5.99 - samples/sec: 3273.69 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:31:53,434 epoch 8 - iter 108/121 - loss 0.02253970 - time (sec): 6.83 - samples/sec: 3288.26 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:31:54,157 epoch 8 - iter 120/121 - loss 0.02232507 - time (sec): 7.55 - samples/sec: 3256.21 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:31:54,209 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:54,209 EPOCH 8 done: loss 0.0222 - lr: 0.000007
2023-10-17 10:31:54,991 DEV : loss 0.20282834768295288 - f1-score (micro avg) 0.8335
2023-10-17 10:31:54,998 saving best model
2023-10-17 10:31:55,515 ----------------------------------------------------------------------------------------------------
2023-10-17 10:31:56,245 epoch 9 - iter 12/121 - loss 0.02835444 - time (sec): 0.73 - samples/sec: 3199.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:57,028 epoch 9 - iter 24/121 - loss 0.02810493 - time (sec): 1.51 - samples/sec: 3014.12 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:57,834 epoch 9 - iter 36/121 - loss 0.02511085 - time (sec): 2.32 - samples/sec: 3074.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:58,596 epoch 9 - iter 48/121 - loss 0.01989365 - time (sec): 3.08 - samples/sec: 3162.59 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:31:59,361 epoch 9 - iter 60/121 - loss 0.02370575 - time (sec): 3.84 - samples/sec: 3136.55 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:32:00,170 epoch 9 - iter 72/121 - loss 0.02151435 - time (sec): 4.65 - samples/sec: 3130.56 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:32:00,899 epoch 9 - iter 84/121 - loss 0.02058558 - time (sec): 5.38 - samples/sec: 3137.75 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:32:01,638 epoch 9 - iter 96/121 - loss 0.01989155 - time (sec): 6.12 - samples/sec: 3183.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:32:02,392 epoch 9 - iter 108/121 - loss 0.01893313 - time (sec): 6.88 - samples/sec: 3207.02 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:32:03,161 epoch 9 - iter 120/121 - loss 0.01778422 - time (sec): 7.64 - samples/sec: 3210.96 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:32:03,225 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:03,226 EPOCH 9 done: loss 0.0176 - lr: 0.000004
2023-10-17 10:32:04,012 DEV : loss 0.21387448906898499 - f1-score (micro avg) 0.8369
2023-10-17 10:32:04,018 saving best model
2023-10-17 10:32:04,521 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:05,226 epoch 10 - iter 12/121 - loss 0.00693211 - time (sec): 0.70 - samples/sec: 3366.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:32:05,952 epoch 10 - iter 24/121 - loss 0.00979826 - time (sec): 1.43 - samples/sec: 3436.56 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:32:06,680 epoch 10 - iter 36/121 - loss 0.00987826 - time (sec): 2.16 - samples/sec: 3403.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:32:07,502 epoch 10 - iter 48/121 - loss 0.00849950 - time (sec): 2.98 - samples/sec: 3324.42 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:32:08,354 epoch 10 - iter 60/121 - loss 0.00902092 - time (sec): 3.83 - samples/sec: 3253.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:32:09,127 epoch 10 - iter 72/121 - loss 0.00916563 - time (sec): 4.60 - samples/sec: 3237.50 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:32:09,874 epoch 10 - iter 84/121 - loss 0.00985326 - time (sec): 5.35 - samples/sec: 3243.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:32:10,679 epoch 10 - iter 96/121 - loss 0.01078985 - time (sec): 6.16 - samples/sec: 3250.04 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:32:11,422 epoch 10 - iter 108/121 - loss 0.01288405 - time (sec): 6.90 - samples/sec: 3244.72 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:32:12,146 epoch 10 - iter 120/121 - loss 0.01283307 - time (sec): 7.62 - samples/sec: 3234.37 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:32:12,195 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:12,195 EPOCH 10 done: loss 0.0128 - lr: 0.000000
2023-10-17 10:32:12,981 DEV : loss 0.21676737070083618 - f1-score (micro avg) 0.8396
2023-10-17 10:32:12,986 saving best model
2023-10-17 10:32:13,945 ----------------------------------------------------------------------------------------------------
2023-10-17 10:32:13,947 Loading model from best epoch ...
2023-10-17 10:32:15,322 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
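The 25-tag dictionary above uses the BIOES scheme (S- for a single-token entity, B-/I-/E- for the begin, inside, and end of a multi-token entity) over six entity types plus O. A minimal sketch of how such a tag sequence decodes into entity spans (a hypothetical helper for illustration; Flair performs this decoding internally):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans, inclusive indices."""
    spans = []
    start = None  # (index, label) of a currently open B- span
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":        # entity begins
            start = (i, label)
        elif prefix == "E" and start is not None and start[1] == label:
            spans.append((start[0], i, label))  # entity ends
            start = None
        # I- tags simply continue an open span
    return spans
```

For example, `bioes_to_spans(["S-pers", "O", "B-loc", "I-loc", "E-loc"])` yields the spans `(0, 0, "pers")` and `(2, 4, "loc")`.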
2023-10-17 10:32:16,199
Results:
- F-score (micro) 0.8093
- F-score (macro) 0.5585
- Accuracy 0.7005

By class:
              precision    recall  f1-score   support

        pers     0.8207    0.8561    0.8380       139
       scope     0.8248    0.8760    0.8496       129
        work     0.7000    0.7875    0.7412        80
         loc     1.0000    0.2222    0.3636         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7941    0.8250    0.8093       360
   macro avg     0.6691    0.5484    0.5585       360
weighted avg     0.7930    0.8250    0.8018       360
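As a sanity check on the table above: the micro F1 is the harmonic mean of the micro-averaged precision and recall, and the macro F1 is the unweighted mean of the per-class F1 scores. A small sketch of that arithmetic:

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Micro F1 from the micro-averaged precision/recall in the table above.
micro_f1 = round(f1(0.7941, 0.8250), 4)  # matches the reported 0.8093

# Macro F1 as the unweighted mean of the per-class F1 scores.
per_class_f1 = [0.8380, 0.8496, 0.7412, 0.3636, 0.0000]
macro_f1 = round(sum(per_class_f1) / len(per_class_f1), 4)  # matches the reported 0.5585
```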
2023-10-17 10:32:16,199 ----------------------------------------------------------------------------------------------------