2023-10-17 15:55:52,625 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,626 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:55:52,626 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,626 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 15:55:52,627 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,627 Train: 5777 sentences
2023-10-17 15:55:52,627 (train_with_dev=False, train_with_test=False)
2023-10-17 15:55:52,627 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,627 Training Params:
2023-10-17 15:55:52,627 - learning_rate: "5e-05"
2023-10-17 15:55:52,627 - mini_batch_size: "8"
2023-10-17 15:55:52,627 - max_epochs: "10"
2023-10-17 15:55:52,627 - shuffle: "True"
2023-10-17 15:55:52,627 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,627 Plugins:
2023-10-17 15:55:52,627 - TensorboardLogger
2023-10-17 15:55:52,627 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:55:52,627 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,627 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:55:52,627 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:55:52,627 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,627 Computation:
2023-10-17 15:55:52,627 - compute on device: cuda:0
2023-10-17 15:55:52,627 - embedding storage: none
2023-10-17 15:55:52,627 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,627 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 15:55:52,628 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,628 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:52,628 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:55:58,075 epoch 1 - iter 72/723 - loss 2.37947798 - time (sec): 5.45 - samples/sec: 3407.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:56:03,002 epoch 1 - iter 144/723 - loss 1.44089682 - time (sec): 10.37 - samples/sec: 3320.20 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:56:08,099 epoch 1 - iter 216/723 - loss 1.01364540 - time (sec): 15.47 - samples/sec: 3367.82 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:56:13,081 epoch 1 - iter 288/723 - loss 0.80827500 - time (sec): 20.45 - samples/sec: 3338.59 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:56:18,269 epoch 1 - iter 360/723 - loss 0.67177544 - time (sec): 25.64 - samples/sec: 3373.62 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:56:23,923 epoch 1 - iter 432/723 - loss 0.57566837 - time (sec): 31.29 - samples/sec: 3371.60 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:56:29,072 epoch 1 - iter 504/723 - loss 0.51023937 - time (sec): 36.44 - samples/sec: 3378.90 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:56:34,679 epoch 1 - iter 576/723 - loss 0.46099667 - time (sec): 42.05 - samples/sec: 3356.09 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:56:39,840 epoch 1 - iter 648/723 - loss 0.42468783 - time (sec): 47.21 - samples/sec: 3357.68 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:56:44,665 epoch 1 - iter 720/723 - loss 0.39489656 - time (sec): 52.04 - samples/sec: 3375.89 - lr: 0.000050 - momentum: 0.000000
2023-10-17 15:56:44,835 ----------------------------------------------------------------------------------------------------
2023-10-17 15:56:44,836 EPOCH 1 done: loss 0.3940 - lr: 0.000050
2023-10-17 15:56:47,691 DEV : loss 0.08297927677631378 - f1-score (micro avg) 0.7587
2023-10-17 15:56:47,707 saving best model
2023-10-17 15:56:48,068 ----------------------------------------------------------------------------------------------------
2023-10-17 15:56:52,901 epoch 2 - iter 72/723 - loss 0.10260098 - time (sec): 4.83 - samples/sec: 3443.65 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:56:58,055 epoch 2 - iter 144/723 - loss 0.09803331 - time (sec): 9.99 - samples/sec: 3398.30 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:57:03,377 epoch 2 - iter 216/723 - loss 0.09312094 - time (sec): 15.31 - samples/sec: 3353.09 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:57:08,520 epoch 2 - iter 288/723 - loss 0.08851617 - time (sec): 20.45 - samples/sec: 3347.48 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:57:13,975 epoch 2 - iter 360/723 - loss 0.08633661 - time (sec): 25.91 - samples/sec: 3349.76 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:57:19,512 epoch 2 - iter 432/723 - loss 0.08348871 - time (sec): 31.44 - samples/sec: 3367.52 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:57:24,687 epoch 2 - iter 504/723 - loss 0.08371297 - time (sec): 36.62 - samples/sec: 3351.57 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:57:30,095 epoch 2 - iter 576/723 - loss 0.08416410 - time (sec): 42.03 - samples/sec: 3342.54 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:57:35,291 epoch 2 - iter 648/723 - loss 0.08466015 - time (sec): 47.22 - samples/sec: 3334.28 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:57:40,848 epoch 2 - iter 720/723 - loss 0.08309979 - time (sec): 52.78 - samples/sec: 3329.87 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:57:40,989 ----------------------------------------------------------------------------------------------------
2023-10-17 15:57:40,990 EPOCH 2 done: loss 0.0831 - lr: 0.000044
2023-10-17 15:57:45,313 DEV : loss 0.08019406348466873 - f1-score (micro avg) 0.7932
2023-10-17 15:57:45,342 saving best model
2023-10-17 15:57:45,877 ----------------------------------------------------------------------------------------------------
2023-10-17 15:57:51,210 epoch 3 - iter 72/723 - loss 0.06351360 - time (sec): 5.33 - samples/sec: 3259.26 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:57:56,293 epoch 3 - iter 144/723 - loss 0.06222447 - time (sec): 10.41 - samples/sec: 3325.69 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:58:01,784 epoch 3 - iter 216/723 - loss 0.05814085 - time (sec): 15.90 - samples/sec: 3370.01 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:58:06,838 epoch 3 - iter 288/723 - loss 0.05953396 - time (sec): 20.96 - samples/sec: 3374.65 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:58:11,731 epoch 3 - iter 360/723 - loss 0.05864225 - time (sec): 25.85 - samples/sec: 3384.29 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:58:16,914 epoch 3 - iter 432/723 - loss 0.05836093 - time (sec): 31.03 - samples/sec: 3393.36 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:58:22,123 epoch 3 - iter 504/723 - loss 0.05729257 - time (sec): 36.24 - samples/sec: 3367.08 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:58:27,144 epoch 3 - iter 576/723 - loss 0.05782325 - time (sec): 41.26 - samples/sec: 3380.00 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:58:32,502 epoch 3 - iter 648/723 - loss 0.05829220 - time (sec): 46.62 - samples/sec: 3381.03 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:58:37,805 epoch 3 - iter 720/723 - loss 0.05769844 - time (sec): 51.93 - samples/sec: 3385.52 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:58:37,963 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:37,963 EPOCH 3 done: loss 0.0579 - lr: 0.000039
2023-10-17 15:58:41,429 DEV : loss 0.07650606334209442 - f1-score (micro avg) 0.861
2023-10-17 15:58:41,449 saving best model
2023-10-17 15:58:42,093 ----------------------------------------------------------------------------------------------------
2023-10-17 15:58:47,673 epoch 4 - iter 72/723 - loss 0.03823239 - time (sec): 5.57 - samples/sec: 3273.99 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:58:53,016 epoch 4 - iter 144/723 - loss 0.04683747 - time (sec): 10.92 - samples/sec: 3257.37 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:58:58,126 epoch 4 - iter 216/723 - loss 0.04128104 - time (sec): 16.03 - samples/sec: 3301.87 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:59:03,292 epoch 4 - iter 288/723 - loss 0.04079176 - time (sec): 21.19 - samples/sec: 3317.40 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:59:08,209 epoch 4 - iter 360/723 - loss 0.04062909 - time (sec): 26.11 - samples/sec: 3328.56 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:59:13,592 epoch 4 - iter 432/723 - loss 0.04066133 - time (sec): 31.49 - samples/sec: 3318.86 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:59:19,610 epoch 4 - iter 504/723 - loss 0.04003049 - time (sec): 37.51 - samples/sec: 3260.74 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:59:24,910 epoch 4 - iter 576/723 - loss 0.03955197 - time (sec): 42.81 - samples/sec: 3267.84 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:59:30,268 epoch 4 - iter 648/723 - loss 0.03999115 - time (sec): 48.17 - samples/sec: 3272.18 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:59:35,731 epoch 4 - iter 720/723 - loss 0.04250353 - time (sec): 53.63 - samples/sec: 3277.71 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:59:35,904 ----------------------------------------------------------------------------------------------------
2023-10-17 15:59:35,904 EPOCH 4 done: loss 0.0424 - lr: 0.000033
2023-10-17 15:59:39,265 DEV : loss 0.07505105435848236 - f1-score (micro avg) 0.8508
2023-10-17 15:59:39,285 ----------------------------------------------------------------------------------------------------
2023-10-17 15:59:44,496 epoch 5 - iter 72/723 - loss 0.02322825 - time (sec): 5.21 - samples/sec: 3397.18 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:59:49,984 epoch 5 - iter 144/723 - loss 0.02522339 - time (sec): 10.70 - samples/sec: 3359.84 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:59:55,357 epoch 5 - iter 216/723 - loss 0.02748650 - time (sec): 16.07 - samples/sec: 3344.27 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:00:00,150 epoch 5 - iter 288/723 - loss 0.02774826 - time (sec): 20.86 - samples/sec: 3377.69 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:00:05,489 epoch 5 - iter 360/723 - loss 0.02809467 - time (sec): 26.20 - samples/sec: 3351.38 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:00:10,895 epoch 5 - iter 432/723 - loss 0.03187831 - time (sec): 31.61 - samples/sec: 3318.68 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:00:16,149 epoch 5 - iter 504/723 - loss 0.03247153 - time (sec): 36.86 - samples/sec: 3311.27 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:00:21,641 epoch 5 - iter 576/723 - loss 0.03216390 - time (sec): 42.35 - samples/sec: 3319.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:00:26,598 epoch 5 - iter 648/723 - loss 0.03247873 - time (sec): 47.31 - samples/sec: 3336.04 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:00:32,135 epoch 5 - iter 720/723 - loss 0.03226933 - time (sec): 52.85 - samples/sec: 3323.80 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:00:32,330 ----------------------------------------------------------------------------------------------------
2023-10-17 16:00:32,330 EPOCH 5 done: loss 0.0322 - lr: 0.000028
2023-10-17 16:00:36,192 DEV : loss 0.10498460382223129 - f1-score (micro avg) 0.8403
2023-10-17 16:00:36,211 ----------------------------------------------------------------------------------------------------
2023-10-17 16:00:41,618 epoch 6 - iter 72/723 - loss 0.02253230 - time (sec): 5.41 - samples/sec: 3237.77 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:00:46,772 epoch 6 - iter 144/723 - loss 0.02540519 - time (sec): 10.56 - samples/sec: 3223.17 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:00:52,011 epoch 6 - iter 216/723 - loss 0.02564947 - time (sec): 15.80 - samples/sec: 3265.77 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:00:57,844 epoch 6 - iter 288/723 - loss 0.02408256 - time (sec): 21.63 - samples/sec: 3226.31 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:01:03,415 epoch 6 - iter 360/723 - loss 0.02433276 - time (sec): 27.20 - samples/sec: 3234.82 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:01:08,535 epoch 6 - iter 432/723 - loss 0.02382939 - time (sec): 32.32 - samples/sec: 3230.90 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:01:13,688 epoch 6 - iter 504/723 - loss 0.02331952 - time (sec): 37.48 - samples/sec: 3270.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:01:18,912 epoch 6 - iter 576/723 - loss 0.02289724 - time (sec): 42.70 - samples/sec: 3256.96 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:01:24,284 epoch 6 - iter 648/723 - loss 0.02282930 - time (sec): 48.07 - samples/sec: 3260.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:01:29,834 epoch 6 - iter 720/723 - loss 0.02229213 - time (sec): 53.62 - samples/sec: 3272.85 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:01:30,026 ----------------------------------------------------------------------------------------------------
2023-10-17 16:01:30,027 EPOCH 6 done: loss 0.0222 - lr: 0.000022
2023-10-17 16:01:33,283 DEV : loss 0.10264434665441513 - f1-score (micro avg) 0.8651
2023-10-17 16:01:33,308 saving best model
2023-10-17 16:01:33,847 ----------------------------------------------------------------------------------------------------
2023-10-17 16:01:39,082 epoch 7 - iter 72/723 - loss 0.02479169 - time (sec): 5.23 - samples/sec: 3301.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:01:44,110 epoch 7 - iter 144/723 - loss 0.01936541 - time (sec): 10.26 - samples/sec: 3316.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:01:49,477 epoch 7 - iter 216/723 - loss 0.02032545 - time (sec): 15.63 - samples/sec: 3324.49 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:01:55,078 epoch 7 - iter 288/723 - loss 0.01949508 - time (sec): 21.23 - samples/sec: 3298.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:02:00,277 epoch 7 - iter 360/723 - loss 0.01828023 - time (sec): 26.43 - samples/sec: 3304.72 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:02:05,591 epoch 7 - iter 432/723 - loss 0.01757906 - time (sec): 31.74 - samples/sec: 3339.90 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:02:10,817 epoch 7 - iter 504/723 - loss 0.01601183 - time (sec): 36.97 - samples/sec: 3330.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:02:16,408 epoch 7 - iter 576/723 - loss 0.01523872 - time (sec): 42.56 - samples/sec: 3294.84 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:02:21,808 epoch 7 - iter 648/723 - loss 0.01596978 - time (sec): 47.96 - samples/sec: 3296.43 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:02:26,919 epoch 7 - iter 720/723 - loss 0.01592490 - time (sec): 53.07 - samples/sec: 3312.38 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:02:27,093 ----------------------------------------------------------------------------------------------------
2023-10-17 16:02:27,093 EPOCH 7 done: loss 0.0159 - lr: 0.000017
2023-10-17 16:02:30,303 DEV : loss 0.12677793204784393 - f1-score (micro avg) 0.8562
2023-10-17 16:02:30,320 ----------------------------------------------------------------------------------------------------
2023-10-17 16:02:35,341 epoch 8 - iter 72/723 - loss 0.00768027 - time (sec): 5.02 - samples/sec: 3216.93 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:02:40,992 epoch 8 - iter 144/723 - loss 0.00861469 - time (sec): 10.67 - samples/sec: 3237.83 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:02:45,889 epoch 8 - iter 216/723 - loss 0.00745035 - time (sec): 15.57 - samples/sec: 3359.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:02:51,120 epoch 8 - iter 288/723 - loss 0.00916909 - time (sec): 20.80 - samples/sec: 3319.12 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:02:56,348 epoch 8 - iter 360/723 - loss 0.00933421 - time (sec): 26.03 - samples/sec: 3307.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:03:02,032 epoch 8 - iter 432/723 - loss 0.00912919 - time (sec): 31.71 - samples/sec: 3297.77 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:03:07,157 epoch 8 - iter 504/723 - loss 0.00939804 - time (sec): 36.84 - samples/sec: 3322.55 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:03:12,227 epoch 8 - iter 576/723 - loss 0.00986990 - time (sec): 41.91 - samples/sec: 3323.69 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:03:17,717 epoch 8 - iter 648/723 - loss 0.00970328 - time (sec): 47.40 - samples/sec: 3330.36 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:03:22,900 epoch 8 - iter 720/723 - loss 0.01012613 - time (sec): 52.58 - samples/sec: 3337.38 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:03:23,135 ----------------------------------------------------------------------------------------------------
2023-10-17 16:03:23,136 EPOCH 8 done: loss 0.0101 - lr: 0.000011
2023-10-17 16:03:26,332 DEV : loss 0.1282780021429062 - f1-score (micro avg) 0.8682
2023-10-17 16:03:26,349 saving best model
2023-10-17 16:03:26,916 ----------------------------------------------------------------------------------------------------
2023-10-17 16:03:32,954 epoch 9 - iter 72/723 - loss 0.00707734 - time (sec): 6.03 - samples/sec: 3186.11 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:03:37,845 epoch 9 - iter 144/723 - loss 0.00625959 - time (sec): 10.92 - samples/sec: 3214.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:03:43,549 epoch 9 - iter 216/723 - loss 0.00647191 - time (sec): 16.62 - samples/sec: 3258.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:03:49,022 epoch 9 - iter 288/723 - loss 0.00700090 - time (sec): 22.10 - samples/sec: 3278.32 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:03:54,257 epoch 9 - iter 360/723 - loss 0.00711976 - time (sec): 27.33 - samples/sec: 3277.30 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:03:59,095 epoch 9 - iter 432/723 - loss 0.00675222 - time (sec): 32.17 - samples/sec: 3281.09 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:04:04,324 epoch 9 - iter 504/723 - loss 0.00726827 - time (sec): 37.40 - samples/sec: 3294.08 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:04:09,786 epoch 9 - iter 576/723 - loss 0.00751472 - time (sec): 42.86 - samples/sec: 3297.36 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:04:15,024 epoch 9 - iter 648/723 - loss 0.00706634 - time (sec): 48.10 - samples/sec: 3304.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:04:19,751 epoch 9 - iter 720/723 - loss 0.00724864 - time (sec): 52.82 - samples/sec: 3321.30 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:04:20,017 ----------------------------------------------------------------------------------------------------
2023-10-17 16:04:20,017 EPOCH 9 done: loss 0.0072 - lr: 0.000006
2023-10-17 16:04:23,192 DEV : loss 0.14683492481708527 - f1-score (micro avg) 0.8667
2023-10-17 16:04:23,209 ----------------------------------------------------------------------------------------------------
2023-10-17 16:04:28,333 epoch 10 - iter 72/723 - loss 0.00847938 - time (sec): 5.12 - samples/sec: 3406.42 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:04:33,664 epoch 10 - iter 144/723 - loss 0.00565771 - time (sec): 10.45 - samples/sec: 3353.37 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:04:38,677 epoch 10 - iter 216/723 - loss 0.00518803 - time (sec): 15.47 - samples/sec: 3342.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:04:43,557 epoch 10 - iter 288/723 - loss 0.00507785 - time (sec): 20.35 - samples/sec: 3339.28 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:04:49,150 epoch 10 - iter 360/723 - loss 0.00510449 - time (sec): 25.94 - samples/sec: 3341.65 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:04:54,672 epoch 10 - iter 432/723 - loss 0.00546394 - time (sec): 31.46 - samples/sec: 3349.75 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:04:59,864 epoch 10 - iter 504/723 - loss 0.00482428 - time (sec): 36.65 - samples/sec: 3329.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:05:05,555 epoch 10 - iter 576/723 - loss 0.00506029 - time (sec): 42.34 - samples/sec: 3308.61 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:05:10,771 epoch 10 - iter 648/723 - loss 0.00511610 - time (sec): 47.56 - samples/sec: 3324.46 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:05:16,035 epoch 10 - iter 720/723 - loss 0.00523258 - time (sec): 52.82 - samples/sec: 3326.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:05:16,193 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:16,193 EPOCH 10 done: loss 0.0052 - lr: 0.000000
2023-10-17 16:05:19,916 DEV : loss 0.14944744110107422 - f1-score (micro avg) 0.8688
2023-10-17 16:05:19,934 saving best model
2023-10-17 16:05:20,741 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:20,742 Loading model from best epoch ...
2023-10-17 16:05:22,172 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 16:05:25,188 
Results:
- F-score (micro) 0.8476
- F-score (macro) 0.7371
- Accuracy 0.7448

By class:
              precision    recall  f1-score   support

         PER     0.8633    0.8257    0.8441       482
         LOC     0.9458    0.8755    0.9093       458
         ORG     0.4839    0.4348    0.4580        69

   micro avg     0.8754    0.8216    0.8476      1009
   macro avg     0.7643    0.7120    0.7371      1009
weighted avg     0.8748    0.8216    0.8473      1009

2023-10-17 16:05:25,188 ----------------------------------------------------------------------------------------------------