2023-10-17 15:06:19,081 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 Train: 7936 sentences
2023-10-17 15:06:19,082 (train_with_dev=False, train_with_test=False)
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 Training Params:
2023-10-17 15:06:19,082 - learning_rate: "5e-05"
2023-10-17 15:06:19,082 - mini_batch_size: "8"
2023-10-17 15:06:19,082 - max_epochs: "10"
2023-10-17 15:06:19,082 - shuffle: "True"
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Plugins:
2023-10-17 15:06:19,083 - TensorboardLogger
2023-10-17 15:06:19,083 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:06:19,083 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Computation:
2023-10-17 15:06:19,083 - compute on device: cuda:0
2023-10-17 15:06:19,083 - embedding storage: none
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:06:24,942 epoch 1 - iter 99/992 - loss 2.38046963 - time (sec): 5.86 - samples/sec: 2753.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:06:31,168 epoch 1 - iter 198/992 - loss 1.37254739 - time (sec): 12.08 - samples/sec: 2777.31 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:06:37,005 epoch 1 - iter 297/992 - loss 1.00559337 - time (sec): 17.92 - samples/sec: 2749.10 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:06:42,788 epoch 1 - iter 396/992 - loss 0.80698866 - time (sec): 23.70 - samples/sec: 2751.86 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:06:48,423 epoch 1 - iter 495/992 - loss 0.68842093 - time (sec): 29.34 - samples/sec: 2747.14 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:06:54,254 epoch 1 - iter 594/992 - loss 0.59727087 - time (sec): 35.17 - samples/sec: 2756.67 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:07:00,603 epoch 1 - iter 693/992 - loss 0.52832358 - time (sec): 41.52 - samples/sec: 2742.95 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:07:06,517 epoch 1 - iter 792/992 - loss 0.47754192 - time (sec): 47.43 - samples/sec: 2751.53 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:07:12,414 epoch 1 - iter 891/992 - loss 0.43922362 - time (sec): 53.33 - samples/sec: 2756.93 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:07:18,496 epoch 1 - iter 990/992 - loss 0.40646285 - time (sec): 59.41 - samples/sec: 2754.47 - lr: 0.000050 - momentum: 0.000000
2023-10-17 15:07:18,609 ----------------------------------------------------------------------------------------------------
2023-10-17 15:07:18,609 EPOCH 1 done: loss 0.4058 - lr: 0.000050
2023-10-17 15:07:21,928 DEV : loss 0.08580786734819412 - f1-score (micro avg) 0.7132
2023-10-17 15:07:21,960 saving best model
2023-10-17 15:07:23,223 ----------------------------------------------------------------------------------------------------
2023-10-17 15:07:29,185 epoch 2 - iter 99/992 - loss 0.10472742 - time (sec): 5.96 - samples/sec: 2858.77 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:07:35,198 epoch 2 - iter 198/992 - loss 0.10274139 - time (sec): 11.97 - samples/sec: 2791.29 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:07:40,921 epoch 2 - iter 297/992 - loss 0.10453930 - time (sec): 17.70 - samples/sec: 2811.26 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:07:46,491 epoch 2 - iter 396/992 - loss 0.10618343 - time (sec): 23.27 - samples/sec: 2805.37 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:07:52,431 epoch 2 - iter 495/992 - loss 0.10477397 - time (sec): 29.21 - samples/sec: 2810.58 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:07:58,372 epoch 2 - iter 594/992 - loss 0.10424898 - time (sec): 35.15 - samples/sec: 2787.08 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:08:04,514 epoch 2 - iter 693/992 - loss 0.10574565 - time (sec): 41.29 - samples/sec: 2760.39 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:08:10,405 epoch 2 - iter 792/992 - loss 0.10502822 - time (sec): 47.18 - samples/sec: 2759.11 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:08:16,744 epoch 2 - iter 891/992 - loss 0.10440715 - time (sec): 53.52 - samples/sec: 2751.13 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:08:22,649 epoch 2 - iter 990/992 - loss 0.10635558 - time (sec): 59.42 - samples/sec: 2754.71 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:08:22,763 ----------------------------------------------------------------------------------------------------
2023-10-17 15:08:22,763 EPOCH 2 done: loss 0.1063 - lr: 0.000044
2023-10-17 15:08:26,289 DEV : loss 0.09309504926204681 - f1-score (micro avg) 0.7346
2023-10-17 15:08:26,310 saving best model
2023-10-17 15:08:26,957 ----------------------------------------------------------------------------------------------------
2023-10-17 15:08:33,050 epoch 3 - iter 99/992 - loss 0.07641069 - time (sec): 6.09 - samples/sec: 2495.54 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:08:39,473 epoch 3 - iter 198/992 - loss 0.07843575 - time (sec): 12.51 - samples/sec: 2516.95 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:08:45,808 epoch 3 - iter 297/992 - loss 0.07377962 - time (sec): 18.85 - samples/sec: 2566.19 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:08:52,077 epoch 3 - iter 396/992 - loss 0.07136994 - time (sec): 25.12 - samples/sec: 2594.72 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:08:58,402 epoch 3 - iter 495/992 - loss 0.07162863 - time (sec): 31.44 - samples/sec: 2593.96 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:09:04,568 epoch 3 - iter 594/992 - loss 0.07280519 - time (sec): 37.61 - samples/sec: 2622.46 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:09:10,352 epoch 3 - iter 693/992 - loss 0.07280608 - time (sec): 43.39 - samples/sec: 2636.16 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:09:16,277 epoch 3 - iter 792/992 - loss 0.07274453 - time (sec): 49.32 - samples/sec: 2646.29 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:09:22,902 epoch 3 - iter 891/992 - loss 0.07234744 - time (sec): 55.94 - samples/sec: 2637.53 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:09:28,682 epoch 3 - iter 990/992 - loss 0.07312508 - time (sec): 61.72 - samples/sec: 2651.40 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:09:28,790 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:28,790 EPOCH 3 done: loss 0.0730 - lr: 0.000039
2023-10-17 15:09:32,577 DEV : loss 0.10305587947368622 - f1-score (micro avg) 0.7591
2023-10-17 15:09:32,610 saving best model
2023-10-17 15:09:33,090 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:38,942 epoch 4 - iter 99/992 - loss 0.05222278 - time (sec): 5.85 - samples/sec: 2696.28 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:09:45,319 epoch 4 - iter 198/992 - loss 0.05464912 - time (sec): 12.23 - samples/sec: 2664.59 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:09:51,435 epoch 4 - iter 297/992 - loss 0.05795324 - time (sec): 18.34 - samples/sec: 2641.93 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:09:57,594 epoch 4 - iter 396/992 - loss 0.05516303 - time (sec): 24.50 - samples/sec: 2658.01 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:10:03,797 epoch 4 - iter 495/992 - loss 0.05508194 - time (sec): 30.70 - samples/sec: 2659.79 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:10:09,783 epoch 4 - iter 594/992 - loss 0.05615006 - time (sec): 36.69 - samples/sec: 2665.02 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:10:15,945 epoch 4 - iter 693/992 - loss 0.05660499 - time (sec): 42.85 - samples/sec: 2666.84 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:10:21,870 epoch 4 - iter 792/992 - loss 0.05723636 - time (sec): 48.78 - samples/sec: 2675.19 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:10:27,762 epoch 4 - iter 891/992 - loss 0.05771745 - time (sec): 54.67 - samples/sec: 2694.56 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:10:33,702 epoch 4 - iter 990/992 - loss 0.05696544 - time (sec): 60.61 - samples/sec: 2700.46 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:10:33,832 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:33,832 EPOCH 4 done: loss 0.0572 - lr: 0.000033
2023-10-17 15:10:37,423 DEV : loss 0.1290876865386963 - f1-score (micro avg) 0.7365
2023-10-17 15:10:37,453 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:43,108 epoch 5 - iter 99/992 - loss 0.04105148 - time (sec): 5.65 - samples/sec: 2830.85 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:10:49,178 epoch 5 - iter 198/992 - loss 0.04148932 - time (sec): 11.72 - samples/sec: 2798.97 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:10:55,666 epoch 5 - iter 297/992 - loss 0.03980325 - time (sec): 18.21 - samples/sec: 2752.45 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:11:01,437 epoch 5 - iter 396/992 - loss 0.04154139 - time (sec): 23.98 - samples/sec: 2764.28 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:11:07,548 epoch 5 - iter 495/992 - loss 0.04253159 - time (sec): 30.09 - samples/sec: 2780.57 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:11:13,348 epoch 5 - iter 594/992 - loss 0.04313180 - time (sec): 35.89 - samples/sec: 2771.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:11:19,562 epoch 5 - iter 693/992 - loss 0.04278345 - time (sec): 42.11 - samples/sec: 2762.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:25,360 epoch 5 - iter 792/992 - loss 0.04253978 - time (sec): 47.91 - samples/sec: 2753.94 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:31,032 epoch 5 - iter 891/992 - loss 0.04250696 - time (sec): 53.58 - samples/sec: 2755.88 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:37,126 epoch 5 - iter 990/992 - loss 0.04181300 - time (sec): 59.67 - samples/sec: 2742.54 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:37,239 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:37,239 EPOCH 5 done: loss 0.0417 - lr: 0.000028
2023-10-17 15:11:41,340 DEV : loss 0.1687757819890976 - f1-score (micro avg) 0.7644
2023-10-17 15:11:41,374 saving best model
2023-10-17 15:11:41,959 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:48,364 epoch 6 - iter 99/992 - loss 0.02946611 - time (sec): 6.40 - samples/sec: 2586.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:54,568 epoch 6 - iter 198/992 - loss 0.03061488 - time (sec): 12.61 - samples/sec: 2654.29 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:12:00,552 epoch 6 - iter 297/992 - loss 0.02985251 - time (sec): 18.59 - samples/sec: 2698.65 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:06,580 epoch 6 - iter 396/992 - loss 0.02840164 - time (sec): 24.62 - samples/sec: 2714.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:12,475 epoch 6 - iter 495/992 - loss 0.02922618 - time (sec): 30.51 - samples/sec: 2725.65 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:18,608 epoch 6 - iter 594/992 - loss 0.02995890 - time (sec): 36.65 - samples/sec: 2740.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:24,344 epoch 6 - iter 693/992 - loss 0.03016712 - time (sec): 42.38 - samples/sec: 2735.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:30,413 epoch 6 - iter 792/992 - loss 0.03158247 - time (sec): 48.45 - samples/sec: 2714.37 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:12:36,424 epoch 6 - iter 891/992 - loss 0.03092480 - time (sec): 54.46 - samples/sec: 2713.22 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:12:42,336 epoch 6 - iter 990/992 - loss 0.03100699 - time (sec): 60.37 - samples/sec: 2710.72 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:12:42,446 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:42,447 EPOCH 6 done: loss 0.0311 - lr: 0.000022
2023-10-17 15:12:46,001 DEV : loss 0.18693169951438904 - f1-score (micro avg) 0.7604
2023-10-17 15:12:46,022 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:51,972 epoch 7 - iter 99/992 - loss 0.02441708 - time (sec): 5.95 - samples/sec: 2670.95 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:12:58,124 epoch 7 - iter 198/992 - loss 0.02229387 - time (sec): 12.10 - samples/sec: 2687.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:13:04,348 epoch 7 - iter 297/992 - loss 0.02315922 - time (sec): 18.32 - samples/sec: 2667.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:13:10,631 epoch 7 - iter 396/992 - loss 0.02270701 - time (sec): 24.61 - samples/sec: 2698.30 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:13:17,126 epoch 7 - iter 495/992 - loss 0.02221474 - time (sec): 31.10 - samples/sec: 2711.28 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:13:22,985 epoch 7 - iter 594/992 - loss 0.02215952 - time (sec): 36.96 - samples/sec: 2718.05 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:13:28,764 epoch 7 - iter 693/992 - loss 0.02222955 - time (sec): 42.74 - samples/sec: 2718.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:13:34,713 epoch 7 - iter 792/992 - loss 0.02198392 - time (sec): 48.69 - samples/sec: 2715.09 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:13:40,679 epoch 7 - iter 891/992 - loss 0.02263738 - time (sec): 54.66 - samples/sec: 2711.54 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:13:46,476 epoch 7 - iter 990/992 - loss 0.02209691 - time (sec): 60.45 - samples/sec: 2706.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:13:46,588 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:46,588 EPOCH 7 done: loss 0.0223 - lr: 0.000017
2023-10-17 15:13:50,192 DEV : loss 0.20075371861457825 - f1-score (micro avg) 0.7563
2023-10-17 15:13:50,226 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:56,194 epoch 8 - iter 99/992 - loss 0.01434494 - time (sec): 5.97 - samples/sec: 2726.76 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:14:01,983 epoch 8 - iter 198/992 - loss 0.01221880 - time (sec): 11.75 - samples/sec: 2747.78 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:14:08,169 epoch 8 - iter 297/992 - loss 0.01358537 - time (sec): 17.94 - samples/sec: 2718.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:14:14,284 epoch 8 - iter 396/992 - loss 0.01240926 - time (sec): 24.06 - samples/sec: 2714.16 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:14:20,536 epoch 8 - iter 495/992 - loss 0.01395736 - time (sec): 30.31 - samples/sec: 2686.43 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:14:26,357 epoch 8 - iter 594/992 - loss 0.01412399 - time (sec): 36.13 - samples/sec: 2688.93 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:14:32,373 epoch 8 - iter 693/992 - loss 0.01400427 - time (sec): 42.15 - samples/sec: 2696.74 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:14:38,421 epoch 8 - iter 792/992 - loss 0.01472146 - time (sec): 48.19 - samples/sec: 2706.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:14:44,134 epoch 8 - iter 891/992 - loss 0.01436421 - time (sec): 53.91 - samples/sec: 2713.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:14:50,402 epoch 8 - iter 990/992 - loss 0.01442694 - time (sec): 60.17 - samples/sec: 2719.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:14:50,524 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:50,525 EPOCH 8 done: loss 0.0146 - lr: 0.000011
2023-10-17 15:14:54,214 DEV : loss 0.22761167585849762 - f1-score (micro avg) 0.7606
2023-10-17 15:14:54,240 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:00,161 epoch 9 - iter 99/992 - loss 0.00799092 - time (sec): 5.92 - samples/sec: 2843.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:15:06,079 epoch 9 - iter 198/992 - loss 0.00828030 - time (sec): 11.84 - samples/sec: 2852.74 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:15:12,439 epoch 9 - iter 297/992 - loss 0.00916347 - time (sec): 18.20 - samples/sec: 2757.11 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:15:18,232 epoch 9 - iter 396/992 - loss 0.01105875 - time (sec): 23.99 - samples/sec: 2764.75 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:15:24,475 epoch 9 - iter 495/992 - loss 0.01018621 - time (sec): 30.23 - samples/sec: 2754.32 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:15:30,268 epoch 9 - iter 594/992 - loss 0.01068382 - time (sec): 36.03 - samples/sec: 2751.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:15:36,445 epoch 9 - iter 693/992 - loss 0.01008544 - time (sec): 42.20 - samples/sec: 2739.54 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:15:42,355 epoch 9 - iter 792/992 - loss 0.00998382 - time (sec): 48.11 - samples/sec: 2739.36 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:15:48,413 epoch 9 - iter 891/992 - loss 0.00988621 - time (sec): 54.17 - samples/sec: 2728.23 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:15:54,351 epoch 9 - iter 990/992 - loss 0.00985186 - time (sec): 60.11 - samples/sec: 2723.30 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:15:54,459 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:54,459 EPOCH 9 done: loss 0.0098 - lr: 0.000006
2023-10-17 15:15:58,980 DEV : loss 0.24459530413150787 - f1-score (micro avg) 0.7596
2023-10-17 15:15:59,006 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:05,391 epoch 10 - iter 99/992 - loss 0.00479346 - time (sec): 6.38 - samples/sec: 2649.86 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:16:11,650 epoch 10 - iter 198/992 - loss 0.00564539 - time (sec): 12.64 - samples/sec: 2639.92 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:16:17,940 epoch 10 - iter 297/992 - loss 0.00670906 - time (sec): 18.93 - samples/sec: 2650.03 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:16:24,031 epoch 10 - iter 396/992 - loss 0.00703610 - time (sec): 25.02 - samples/sec: 2619.19 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:16:30,132 epoch 10 - iter 495/992 - loss 0.00702067 - time (sec): 31.12 - samples/sec: 2652.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:16:36,083 epoch 10 - iter 594/992 - loss 0.00679611 - time (sec): 37.07 - samples/sec: 2651.23 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:16:41,761 epoch 10 - iter 693/992 - loss 0.00690832 - time (sec): 42.75 - samples/sec: 2680.33 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:16:47,817 epoch 10 - iter 792/992 - loss 0.00687466 - time (sec): 48.81 - samples/sec: 2673.00 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:16:53,876 epoch 10 - iter 891/992 - loss 0.00684382 - time (sec): 54.87 - samples/sec: 2661.32 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:17:00,571 epoch 10 - iter 990/992 - loss 0.00637386 - time (sec): 61.56 - samples/sec: 2659.92 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:17:00,687 ----------------------------------------------------------------------------------------------------
2023-10-17 15:17:00,687 EPOCH 10 done: loss 0.0064 - lr: 0.000000
2023-10-17 15:17:04,344 DEV : loss 0.24962207674980164 - f1-score (micro avg) 0.7634
2023-10-17 15:17:04,822 ----------------------------------------------------------------------------------------------------
2023-10-17 15:17:04,823 Loading model from best epoch ...
2023-10-17 15:17:06,444 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 15:17:10,062 Results:
- F-score (micro) 0.7653
- F-score (macro) 0.665
- Accuracy 0.6489

By class:
              precision    recall  f1-score   support

         LOC     0.7975    0.8779    0.8358       655
         PER     0.6768    0.7982    0.7325       223
         ORG     0.4554    0.4016    0.4268       127

   micro avg     0.7336    0.8000    0.7653      1005
   macro avg     0.6432    0.6925    0.6650      1005
weighted avg     0.7275    0.8000    0.7612      1005

2023-10-17 15:17:10,062 ----------------------------------------------------------------------------------------------------
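
Note: the run logged above can be reproduced approximately with Flair's fine-tuning API. The sketch below is a reconstruction under stated assumptions, not the original hmBench training script: it assumes Flair's NER_ICDAR_EUROPEANA corpus loader, the hmteams/teams-base-historic-multilingual-discriminator checkpoint on the Hugging Face Hub, and Flair's default AdamW optimizer with a linear warmup schedule (consistent with the LinearScheduler plugin and warmup_fraction 0.1 in the header); the output path and hidden_size value are illustrative only.

    # Minimal sketch (not the original hmBench script): fine-tune a Flair SequenceTagger
    # on the French ICDAR-Europeana NER corpus with the hyperparameters from the log above.
    from flair.datasets import NER_ICDAR_EUROPEANA
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # Corpus: 7936 train / 992 dev / 992 test sentences (see the MultiCorpus line above).
    corpus = NER_ICDAR_EUROPEANA(language="fr")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # Embeddings: last layer, first-subtoken pooling, fine-tuned
    # ("poolingfirst" and "layers-1" in the run name).
    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # Linear tag head without CRF or RNN, matching the architecture dump ("crfFalse" in the run name).
    tagger = SequenceTagger(
        hidden_size=256,          # unused without an RNN; value is illustrative
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)

    # fine_tune() defaults to AdamW with a linear schedule and 10% warmup,
    # matching "LinearScheduler | warmup_fraction: '0.1'" above.
    trainer.fine_tune(
        "hmbench-icdar/fr-example-output",   # illustrative path, not the original base path
        learning_rate=5e-05,
        mini_batch_size=8,
        max_epochs=10,
    )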