|
2023-10-17 16:32:46,824 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,826 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 16:32:46,826 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,826 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-17 16:32:46,826 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,826 Train: 3575 sentences |
|
2023-10-17 16:32:46,827 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 16:32:46,827 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,827 Training Params: |
|
2023-10-17 16:32:46,827 - learning_rate: "5e-05" |
|
2023-10-17 16:32:46,827 - mini_batch_size: "4" |
|
2023-10-17 16:32:46,827 - max_epochs: "10" |
|
2023-10-17 16:32:46,827 - shuffle: "True" |
|
2023-10-17 16:32:46,827 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,827 Plugins: |
|
2023-10-17 16:32:46,827 - TensorboardLogger |
|
2023-10-17 16:32:46,827 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 16:32:46,827 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,827 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 16:32:46,827 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 16:32:46,827 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,828 Computation: |
|
2023-10-17 16:32:46,828 - compute on device: cuda:0 |
|
2023-10-17 16:32:46,828 - embedding storage: none |
|
2023-10-17 16:32:46,828 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,828 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-17 16:32:46,828 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,828 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:46,828 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 16:32:53,873 epoch 1 - iter 89/894 - loss 3.11410852 - time (sec): 7.04 - samples/sec: 1268.20 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 16:33:00,942 epoch 1 - iter 178/894 - loss 1.92954730 - time (sec): 14.11 - samples/sec: 1229.02 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 16:33:08,109 epoch 1 - iter 267/894 - loss 1.46044506 - time (sec): 21.28 - samples/sec: 1176.16 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 16:33:15,623 epoch 1 - iter 356/894 - loss 1.15857887 - time (sec): 28.79 - samples/sec: 1202.71 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 16:33:22,680 epoch 1 - iter 445/894 - loss 0.99646697 - time (sec): 35.85 - samples/sec: 1201.91 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 16:33:29,890 epoch 1 - iter 534/894 - loss 0.88147494 - time (sec): 43.06 - samples/sec: 1200.57 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 16:33:36,882 epoch 1 - iter 623/894 - loss 0.79966928 - time (sec): 50.05 - samples/sec: 1191.36 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 16:33:43,748 epoch 1 - iter 712/894 - loss 0.73134493 - time (sec): 56.92 - samples/sec: 1203.05 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 16:33:50,870 epoch 1 - iter 801/894 - loss 0.67281431 - time (sec): 64.04 - samples/sec: 1216.42 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 16:33:57,724 epoch 1 - iter 890/894 - loss 0.62977212 - time (sec): 70.89 - samples/sec: 1216.64 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-17 16:33:58,031 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:33:58,032 EPOCH 1 done: loss 0.6287 - lr: 0.000050 |
|
2023-10-17 16:34:04,362 DEV : loss 0.20323888957500458 - f1-score (micro avg) 0.6421 |
|
2023-10-17 16:34:04,418 saving best model |
|
2023-10-17 16:34:04,952 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:34:11,744 epoch 2 - iter 89/894 - loss 0.16949173 - time (sec): 6.79 - samples/sec: 1248.93 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 16:34:19,004 epoch 2 - iter 178/894 - loss 0.17515983 - time (sec): 14.05 - samples/sec: 1281.61 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 16:34:25,969 epoch 2 - iter 267/894 - loss 0.17426985 - time (sec): 21.01 - samples/sec: 1246.46 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 16:34:33,351 epoch 2 - iter 356/894 - loss 0.17013608 - time (sec): 28.40 - samples/sec: 1246.77 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 16:34:40,525 epoch 2 - iter 445/894 - loss 0.16745688 - time (sec): 35.57 - samples/sec: 1239.00 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 16:34:47,700 epoch 2 - iter 534/894 - loss 0.16144297 - time (sec): 42.75 - samples/sec: 1217.63 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 16:34:54,935 epoch 2 - iter 623/894 - loss 0.15623398 - time (sec): 49.98 - samples/sec: 1222.12 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 16:35:02,179 epoch 2 - iter 712/894 - loss 0.15158756 - time (sec): 57.23 - samples/sec: 1228.90 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 16:35:09,277 epoch 2 - iter 801/894 - loss 0.15137973 - time (sec): 64.32 - samples/sec: 1219.42 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 16:35:16,421 epoch 2 - iter 890/894 - loss 0.15238150 - time (sec): 71.47 - samples/sec: 1207.65 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 16:35:16,734 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:35:16,735 EPOCH 2 done: loss 0.1522 - lr: 0.000044 |
|
2023-10-17 16:35:28,076 DEV : loss 0.18639299273490906 - f1-score (micro avg) 0.6714 |
|
2023-10-17 16:35:28,132 saving best model |
|
2023-10-17 16:35:29,529 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:35:36,530 epoch 3 - iter 89/894 - loss 0.11203303 - time (sec): 7.00 - samples/sec: 1150.94 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 16:35:43,592 epoch 3 - iter 178/894 - loss 0.10878561 - time (sec): 14.06 - samples/sec: 1194.19 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 16:35:50,849 epoch 3 - iter 267/894 - loss 0.09676878 - time (sec): 21.32 - samples/sec: 1209.94 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 16:35:57,786 epoch 3 - iter 356/894 - loss 0.09079084 - time (sec): 28.25 - samples/sec: 1205.20 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 16:36:04,925 epoch 3 - iter 445/894 - loss 0.08840102 - time (sec): 35.39 - samples/sec: 1221.78 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 16:36:12,130 epoch 3 - iter 534/894 - loss 0.09039810 - time (sec): 42.60 - samples/sec: 1208.26 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 16:36:19,527 epoch 3 - iter 623/894 - loss 0.08822105 - time (sec): 49.99 - samples/sec: 1209.13 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 16:36:26,922 epoch 3 - iter 712/894 - loss 0.08881662 - time (sec): 57.39 - samples/sec: 1201.43 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 16:36:34,295 epoch 3 - iter 801/894 - loss 0.09266436 - time (sec): 64.76 - samples/sec: 1194.34 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 16:36:41,729 epoch 3 - iter 890/894 - loss 0.09358859 - time (sec): 72.20 - samples/sec: 1194.00 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 16:36:42,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:36:42,065 EPOCH 3 done: loss 0.0937 - lr: 0.000039 |
|
2023-10-17 16:36:53,475 DEV : loss 0.17432451248168945 - f1-score (micro avg) 0.7389 |
|
2023-10-17 16:36:53,531 saving best model |
|
2023-10-17 16:36:54,922 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:37:02,198 epoch 4 - iter 89/894 - loss 0.05862225 - time (sec): 7.27 - samples/sec: 1317.64 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 16:37:09,386 epoch 4 - iter 178/894 - loss 0.05796894 - time (sec): 14.46 - samples/sec: 1320.39 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 16:37:16,288 epoch 4 - iter 267/894 - loss 0.05799161 - time (sec): 21.36 - samples/sec: 1272.90 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 16:37:23,185 epoch 4 - iter 356/894 - loss 0.05680783 - time (sec): 28.26 - samples/sec: 1239.46 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 16:37:30,107 epoch 4 - iter 445/894 - loss 0.05803381 - time (sec): 35.18 - samples/sec: 1236.96 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 16:37:37,232 epoch 4 - iter 534/894 - loss 0.05629713 - time (sec): 42.31 - samples/sec: 1236.71 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 16:37:44,144 epoch 4 - iter 623/894 - loss 0.05739990 - time (sec): 49.22 - samples/sec: 1231.42 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 16:37:51,253 epoch 4 - iter 712/894 - loss 0.05819088 - time (sec): 56.33 - samples/sec: 1231.58 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 16:37:58,226 epoch 4 - iter 801/894 - loss 0.05824462 - time (sec): 63.30 - samples/sec: 1229.49 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 16:38:05,069 epoch 4 - iter 890/894 - loss 0.06068208 - time (sec): 70.14 - samples/sec: 1228.00 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 16:38:05,382 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:38:05,382 EPOCH 4 done: loss 0.0612 - lr: 0.000033 |
|
2023-10-17 16:38:16,763 DEV : loss 0.17829285562038422 - f1-score (micro avg) 0.7578 |
|
2023-10-17 16:38:16,817 saving best model |
|
2023-10-17 16:38:18,207 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:38:25,151 epoch 5 - iter 89/894 - loss 0.02982003 - time (sec): 6.94 - samples/sec: 1208.40 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 16:38:32,390 epoch 5 - iter 178/894 - loss 0.03304612 - time (sec): 14.18 - samples/sec: 1271.34 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 16:38:39,495 epoch 5 - iter 267/894 - loss 0.03586922 - time (sec): 21.28 - samples/sec: 1258.57 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 16:38:47,102 epoch 5 - iter 356/894 - loss 0.04145233 - time (sec): 28.89 - samples/sec: 1216.66 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 16:38:54,547 epoch 5 - iter 445/894 - loss 0.04280853 - time (sec): 36.33 - samples/sec: 1189.27 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 16:39:02,581 epoch 5 - iter 534/894 - loss 0.04154607 - time (sec): 44.37 - samples/sec: 1176.45 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 16:39:09,838 epoch 5 - iter 623/894 - loss 0.04203572 - time (sec): 51.63 - samples/sec: 1175.56 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 16:39:17,085 epoch 5 - iter 712/894 - loss 0.04172114 - time (sec): 58.87 - samples/sec: 1184.70 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 16:39:24,099 epoch 5 - iter 801/894 - loss 0.04233064 - time (sec): 65.89 - samples/sec: 1182.10 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 16:39:31,558 epoch 5 - iter 890/894 - loss 0.04024411 - time (sec): 73.35 - samples/sec: 1176.40 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 16:39:31,872 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:39:31,872 EPOCH 5 done: loss 0.0403 - lr: 0.000028 |
|
2023-10-17 16:39:43,356 DEV : loss 0.2713957130908966 - f1-score (micro avg) 0.7753 |
|
2023-10-17 16:39:43,411 saving best model |
|
2023-10-17 16:39:44,800 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:39:51,905 epoch 6 - iter 89/894 - loss 0.03784356 - time (sec): 7.10 - samples/sec: 1252.78 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 16:39:59,301 epoch 6 - iter 178/894 - loss 0.03078874 - time (sec): 14.50 - samples/sec: 1213.14 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 16:40:06,519 epoch 6 - iter 267/894 - loss 0.02968316 - time (sec): 21.72 - samples/sec: 1191.18 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 16:40:13,613 epoch 6 - iter 356/894 - loss 0.03060611 - time (sec): 28.81 - samples/sec: 1188.95 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 16:40:21,175 epoch 6 - iter 445/894 - loss 0.02585728 - time (sec): 36.37 - samples/sec: 1182.02 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 16:40:28,273 epoch 6 - iter 534/894 - loss 0.02507947 - time (sec): 43.47 - samples/sec: 1177.02 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 16:40:35,216 epoch 6 - iter 623/894 - loss 0.02444122 - time (sec): 50.41 - samples/sec: 1173.00 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 16:40:42,613 epoch 6 - iter 712/894 - loss 0.02585665 - time (sec): 57.81 - samples/sec: 1180.03 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 16:40:49,698 epoch 6 - iter 801/894 - loss 0.02528235 - time (sec): 64.89 - samples/sec: 1180.84 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 16:40:57,157 epoch 6 - iter 890/894 - loss 0.02483238 - time (sec): 72.35 - samples/sec: 1191.34 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 16:40:57,481 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:40:57,481 EPOCH 6 done: loss 0.0247 - lr: 0.000022 |
|
2023-10-17 16:41:08,509 DEV : loss 0.23799937963485718 - f1-score (micro avg) 0.7797 |
|
2023-10-17 16:41:08,567 saving best model |
|
2023-10-17 16:41:09,970 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:41:17,048 epoch 7 - iter 89/894 - loss 0.01634402 - time (sec): 7.07 - samples/sec: 1230.27 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 16:41:23,790 epoch 7 - iter 178/894 - loss 0.01132312 - time (sec): 13.82 - samples/sec: 1189.82 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 16:41:30,778 epoch 7 - iter 267/894 - loss 0.01278808 - time (sec): 20.80 - samples/sec: 1202.61 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 16:41:37,945 epoch 7 - iter 356/894 - loss 0.01168855 - time (sec): 27.97 - samples/sec: 1219.29 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 16:41:44,979 epoch 7 - iter 445/894 - loss 0.01495033 - time (sec): 35.00 - samples/sec: 1220.15 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 16:41:52,075 epoch 7 - iter 534/894 - loss 0.01544760 - time (sec): 42.10 - samples/sec: 1226.15 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 16:41:59,111 epoch 7 - iter 623/894 - loss 0.01402268 - time (sec): 49.14 - samples/sec: 1224.39 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 16:42:06,783 epoch 7 - iter 712/894 - loss 0.01470202 - time (sec): 56.81 - samples/sec: 1220.18 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 16:42:13,897 epoch 7 - iter 801/894 - loss 0.01567910 - time (sec): 63.92 - samples/sec: 1222.31 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 16:42:20,822 epoch 7 - iter 890/894 - loss 0.01542654 - time (sec): 70.85 - samples/sec: 1215.22 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 16:42:21,144 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:42:21,144 EPOCH 7 done: loss 0.0155 - lr: 0.000017 |
|
2023-10-17 16:42:32,075 DEV : loss 0.2671242356300354 - f1-score (micro avg) 0.7727 |
|
2023-10-17 16:42:32,135 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:42:38,984 epoch 8 - iter 89/894 - loss 0.00642232 - time (sec): 6.85 - samples/sec: 1286.01 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 16:42:45,825 epoch 8 - iter 178/894 - loss 0.01547951 - time (sec): 13.69 - samples/sec: 1244.24 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 16:42:52,646 epoch 8 - iter 267/894 - loss 0.01183362 - time (sec): 20.51 - samples/sec: 1237.58 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 16:42:59,577 epoch 8 - iter 356/894 - loss 0.01316590 - time (sec): 27.44 - samples/sec: 1224.66 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 16:43:06,861 epoch 8 - iter 445/894 - loss 0.01148861 - time (sec): 34.72 - samples/sec: 1255.91 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 16:43:13,946 epoch 8 - iter 534/894 - loss 0.01218895 - time (sec): 41.81 - samples/sec: 1255.45 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 16:43:20,701 epoch 8 - iter 623/894 - loss 0.01256129 - time (sec): 48.56 - samples/sec: 1269.29 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 16:43:27,458 epoch 8 - iter 712/894 - loss 0.01163940 - time (sec): 55.32 - samples/sec: 1259.03 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 16:43:34,177 epoch 8 - iter 801/894 - loss 0.01183209 - time (sec): 62.04 - samples/sec: 1257.71 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 16:43:40,862 epoch 8 - iter 890/894 - loss 0.01144474 - time (sec): 68.72 - samples/sec: 1255.86 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 16:43:41,154 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:43:41,155 EPOCH 8 done: loss 0.0115 - lr: 0.000011 |
|
2023-10-17 16:43:52,982 DEV : loss 0.25983676314353943 - f1-score (micro avg) 0.7832 |
|
2023-10-17 16:43:53,059 saving best model |
|
2023-10-17 16:43:54,471 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:44:01,588 epoch 9 - iter 89/894 - loss 0.00948902 - time (sec): 7.11 - samples/sec: 1194.84 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 16:44:08,811 epoch 9 - iter 178/894 - loss 0.00788875 - time (sec): 14.34 - samples/sec: 1195.01 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 16:44:16,000 epoch 9 - iter 267/894 - loss 0.00936076 - time (sec): 21.53 - samples/sec: 1154.95 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 16:44:23,123 epoch 9 - iter 356/894 - loss 0.00847747 - time (sec): 28.65 - samples/sec: 1180.69 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 16:44:30,220 epoch 9 - iter 445/894 - loss 0.00842446 - time (sec): 35.75 - samples/sec: 1194.06 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 16:44:37,433 epoch 9 - iter 534/894 - loss 0.00877574 - time (sec): 42.96 - samples/sec: 1201.69 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 16:44:44,464 epoch 9 - iter 623/894 - loss 0.00788580 - time (sec): 49.99 - samples/sec: 1201.20 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 16:44:51,571 epoch 9 - iter 712/894 - loss 0.00767381 - time (sec): 57.10 - samples/sec: 1204.23 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 16:44:58,906 epoch 9 - iter 801/894 - loss 0.00698938 - time (sec): 64.43 - samples/sec: 1205.26 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 16:45:06,127 epoch 9 - iter 890/894 - loss 0.00663373 - time (sec): 71.65 - samples/sec: 1203.66 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 16:45:06,433 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:45:06,433 EPOCH 9 done: loss 0.0066 - lr: 0.000006 |
|
2023-10-17 16:45:18,138 DEV : loss 0.2585032284259796 - f1-score (micro avg) 0.7948 |
|
2023-10-17 16:45:18,200 saving best model |
|
2023-10-17 16:45:19,640 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:45:26,775 epoch 10 - iter 89/894 - loss 0.00559870 - time (sec): 7.13 - samples/sec: 1289.77 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 16:45:33,768 epoch 10 - iter 178/894 - loss 0.00539965 - time (sec): 14.12 - samples/sec: 1226.33 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 16:45:40,756 epoch 10 - iter 267/894 - loss 0.00432438 - time (sec): 21.11 - samples/sec: 1202.75 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 16:45:47,738 epoch 10 - iter 356/894 - loss 0.00382682 - time (sec): 28.09 - samples/sec: 1211.09 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 16:45:54,757 epoch 10 - iter 445/894 - loss 0.00379707 - time (sec): 35.11 - samples/sec: 1216.64 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 16:46:02,045 epoch 10 - iter 534/894 - loss 0.00433668 - time (sec): 42.40 - samples/sec: 1232.01 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 16:46:09,021 epoch 10 - iter 623/894 - loss 0.00409876 - time (sec): 49.38 - samples/sec: 1213.61 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 16:46:16,175 epoch 10 - iter 712/894 - loss 0.00359853 - time (sec): 56.53 - samples/sec: 1215.64 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 16:46:23,161 epoch 10 - iter 801/894 - loss 0.00367913 - time (sec): 63.52 - samples/sec: 1210.80 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 16:46:30,292 epoch 10 - iter 890/894 - loss 0.00340485 - time (sec): 70.65 - samples/sec: 1218.69 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 16:46:30,607 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:46:30,607 EPOCH 10 done: loss 0.0034 - lr: 0.000000 |
|
2023-10-17 16:46:42,254 DEV : loss 0.27636414766311646 - f1-score (micro avg) 0.7941 |
|
2023-10-17 16:46:42,844 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:46:42,846 Loading model from best epoch ... |
|
2023-10-17 16:46:45,143 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-17 16:46:51,229 |
|
Results: |
|
- F-score (micro) 0.7627 |
|
- F-score (macro) 0.6782 |
|
- Accuracy 0.6355 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.8344    0.8540    0.8441       596 |
|
        pers     0.7230    0.7838    0.7522       333 |
|
         org     0.5345    0.4697    0.5000       132 |
|
        prod     0.6909    0.5758    0.6281        66 |
|
        time     0.6600    0.6735    0.6667        49 |
|
|
|
   micro avg     0.7576    0.7679    0.7627      1176 |
|
   macro avg     0.6886    0.6713    0.6782      1176 |
|
weighted avg     0.7539    0.7679    0.7599      1176 |
|
|
|
2023-10-17 16:46:51,230 ---------------------------------------------------------------------------------------------------- |
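The averaged rows of the "By class" table can be cross-checked against the per-class rows. The sketch below is not part of the training run: it copies the rounded per-class values from the log and recomputes the macro, weighted, and micro F1 scores. The helper names (`macro_avg`, `weighted_avg`) are hypothetical, and because the inputs are already rounded to four decimals the recomputed values can differ from the logged ones in the last digit.

```python
# Per-class test scores copied from the "By class" table in the log:
# label -> (precision, recall, f1, support)
by_class = {
    "loc":  (0.8344, 0.8540, 0.8441, 596),
    "pers": (0.7230, 0.7838, 0.7522, 333),
    "org":  (0.5345, 0.4697, 0.5000, 132),
    "prod": (0.6909, 0.5758, 0.6281, 66),
    "time": (0.6600, 0.6735, 0.6667, 49),
}

def macro_avg(rows, idx):
    """Unweighted mean of one column across classes."""
    return sum(r[idx] for r in rows) / len(rows)

def weighted_avg(rows, idx):
    """Mean of one column across classes, weighted by support."""
    total = sum(r[3] for r in rows)
    return sum(r[idx] * r[3] for r in rows) / total

rows = list(by_class.values())
total_support = sum(r[3] for r in rows)

macro_f1 = macro_avg(rows, 2)
weighted_f1 = weighted_avg(rows, 2)

# Micro F1 is the harmonic mean of the logged micro precision and recall.
micro_p, micro_r = 0.7576, 0.7679
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support total: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
print(f"micro F1:      {micro_f1:.4f}")
```

The macro average treats the five entity types equally, so the weaker `org` and `prod` classes pull it below the micro average, which is dominated by the 596 `loc` and 333 `pers` spans.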
|
|
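The `lr` column in the per-iteration lines is consistent with a linear warmup followed by a linear decay: the log lists a LinearScheduler plugin with warmup_fraction 0.1, a peak learning rate of 5e-05, 894 iterations per epoch, and 10 epochs. The sketch below assumes that standard schedule shape; it is not Flair's implementation, and the function name `linear_warmup_decay_lr` is hypothetical.

```python
import math

def linear_warmup_decay_lr(step, peak_lr, total_steps, warmup_fraction):
    """Assumed schedule: ramp linearly from 0 to peak_lr over the warmup
    steps, then decay linearly back to 0 at total_steps."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Values taken from the log: 3575 training sentences, mini_batch_size 4,
# max_epochs 10, learning_rate 5e-05, warmup_fraction 0.1.
steps_per_epoch = math.ceil(3575 / 4)    # 894, matching "iter .../894"
total_steps = steps_per_epoch * 10       # 8940

for epoch, it in [(1, 89), (1, 890), (2, 890), (10, 890)]:
    step = (epoch - 1) * steps_per_epoch + it
    lr = linear_warmup_decay_lr(step, 5e-05, total_steps, 0.1)
    print(f"epoch {epoch} iter {it}: lr {lr:.6f}")
```

With these numbers the warmup covers exactly the first epoch (894 of 8940 steps), which is why the logged learning rate peaks at 0.000050 at the end of epoch 1 and falls to 0.000000 by the end of epoch 10.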