Commit f829746 by stefan-it
1 parent: 8c6d320

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7f70feeaf985eecaa9987919d70a75a47d639486cf6ba1ba28a5ba0912bd1434
+ size 440954373
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 19:22:53 0.0000 0.4233 0.1612 0.2026 0.6117 0.3044 0.1804
+ 2 19:30:06 0.0000 0.1775 0.1659 0.2386 0.5246 0.3280 0.1969
+ 3 19:37:11 0.0000 0.1278 0.2368 0.2510 0.4924 0.3325 0.2008
+ 4 19:44:17 0.0000 0.0896 0.3756 0.2362 0.5663 0.3333 0.2008
+ 5 19:51:30 0.0000 0.0657 0.3731 0.2761 0.5701 0.3721 0.2294
+ 6 19:58:44 0.0000 0.0444 0.3857 0.2899 0.6439 0.3998 0.2506
+ 7 20:05:42 0.0000 0.0312 0.4771 0.2636 0.6231 0.3705 0.2286
+ 8 20:12:48 0.0000 0.0217 0.4999 0.2510 0.6023 0.3543 0.2162
+ 9 20:20:05 0.0000 0.0155 0.5256 0.2680 0.6212 0.3744 0.2315
+ 10 20:27:22 0.0000 0.0092 0.5250 0.2668 0.6155 0.3723 0.2298
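The rows of loss.tsv can be inspected with a few lines of Python. The sketch below is illustrative only: it embeds the rows shown in this diff as a string (the real file is assumed to be tab-separated, which whitespace splitting also handles), and the helper name `parse_loss_table` is hypothetical.

```python
# Rows copied from the loss.tsv diff above; on disk the file is assumed to be tab-separated.
LOSS_TABLE = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 19:22:53 0.0000 0.4233 0.1612 0.2026 0.6117 0.3044 0.1804
2 19:30:06 0.0000 0.1775 0.1659 0.2386 0.5246 0.3280 0.1969
3 19:37:11 0.0000 0.1278 0.2368 0.2510 0.4924 0.3325 0.2008
4 19:44:17 0.0000 0.0896 0.3756 0.2362 0.5663 0.3333 0.2008
5 19:51:30 0.0000 0.0657 0.3731 0.2761 0.5701 0.3721 0.2294
6 19:58:44 0.0000 0.0444 0.3857 0.2899 0.6439 0.3998 0.2506
7 20:05:42 0.0000 0.0312 0.4771 0.2636 0.6231 0.3705 0.2286
8 20:12:48 0.0000 0.0217 0.4999 0.2510 0.6023 0.3543 0.2162
9 20:20:05 0.0000 0.0155 0.5256 0.2680 0.6212 0.3744 0.2315
10 20:27:22 0.0000 0.0092 0.5250 0.2668 0.6155 0.3723 0.2298
"""


def parse_loss_table(text):
    """Split a whitespace- or tab-separated table into one dict per row."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = parse_loss_table(LOSS_TABLE)

# Epoch with the highest dev F1 (the criterion named in training.log).
best = max(rows, key=lambda row: float(row["DEV_F1"]))
# Epoch with the lowest dev loss, for comparison.
lowest_dev_loss = min(rows, key=lambda row: float(row["DEV_LOSS"]))

print("best DEV_F1:", best["EPOCH"], best["DEV_F1"])
print("lowest DEV_LOSS:", lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
```

In these rows the highest DEV_F1 and the lowest DEV_LOSS fall on different epochs, so which epoch counts as best depends on the selection metric.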
runs/events.out.tfevents.1697570143.3ae7c61396a7.1160.10 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:13f0009ddf1c46450fc32547b399236982603bef8ac12fe5c56fab4858c0ac81
+ size 2923780
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,240 @@
+ 2023-10-17 19:15:43,844 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,846 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,846 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+ - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+ 2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,846 Train: 20847 sentences
+ 2023-10-17 19:15:43,846 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,846 Training Params:
+ 2023-10-17 19:15:43,847 - learning_rate: "3e-05"
+ 2023-10-17 19:15:43,847 - mini_batch_size: "4"
+ 2023-10-17 19:15:43,847 - max_epochs: "10"
+ 2023-10-17 19:15:43,847 - shuffle: "True"
+ 2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,847 Plugins:
+ 2023-10-17 19:15:43,847 - TensorboardLogger
+ 2023-10-17 19:15:43,847 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,847 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 19:15:43,847 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,847 Computation:
+ 2023-10-17 19:15:43,847 - compute on device: cuda:0
+ 2023-10-17 19:15:43,847 - embedding storage: none
+ 2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,848 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
+ 2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:15:43,848 Logging anything other than scalars to TensorBoard is currently not supported.
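The training parameters above (learning_rate 3e-05, mini_batch_size 4, max_epochs 10, LinearScheduler with warmup_fraction 0.1) imply a per-step learning rate that can be approximated as follows. This is a generic linear warmup and linear decay sketch, not Flair's LinearScheduler source; the function name `linear_schedule_lr` is hypothetical, and the step counts are derived from the 20847 training sentences reported in the log.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (generic sketch)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0.0, (total_steps - step) / (total_steps - warmup_steps))
    return peak_lr * remaining


# 20847 training sentences in mini-batches of 4 give the "/5212" iteration count in the log.
steps_per_epoch = math.ceil(20847 / 4)
total_steps = steps_per_epoch * 10  # max_epochs = 10

for epoch in range(1, 11):
    step = epoch * steps_per_epoch
    lr = linear_schedule_lr(step, total_steps, 3e-5, 0.1)
    print(f"end of epoch {epoch:2d}: lr {lr:.6f}")
```

Under these assumptions the warmup spans exactly the first epoch, which would explain why the logged lr rises to 0.000030 during epoch 1 and falls afterwards.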
+ 2023-10-17 19:16:25,838 epoch 1 - iter 521/5212 - loss 1.79081099 - time (sec): 41.99 - samples/sec: 897.70 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 19:17:08,595 epoch 1 - iter 1042/5212 - loss 1.11359267 - time (sec): 84.75 - samples/sec: 881.25 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 19:17:49,971 epoch 1 - iter 1563/5212 - loss 0.86063766 - time (sec): 126.12 - samples/sec: 874.30 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 19:18:32,756 epoch 1 - iter 2084/5212 - loss 0.72254731 - time (sec): 168.91 - samples/sec: 862.33 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 19:19:14,275 epoch 1 - iter 2605/5212 - loss 0.63113368 - time (sec): 210.42 - samples/sec: 854.34 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 19:19:55,829 epoch 1 - iter 3126/5212 - loss 0.56274254 - time (sec): 251.98 - samples/sec: 861.56 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 19:20:37,475 epoch 1 - iter 3647/5212 - loss 0.51180163 - time (sec): 293.63 - samples/sec: 863.11 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 19:21:20,549 epoch 1 - iter 4168/5212 - loss 0.47801506 - time (sec): 336.70 - samples/sec: 861.50 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 19:22:02,701 epoch 1 - iter 4689/5212 - loss 0.44903165 - time (sec): 378.85 - samples/sec: 858.92 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 19:22:45,402 epoch 1 - iter 5210/5212 - loss 0.42328665 - time (sec): 421.55 - samples/sec: 871.56 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 19:22:45,568 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:22:45,568 EPOCH 1 done: loss 0.4233 - lr: 0.000030
+ 2023-10-17 19:22:53,561 DEV : loss 0.16118471324443817 - f1-score (micro avg) 0.3044
+ 2023-10-17 19:22:53,624 saving best model
+ 2023-10-17 19:22:54,240 ----------------------------------------------------------------------------------------------------
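The per-iteration lines follow a fixed pattern, so they can be turned into records for plotting or comparison. The regular expression below is a sketch written against the lines visible in this log; the names `ITER_LINE` and `parse_iter_line` are hypothetical, and other Flair versions may format these lines differently.

```python
import re

# Pattern for lines such as:
# "... epoch 1 - iter 5210/5212 - loss 0.42328665 - time (sec): 421.55 - samples/sec: 871.56 - lr: 0.000030 - ..."
ITER_LINE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+) - "
    r"samples/sec: (?P<rate>[\d.]+) - lr: (?P<lr>[\d.]+)"
)


def parse_iter_line(line):
    """Return a dict of fields for an iteration line, or None for any other line."""
    match = ITER_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "iter": int(match["iter"]),
        "total": int(match["total"]),
        "loss": float(match["loss"]),
        "time_sec": float(match["time"]),
        "samples_per_sec": float(match["rate"]),
        "lr": float(match["lr"]),
    }


sample = (
    "2023-10-17 19:22:45,402 epoch 1 - iter 5210/5212 - loss 0.42328665 - "
    "time (sec): 421.55 - samples/sec: 871.56 - lr: 0.000030 - momentum: 0.000000"
)
parsed = parse_iter_line(sample)
summary = parse_iter_line("2023-10-17 19:22:45,568 EPOCH 1 done: loss 0.4233 - lr: 0.000030")
print(parsed)
print(summary)
```

The pattern is case-sensitive, so the "EPOCH 1 done" summary line is skipped and only iteration lines produce records.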
+ 2023-10-17 19:23:37,950 epoch 2 - iter 521/5212 - loss 0.19745124 - time (sec): 43.71 - samples/sec: 841.63 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 19:24:20,587 epoch 2 - iter 1042/5212 - loss 0.18718357 - time (sec): 86.34 - samples/sec: 886.43 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 19:25:01,266 epoch 2 - iter 1563/5212 - loss 0.18426743 - time (sec): 127.02 - samples/sec: 883.54 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 19:25:43,672 epoch 2 - iter 2084/5212 - loss 0.18503011 - time (sec): 169.43 - samples/sec: 873.71 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 19:26:26,956 epoch 2 - iter 2605/5212 - loss 0.18746087 - time (sec): 212.71 - samples/sec: 860.20 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 19:27:10,029 epoch 2 - iter 3126/5212 - loss 0.18636111 - time (sec): 255.79 - samples/sec: 859.99 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 19:27:50,405 epoch 2 - iter 3647/5212 - loss 0.18553488 - time (sec): 296.16 - samples/sec: 859.53 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 19:28:31,258 epoch 2 - iter 4168/5212 - loss 0.18074757 - time (sec): 337.02 - samples/sec: 866.96 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 19:29:12,346 epoch 2 - iter 4689/5212 - loss 0.17930858 - time (sec): 378.10 - samples/sec: 871.71 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 19:29:53,662 epoch 2 - iter 5210/5212 - loss 0.17753408 - time (sec): 419.42 - samples/sec: 875.95 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 19:29:53,812 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:29:53,812 EPOCH 2 done: loss 0.1775 - lr: 0.000027
+ 2023-10-17 19:30:05,947 DEV : loss 0.1658860296010971 - f1-score (micro avg) 0.328
+ 2023-10-17 19:30:06,006 saving best model
+ 2023-10-17 19:30:07,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:30:48,361 epoch 3 - iter 521/5212 - loss 0.12724045 - time (sec): 40.93 - samples/sec: 873.85 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 19:31:30,342 epoch 3 - iter 1042/5212 - loss 0.12659364 - time (sec): 82.91 - samples/sec: 885.08 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 19:32:11,449 epoch 3 - iter 1563/5212 - loss 0.13086892 - time (sec): 124.02 - samples/sec: 885.97 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 19:32:51,804 epoch 3 - iter 2084/5212 - loss 0.13601022 - time (sec): 164.38 - samples/sec: 885.41 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 19:33:33,570 epoch 3 - iter 2605/5212 - loss 0.13335118 - time (sec): 206.14 - samples/sec: 884.09 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 19:34:14,705 epoch 3 - iter 3126/5212 - loss 0.13049815 - time (sec): 247.28 - samples/sec: 895.14 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 19:34:54,896 epoch 3 - iter 3647/5212 - loss 0.13086755 - time (sec): 287.47 - samples/sec: 896.68 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 19:35:36,596 epoch 3 - iter 4168/5212 - loss 0.12984620 - time (sec): 329.17 - samples/sec: 897.45 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 19:36:17,637 epoch 3 - iter 4689/5212 - loss 0.12848690 - time (sec): 370.21 - samples/sec: 887.38 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 19:37:00,276 epoch 3 - iter 5210/5212 - loss 0.12780510 - time (sec): 412.85 - samples/sec: 889.84 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 19:37:00,434 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:37:00,434 EPOCH 3 done: loss 0.1278 - lr: 0.000023
+ 2023-10-17 19:37:11,317 DEV : loss 0.2367718517780304 - f1-score (micro avg) 0.3325
+ 2023-10-17 19:37:11,372 saving best model
+ 2023-10-17 19:37:13,638 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:37:55,294 epoch 4 - iter 521/5212 - loss 0.08561438 - time (sec): 41.65 - samples/sec: 889.99 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 19:38:37,303 epoch 4 - iter 1042/5212 - loss 0.08487661 - time (sec): 83.66 - samples/sec: 902.31 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 19:39:17,623 epoch 4 - iter 1563/5212 - loss 0.08733029 - time (sec): 123.98 - samples/sec: 892.46 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 19:39:57,086 epoch 4 - iter 2084/5212 - loss 0.09068160 - time (sec): 163.44 - samples/sec: 900.17 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 19:40:38,544 epoch 4 - iter 2605/5212 - loss 0.08826575 - time (sec): 204.90 - samples/sec: 895.06 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 19:41:20,337 epoch 4 - iter 3126/5212 - loss 0.08927590 - time (sec): 246.70 - samples/sec: 890.13 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 19:42:02,639 epoch 4 - iter 3647/5212 - loss 0.09103052 - time (sec): 289.00 - samples/sec: 885.89 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 19:42:44,152 epoch 4 - iter 4168/5212 - loss 0.09120563 - time (sec): 330.51 - samples/sec: 889.32 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 19:43:24,805 epoch 4 - iter 4689/5212 - loss 0.09067149 - time (sec): 371.16 - samples/sec: 887.48 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 19:44:06,153 epoch 4 - iter 5210/5212 - loss 0.08963326 - time (sec): 412.51 - samples/sec: 890.27 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 19:44:06,304 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:44:06,304 EPOCH 4 done: loss 0.0896 - lr: 0.000020
+ 2023-10-17 19:44:17,266 DEV : loss 0.37559980154037476 - f1-score (micro avg) 0.3333
+ 2023-10-17 19:44:17,320 saving best model
+ 2023-10-17 19:44:18,755 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:45:00,565 epoch 5 - iter 521/5212 - loss 0.07091184 - time (sec): 41.81 - samples/sec: 870.41 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 19:45:41,701 epoch 5 - iter 1042/5212 - loss 0.06709683 - time (sec): 82.94 - samples/sec: 876.92 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 19:46:25,201 epoch 5 - iter 1563/5212 - loss 0.06271210 - time (sec): 126.44 - samples/sec: 878.30 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 19:47:08,265 epoch 5 - iter 2084/5212 - loss 0.06183598 - time (sec): 169.51 - samples/sec: 876.76 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 19:47:51,557 epoch 5 - iter 2605/5212 - loss 0.06620122 - time (sec): 212.80 - samples/sec: 873.44 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 19:48:32,575 epoch 5 - iter 3126/5212 - loss 0.06541861 - time (sec): 253.82 - samples/sec: 879.21 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 19:49:13,264 epoch 5 - iter 3647/5212 - loss 0.06535621 - time (sec): 294.51 - samples/sec: 881.37 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 19:49:55,190 epoch 5 - iter 4168/5212 - loss 0.06634610 - time (sec): 336.43 - samples/sec: 874.02 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 19:50:36,970 epoch 5 - iter 4689/5212 - loss 0.06558366 - time (sec): 378.21 - samples/sec: 878.27 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 19:51:19,264 epoch 5 - iter 5210/5212 - loss 0.06576385 - time (sec): 420.50 - samples/sec: 873.25 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 19:51:19,446 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:51:19,447 EPOCH 5 done: loss 0.0657 - lr: 0.000017
+ 2023-10-17 19:51:30,543 DEV : loss 0.37306851148605347 - f1-score (micro avg) 0.3721
+ 2023-10-17 19:51:30,599 saving best model
+ 2023-10-17 19:51:32,021 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:52:13,905 epoch 6 - iter 521/5212 - loss 0.04071800 - time (sec): 41.88 - samples/sec: 899.21 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 19:52:56,216 epoch 6 - iter 1042/5212 - loss 0.04067970 - time (sec): 84.19 - samples/sec: 851.43 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 19:53:38,996 epoch 6 - iter 1563/5212 - loss 0.04392675 - time (sec): 126.97 - samples/sec: 843.09 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 19:54:20,689 epoch 6 - iter 2084/5212 - loss 0.04624917 - time (sec): 168.66 - samples/sec: 841.75 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 19:55:02,737 epoch 6 - iter 2605/5212 - loss 0.04664802 - time (sec): 210.71 - samples/sec: 842.84 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 19:55:43,236 epoch 6 - iter 3126/5212 - loss 0.04611913 - time (sec): 251.21 - samples/sec: 856.24 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 19:56:25,765 epoch 6 - iter 3647/5212 - loss 0.04513070 - time (sec): 293.74 - samples/sec: 857.27 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 19:57:08,519 epoch 6 - iter 4168/5212 - loss 0.04430926 - time (sec): 336.49 - samples/sec: 867.81 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 19:57:51,149 epoch 6 - iter 4689/5212 - loss 0.04422683 - time (sec): 379.12 - samples/sec: 876.18 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 19:58:32,733 epoch 6 - iter 5210/5212 - loss 0.04435344 - time (sec): 420.71 - samples/sec: 873.18 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 19:58:32,886 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:58:32,887 EPOCH 6 done: loss 0.0444 - lr: 0.000013
+ 2023-10-17 19:58:44,404 DEV : loss 0.38573741912841797 - f1-score (micro avg) 0.3998
+ 2023-10-17 19:58:44,465 saving best model
+ 2023-10-17 19:58:45,880 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:59:27,983 epoch 7 - iter 521/5212 - loss 0.03311573 - time (sec): 42.10 - samples/sec: 920.02 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:00:08,876 epoch 7 - iter 1042/5212 - loss 0.02677280 - time (sec): 82.99 - samples/sec: 903.20 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:00:50,108 epoch 7 - iter 1563/5212 - loss 0.03149182 - time (sec): 124.22 - samples/sec: 906.02 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:01:30,385 epoch 7 - iter 2084/5212 - loss 0.03097008 - time (sec): 164.50 - samples/sec: 895.57 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:02:12,354 epoch 7 - iter 2605/5212 - loss 0.03138727 - time (sec): 206.47 - samples/sec: 889.41 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:02:50,517 epoch 7 - iter 3126/5212 - loss 0.03238816 - time (sec): 244.63 - samples/sec: 907.86 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:03:28,571 epoch 7 - iter 3647/5212 - loss 0.03240198 - time (sec): 282.69 - samples/sec: 910.52 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:04:09,623 epoch 7 - iter 4168/5212 - loss 0.03231063 - time (sec): 323.74 - samples/sec: 903.97 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:04:50,358 epoch 7 - iter 4689/5212 - loss 0.03217252 - time (sec): 364.47 - samples/sec: 902.06 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:05:30,790 epoch 7 - iter 5210/5212 - loss 0.03119367 - time (sec): 404.90 - samples/sec: 907.24 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:05:30,938 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:30,939 EPOCH 7 done: loss 0.0312 - lr: 0.000010
+ 2023-10-17 20:05:42,794 DEV : loss 0.4771367907524109 - f1-score (micro avg) 0.3705
+ 2023-10-17 20:05:42,853 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:06:23,460 epoch 8 - iter 521/5212 - loss 0.02998807 - time (sec): 40.61 - samples/sec: 888.36 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:07:05,326 epoch 8 - iter 1042/5212 - loss 0.02453114 - time (sec): 82.47 - samples/sec: 879.19 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:07:46,564 epoch 8 - iter 1563/5212 - loss 0.02200622 - time (sec): 123.71 - samples/sec: 889.00 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:08:28,073 epoch 8 - iter 2084/5212 - loss 0.02307900 - time (sec): 165.22 - samples/sec: 888.53 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:09:09,714 epoch 8 - iter 2605/5212 - loss 0.02297405 - time (sec): 206.86 - samples/sec: 888.20 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:09:51,048 epoch 8 - iter 3126/5212 - loss 0.02283079 - time (sec): 248.19 - samples/sec: 883.30 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:10:33,437 epoch 8 - iter 3647/5212 - loss 0.02212019 - time (sec): 290.58 - samples/sec: 882.42 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:11:13,823 epoch 8 - iter 4168/5212 - loss 0.02167924 - time (sec): 330.97 - samples/sec: 883.31 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:11:54,496 epoch 8 - iter 4689/5212 - loss 0.02165052 - time (sec): 371.64 - samples/sec: 884.04 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:12:36,344 epoch 8 - iter 5210/5212 - loss 0.02175733 - time (sec): 413.49 - samples/sec: 888.16 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:12:36,512 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:12:36,512 EPOCH 8 done: loss 0.0217 - lr: 0.000007
+ 2023-10-17 20:12:48,699 DEV : loss 0.4998532235622406 - f1-score (micro avg) 0.3543
+ 2023-10-17 20:12:48,755 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:13:31,960 epoch 9 - iter 521/5212 - loss 0.00863824 - time (sec): 43.20 - samples/sec: 948.05 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:14:14,947 epoch 9 - iter 1042/5212 - loss 0.01171617 - time (sec): 86.19 - samples/sec: 908.03 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:14:56,079 epoch 9 - iter 1563/5212 - loss 0.01668874 - time (sec): 127.32 - samples/sec: 891.92 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:15:37,604 epoch 9 - iter 2084/5212 - loss 0.01665217 - time (sec): 168.85 - samples/sec: 894.47 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:16:21,106 epoch 9 - iter 2605/5212 - loss 0.01635385 - time (sec): 212.35 - samples/sec: 882.36 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:17:04,978 epoch 9 - iter 3126/5212 - loss 0.01550677 - time (sec): 256.22 - samples/sec: 872.71 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:17:47,321 epoch 9 - iter 3647/5212 - loss 0.01517082 - time (sec): 298.56 - samples/sec: 868.72 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:18:29,546 epoch 9 - iter 4168/5212 - loss 0.01578016 - time (sec): 340.79 - samples/sec: 872.86 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:19:12,396 epoch 9 - iter 4689/5212 - loss 0.01560147 - time (sec): 383.64 - samples/sec: 872.79 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:19:53,494 epoch 9 - iter 5210/5212 - loss 0.01545797 - time (sec): 424.74 - samples/sec: 864.75 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:19:53,648 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:19:53,648 EPOCH 9 done: loss 0.0155 - lr: 0.000003
+ 2023-10-17 20:20:05,733 DEV : loss 0.5256008505821228 - f1-score (micro avg) 0.3744
+ 2023-10-17 20:20:05,821 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:20:49,600 epoch 10 - iter 521/5212 - loss 0.00845977 - time (sec): 43.78 - samples/sec: 868.52 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:21:32,317 epoch 10 - iter 1042/5212 - loss 0.00922324 - time (sec): 86.49 - samples/sec: 897.78 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:22:12,750 epoch 10 - iter 1563/5212 - loss 0.00972860 - time (sec): 126.93 - samples/sec: 874.26 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:22:54,760 epoch 10 - iter 2084/5212 - loss 0.00952950 - time (sec): 168.94 - samples/sec: 881.10 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:23:39,004 epoch 10 - iter 2605/5212 - loss 0.00943308 - time (sec): 213.18 - samples/sec: 860.79 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:24:21,919 epoch 10 - iter 3126/5212 - loss 0.00987377 - time (sec): 256.09 - samples/sec: 859.62 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:25:08,498 epoch 10 - iter 3647/5212 - loss 0.00962367 - time (sec): 302.67 - samples/sec: 857.97 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:25:49,520 epoch 10 - iter 4168/5212 - loss 0.00965467 - time (sec): 343.70 - samples/sec: 862.47 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:26:29,257 epoch 10 - iter 4689/5212 - loss 0.00958044 - time (sec): 383.43 - samples/sec: 861.07 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 20:27:09,922 epoch 10 - iter 5210/5212 - loss 0.00919733 - time (sec): 424.10 - samples/sec: 866.15 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 20:27:10,080 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:27:10,080 EPOCH 10 done: loss 0.0092 - lr: 0.000000
+ 2023-10-17 20:27:22,342 DEV : loss 0.5250148177146912 - f1-score (micro avg) 0.3723
+ 2023-10-17 20:27:22,998 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:27:23,000 Loading model from best epoch ...
+ 2023-10-17 20:27:25,469 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
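The 17 tags listed above are the outside tag O plus four BIOES-style prefixes (S, B, E, I) for each of the four entity types, which is consistent with out_features=17 in the tagger's final linear layer. A small sketch that rebuilds the list in the logged order (the variable names are illustrative, not Flair identifiers):

```python
# Entity types and span prefixes as they appear in the logged tag dictionary.
ENTITY_TYPES = ["LOC", "PER", "ORG", "HumanProd"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" marks tokens outside any entity; every type gets one tag per prefix.
tags = ["O"] + [f"{prefix}-{entity}" for entity in ENTITY_TYPES for prefix in PREFIXES]

print(len(tags))
print(", ".join(tags))
```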
+ 2023-10-17 20:27:44,580
+ Results:
+ - F-score (micro) 0.4828
+ - F-score (macro) 0.3302
+ - Accuracy 0.3225
+
+ By class:
+               precision    recall  f1-score   support
+
+          LOC     0.4639    0.6293    0.5341      1214
+          PER     0.4487    0.4926    0.4696       808
+          ORG     0.3199    0.3144    0.3171       353
+    HumanProd     0.0000    0.0000    0.0000        15
+
+    micro avg     0.4416    0.5326    0.4828      2390
+    macro avg     0.3081    0.3591    0.3302      2390
+ weighted avg     0.4346    0.5326    0.4769      2390
+
+ 2023-10-17 20:27:44,580 ----------------------------------------------------------------------------------------------------
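The aggregate rows of the final report can be re-derived from the per-class rows, which is a useful sanity check when copying numbers out of a log. The sketch below uses only the values printed above; the dictionary `BY_CLASS` and the helper `f1` are names introduced here for illustration. Because the inputs are already rounded to four decimals, agreement is expected only to about four decimals.

```python
# (precision, recall, f1, support) per class, copied from the report above.
BY_CLASS = {
    "LOC": (0.4639, 0.6293, 0.5341, 1214),
    "PER": (0.4487, 0.4926, 0.4696, 808),
    "ORG": (0.3199, 0.3144, 0.3171, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, _, _, support in BY_CLASS.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(score for _, _, score, _ in BY_CLASS.values()) / len(BY_CLASS)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(score * support for _, _, score, support in BY_CLASS.values()) / total_support

# Micro F1 recomputed from the reported micro-averaged precision and recall.
micro_f1 = f1(0.4416, 0.5326)

print(f"support {total_support}")
print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```

The HumanProd class, with 15 support and an F1 of 0, pulls the macro average well below the micro average while barely affecting the weighted one.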