Upload folder using huggingface_hub

- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697570143.3ae7c61396a7.1160.10 +3 -0
- test.tsv +0 -0
- training.log +240 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7f70feeaf985eecaa9987919d70a75a47d639486cf6ba1ba28a5ba0912bd1434
+size 440954373
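The checkpoint itself is stored via Git LFS, so the diff above shows only the three-line pointer file, not the 440 MB weights. As a minimal sketch of how such a pointer can be read (the helper `parse_lfs_pointer` is my own illustration, not part of any LFS tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content shown in the diff above
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7f70feeaf985eecaa9987919d70a75a47d639486cf6ba1ba28a5ba0912bd1434
size 440954373
"""

info = parse_lfs_pointer(pointer)
algo, digest = info["oid"].split(":", 1)
print(algo)               # sha256
print(int(info["size"]))  # 440954373 (~420 MiB)
```

Downloading the repository with `huggingface_hub` or `git lfs pull` resolves this pointer to the actual weights.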
dev.tsv ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 19:22:53 0.0000 0.4233 0.1612 0.2026 0.6117 0.3044 0.1804
+2 19:30:06 0.0000 0.1775 0.1659 0.2386 0.5246 0.3280 0.1969
+3 19:37:11 0.0000 0.1278 0.2368 0.2510 0.4924 0.3325 0.2008
+4 19:44:17 0.0000 0.0896 0.3756 0.2362 0.5663 0.3333 0.2008
+5 19:51:30 0.0000 0.0657 0.3731 0.2761 0.5701 0.3721 0.2294
+6 19:58:44 0.0000 0.0444 0.3857 0.2899 0.6439 0.3998 0.2506
+7 20:05:42 0.0000 0.0312 0.4771 0.2636 0.6231 0.3705 0.2286
+8 20:12:48 0.0000 0.0217 0.4999 0.2510 0.6023 0.3543 0.2162
+9 20:20:05 0.0000 0.0155 0.5256 0.2680 0.6212 0.3744 0.2315
+10 20:27:22 0.0000 0.0092 0.5250 0.2668 0.6155 0.3723 0.2298
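loss.tsv is the per-epoch metrics table Flair writes during training, and it makes the overfitting pattern easy to read off: train loss falls monotonically while dev loss rises after epoch 2, and dev F1 peaks at epoch 6. A small sketch of selecting the best epoch by DEV_F1 (values copied from the table above):

```python
# (epoch, train_loss, dev_loss, dev_f1) copied from the loss.tsv rows above
rows = [
    (1, 0.4233, 0.1612, 0.3044),
    (2, 0.1775, 0.1659, 0.3280),
    (3, 0.1278, 0.2368, 0.3325),
    (4, 0.0896, 0.3756, 0.3333),
    (5, 0.0657, 0.3731, 0.3721),
    (6, 0.0444, 0.3857, 0.3998),
    (7, 0.0312, 0.4771, 0.3705),
    (8, 0.0217, 0.4999, 0.3543),
    (9, 0.0155, 0.5256, 0.3744),
    (10, 0.0092, 0.5250, 0.3723),
]

# Pick the epoch with the highest dev F1 -- the criterion under which
# training.log reports "saving best model"
best = max(rows, key=lambda r: r[3])
print(best[0], best[3])  # 6 0.3998
```

This matches training.log below: the last "saving best model" message appears after epoch 6, so best-model.pt holds the epoch-6 weights.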
runs/events.out.tfevents.1697570143.3ae7c61396a7.1160.10 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:13f0009ddf1c46450fc32547b399236982603bef8ac12fe5c56fab4858c0ac81
+size 2923780
test.tsv ADDED
The diff for this file is too large to render. See raw diff.
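training.log below reports a LinearScheduler plugin with warmup_fraction "0.1", peak learning rate 3e-05, and 10 epochs of 5212 iterations. As a minimal sketch of such a schedule (my own illustration of linear warmup followed by linear decay, not Flair's implementation):

```python
def linear_schedule(step: int, total_steps: int, peak_lr: float,
                    warmup_fraction: float) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back toward 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if warmup_steps and step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 5212 * 10   # 10 epochs x 5212 iterations, as in training.log below
peak = 3e-05
lrs = [linear_schedule(s, total, peak, 0.1) for s in range(total)]
# lr climbs during the first 10% of steps, peaks at 3e-05, then decays to ~0
```

Under these assumptions the sketch reproduces the lr column in the log: at iter 521 of epoch 1 it gives about 0.000003, at iter 1042 about 0.000006, and by the end of epoch 10 the lr has decayed to effectively zero.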
training.log ADDED
@@ -0,0 +1,240 @@
+2023-10-17 19:15:43,844 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,846 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=17, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,846 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+ - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,846 Train: 20847 sentences
+2023-10-17 19:15:43,846 (train_with_dev=False, train_with_test=False)
+2023-10-17 19:15:43,846 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,846 Training Params:
+2023-10-17 19:15:43,847 - learning_rate: "3e-05"
+2023-10-17 19:15:43,847 - mini_batch_size: "4"
+2023-10-17 19:15:43,847 - max_epochs: "10"
+2023-10-17 19:15:43,847 - shuffle: "True"
+2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,847 Plugins:
+2023-10-17 19:15:43,847 - TensorboardLogger
+2023-10-17 19:15:43,847 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,847 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 19:15:43,847 - metric: "('micro avg', 'f1-score')"
+2023-10-17 19:15:43,847 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,847 Computation:
+2023-10-17 19:15:43,847 - compute on device: cuda:0
+2023-10-17 19:15:43,847 - embedding storage: none
+2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,848 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
+2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,848 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:15:43,848 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-17 19:16:25,838 epoch 1 - iter 521/5212 - loss 1.79081099 - time (sec): 41.99 - samples/sec: 897.70 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 19:17:08,595 epoch 1 - iter 1042/5212 - loss 1.11359267 - time (sec): 84.75 - samples/sec: 881.25 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 19:17:49,971 epoch 1 - iter 1563/5212 - loss 0.86063766 - time (sec): 126.12 - samples/sec: 874.30 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 19:18:32,756 epoch 1 - iter 2084/5212 - loss 0.72254731 - time (sec): 168.91 - samples/sec: 862.33 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 19:19:14,275 epoch 1 - iter 2605/5212 - loss 0.63113368 - time (sec): 210.42 - samples/sec: 854.34 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 19:19:55,829 epoch 1 - iter 3126/5212 - loss 0.56274254 - time (sec): 251.98 - samples/sec: 861.56 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 19:20:37,475 epoch 1 - iter 3647/5212 - loss 0.51180163 - time (sec): 293.63 - samples/sec: 863.11 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 19:21:20,549 epoch 1 - iter 4168/5212 - loss 0.47801506 - time (sec): 336.70 - samples/sec: 861.50 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 19:22:02,701 epoch 1 - iter 4689/5212 - loss 0.44903165 - time (sec): 378.85 - samples/sec: 858.92 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 19:22:45,402 epoch 1 - iter 5210/5212 - loss 0.42328665 - time (sec): 421.55 - samples/sec: 871.56 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 19:22:45,568 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:22:45,568 EPOCH 1 done: loss 0.4233 - lr: 0.000030
+2023-10-17 19:22:53,561 DEV : loss 0.16118471324443817 - f1-score (micro avg) 0.3044
+2023-10-17 19:22:53,624 saving best model
+2023-10-17 19:22:54,240 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:23:37,950 epoch 2 - iter 521/5212 - loss 0.19745124 - time (sec): 43.71 - samples/sec: 841.63 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 19:24:20,587 epoch 2 - iter 1042/5212 - loss 0.18718357 - time (sec): 86.34 - samples/sec: 886.43 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 19:25:01,266 epoch 2 - iter 1563/5212 - loss 0.18426743 - time (sec): 127.02 - samples/sec: 883.54 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 19:25:43,672 epoch 2 - iter 2084/5212 - loss 0.18503011 - time (sec): 169.43 - samples/sec: 873.71 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 19:26:26,956 epoch 2 - iter 2605/5212 - loss 0.18746087 - time (sec): 212.71 - samples/sec: 860.20 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 19:27:10,029 epoch 2 - iter 3126/5212 - loss 0.18636111 - time (sec): 255.79 - samples/sec: 859.99 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 19:27:50,405 epoch 2 - iter 3647/5212 - loss 0.18553488 - time (sec): 296.16 - samples/sec: 859.53 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 19:28:31,258 epoch 2 - iter 4168/5212 - loss 0.18074757 - time (sec): 337.02 - samples/sec: 866.96 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 19:29:12,346 epoch 2 - iter 4689/5212 - loss 0.17930858 - time (sec): 378.10 - samples/sec: 871.71 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 19:29:53,662 epoch 2 - iter 5210/5212 - loss 0.17753408 - time (sec): 419.42 - samples/sec: 875.95 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 19:29:53,812 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:29:53,812 EPOCH 2 done: loss 0.1775 - lr: 0.000027
+2023-10-17 19:30:05,947 DEV : loss 0.1658860296010971 - f1-score (micro avg) 0.328
+2023-10-17 19:30:06,006 saving best model
+2023-10-17 19:30:07,423 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:30:48,361 epoch 3 - iter 521/5212 - loss 0.12724045 - time (sec): 40.93 - samples/sec: 873.85 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 19:31:30,342 epoch 3 - iter 1042/5212 - loss 0.12659364 - time (sec): 82.91 - samples/sec: 885.08 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 19:32:11,449 epoch 3 - iter 1563/5212 - loss 0.13086892 - time (sec): 124.02 - samples/sec: 885.97 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 19:32:51,804 epoch 3 - iter 2084/5212 - loss 0.13601022 - time (sec): 164.38 - samples/sec: 885.41 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 19:33:33,570 epoch 3 - iter 2605/5212 - loss 0.13335118 - time (sec): 206.14 - samples/sec: 884.09 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 19:34:14,705 epoch 3 - iter 3126/5212 - loss 0.13049815 - time (sec): 247.28 - samples/sec: 895.14 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 19:34:54,896 epoch 3 - iter 3647/5212 - loss 0.13086755 - time (sec): 287.47 - samples/sec: 896.68 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 19:35:36,596 epoch 3 - iter 4168/5212 - loss 0.12984620 - time (sec): 329.17 - samples/sec: 897.45 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 19:36:17,637 epoch 3 - iter 4689/5212 - loss 0.12848690 - time (sec): 370.21 - samples/sec: 887.38 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 19:37:00,276 epoch 3 - iter 5210/5212 - loss 0.12780510 - time (sec): 412.85 - samples/sec: 889.84 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 19:37:00,434 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:37:00,434 EPOCH 3 done: loss 0.1278 - lr: 0.000023
+2023-10-17 19:37:11,317 DEV : loss 0.2367718517780304 - f1-score (micro avg) 0.3325
+2023-10-17 19:37:11,372 saving best model
+2023-10-17 19:37:13,638 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:37:55,294 epoch 4 - iter 521/5212 - loss 0.08561438 - time (sec): 41.65 - samples/sec: 889.99 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 19:38:37,303 epoch 4 - iter 1042/5212 - loss 0.08487661 - time (sec): 83.66 - samples/sec: 902.31 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 19:39:17,623 epoch 4 - iter 1563/5212 - loss 0.08733029 - time (sec): 123.98 - samples/sec: 892.46 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 19:39:57,086 epoch 4 - iter 2084/5212 - loss 0.09068160 - time (sec): 163.44 - samples/sec: 900.17 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 19:40:38,544 epoch 4 - iter 2605/5212 - loss 0.08826575 - time (sec): 204.90 - samples/sec: 895.06 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 19:41:20,337 epoch 4 - iter 3126/5212 - loss 0.08927590 - time (sec): 246.70 - samples/sec: 890.13 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 19:42:02,639 epoch 4 - iter 3647/5212 - loss 0.09103052 - time (sec): 289.00 - samples/sec: 885.89 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 19:42:44,152 epoch 4 - iter 4168/5212 - loss 0.09120563 - time (sec): 330.51 - samples/sec: 889.32 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 19:43:24,805 epoch 4 - iter 4689/5212 - loss 0.09067149 - time (sec): 371.16 - samples/sec: 887.48 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 19:44:06,153 epoch 4 - iter 5210/5212 - loss 0.08963326 - time (sec): 412.51 - samples/sec: 890.27 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 19:44:06,304 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:44:06,304 EPOCH 4 done: loss 0.0896 - lr: 0.000020
+2023-10-17 19:44:17,266 DEV : loss 0.37559980154037476 - f1-score (micro avg) 0.3333
+2023-10-17 19:44:17,320 saving best model
+2023-10-17 19:44:18,755 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:45:00,565 epoch 5 - iter 521/5212 - loss 0.07091184 - time (sec): 41.81 - samples/sec: 870.41 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 19:45:41,701 epoch 5 - iter 1042/5212 - loss 0.06709683 - time (sec): 82.94 - samples/sec: 876.92 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 19:46:25,201 epoch 5 - iter 1563/5212 - loss 0.06271210 - time (sec): 126.44 - samples/sec: 878.30 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 19:47:08,265 epoch 5 - iter 2084/5212 - loss 0.06183598 - time (sec): 169.51 - samples/sec: 876.76 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 19:47:51,557 epoch 5 - iter 2605/5212 - loss 0.06620122 - time (sec): 212.80 - samples/sec: 873.44 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 19:48:32,575 epoch 5 - iter 3126/5212 - loss 0.06541861 - time (sec): 253.82 - samples/sec: 879.21 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 19:49:13,264 epoch 5 - iter 3647/5212 - loss 0.06535621 - time (sec): 294.51 - samples/sec: 881.37 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 19:49:55,190 epoch 5 - iter 4168/5212 - loss 0.06634610 - time (sec): 336.43 - samples/sec: 874.02 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 19:50:36,970 epoch 5 - iter 4689/5212 - loss 0.06558366 - time (sec): 378.21 - samples/sec: 878.27 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 19:51:19,264 epoch 5 - iter 5210/5212 - loss 0.06576385 - time (sec): 420.50 - samples/sec: 873.25 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 19:51:19,446 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:51:19,447 EPOCH 5 done: loss 0.0657 - lr: 0.000017
+2023-10-17 19:51:30,543 DEV : loss 0.37306851148605347 - f1-score (micro avg) 0.3721
+2023-10-17 19:51:30,599 saving best model
+2023-10-17 19:51:32,021 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:52:13,905 epoch 6 - iter 521/5212 - loss 0.04071800 - time (sec): 41.88 - samples/sec: 899.21 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 19:52:56,216 epoch 6 - iter 1042/5212 - loss 0.04067970 - time (sec): 84.19 - samples/sec: 851.43 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 19:53:38,996 epoch 6 - iter 1563/5212 - loss 0.04392675 - time (sec): 126.97 - samples/sec: 843.09 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 19:54:20,689 epoch 6 - iter 2084/5212 - loss 0.04624917 - time (sec): 168.66 - samples/sec: 841.75 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 19:55:02,737 epoch 6 - iter 2605/5212 - loss 0.04664802 - time (sec): 210.71 - samples/sec: 842.84 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 19:55:43,236 epoch 6 - iter 3126/5212 - loss 0.04611913 - time (sec): 251.21 - samples/sec: 856.24 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 19:56:25,765 epoch 6 - iter 3647/5212 - loss 0.04513070 - time (sec): 293.74 - samples/sec: 857.27 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 19:57:08,519 epoch 6 - iter 4168/5212 - loss 0.04430926 - time (sec): 336.49 - samples/sec: 867.81 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 19:57:51,149 epoch 6 - iter 4689/5212 - loss 0.04422683 - time (sec): 379.12 - samples/sec: 876.18 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 19:58:32,733 epoch 6 - iter 5210/5212 - loss 0.04435344 - time (sec): 420.71 - samples/sec: 873.18 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 19:58:32,886 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:58:32,887 EPOCH 6 done: loss 0.0444 - lr: 0.000013
+2023-10-17 19:58:44,404 DEV : loss 0.38573741912841797 - f1-score (micro avg) 0.3998
+2023-10-17 19:58:44,465 saving best model
+2023-10-17 19:58:45,880 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:59:27,983 epoch 7 - iter 521/5212 - loss 0.03311573 - time (sec): 42.10 - samples/sec: 920.02 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:00:08,876 epoch 7 - iter 1042/5212 - loss 0.02677280 - time (sec): 82.99 - samples/sec: 903.20 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:00:50,108 epoch 7 - iter 1563/5212 - loss 0.03149182 - time (sec): 124.22 - samples/sec: 906.02 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:01:30,385 epoch 7 - iter 2084/5212 - loss 0.03097008 - time (sec): 164.50 - samples/sec: 895.57 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:02:12,354 epoch 7 - iter 2605/5212 - loss 0.03138727 - time (sec): 206.47 - samples/sec: 889.41 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:02:50,517 epoch 7 - iter 3126/5212 - loss 0.03238816 - time (sec): 244.63 - samples/sec: 907.86 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:03:28,571 epoch 7 - iter 3647/5212 - loss 0.03240198 - time (sec): 282.69 - samples/sec: 910.52 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:04:09,623 epoch 7 - iter 4168/5212 - loss 0.03231063 - time (sec): 323.74 - samples/sec: 903.97 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:04:50,358 epoch 7 - iter 4689/5212 - loss 0.03217252 - time (sec): 364.47 - samples/sec: 902.06 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:05:30,790 epoch 7 - iter 5210/5212 - loss 0.03119367 - time (sec): 404.90 - samples/sec: 907.24 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:05:30,938 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:30,939 EPOCH 7 done: loss 0.0312 - lr: 0.000010
+2023-10-17 20:05:42,794 DEV : loss 0.4771367907524109 - f1-score (micro avg) 0.3705
+2023-10-17 20:05:42,853 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:06:23,460 epoch 8 - iter 521/5212 - loss 0.02998807 - time (sec): 40.61 - samples/sec: 888.36 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:07:05,326 epoch 8 - iter 1042/5212 - loss 0.02453114 - time (sec): 82.47 - samples/sec: 879.19 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:07:46,564 epoch 8 - iter 1563/5212 - loss 0.02200622 - time (sec): 123.71 - samples/sec: 889.00 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:08:28,073 epoch 8 - iter 2084/5212 - loss 0.02307900 - time (sec): 165.22 - samples/sec: 888.53 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:09:09,714 epoch 8 - iter 2605/5212 - loss 0.02297405 - time (sec): 206.86 - samples/sec: 888.20 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:09:51,048 epoch 8 - iter 3126/5212 - loss 0.02283079 - time (sec): 248.19 - samples/sec: 883.30 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:10:33,437 epoch 8 - iter 3647/5212 - loss 0.02212019 - time (sec): 290.58 - samples/sec: 882.42 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:11:13,823 epoch 8 - iter 4168/5212 - loss 0.02167924 - time (sec): 330.97 - samples/sec: 883.31 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:11:54,496 epoch 8 - iter 4689/5212 - loss 0.02165052 - time (sec): 371.64 - samples/sec: 884.04 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:12:36,344 epoch 8 - iter 5210/5212 - loss 0.02175733 - time (sec): 413.49 - samples/sec: 888.16 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:12:36,512 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:12:36,512 EPOCH 8 done: loss 0.0217 - lr: 0.000007
+2023-10-17 20:12:48,699 DEV : loss 0.4998532235622406 - f1-score (micro avg) 0.3543
+2023-10-17 20:12:48,755 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:13:31,960 epoch 9 - iter 521/5212 - loss 0.00863824 - time (sec): 43.20 - samples/sec: 948.05 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:14:14,947 epoch 9 - iter 1042/5212 - loss 0.01171617 - time (sec): 86.19 - samples/sec: 908.03 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:14:56,079 epoch 9 - iter 1563/5212 - loss 0.01668874 - time (sec): 127.32 - samples/sec: 891.92 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:15:37,604 epoch 9 - iter 2084/5212 - loss 0.01665217 - time (sec): 168.85 - samples/sec: 894.47 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:16:21,106 epoch 9 - iter 2605/5212 - loss 0.01635385 - time (sec): 212.35 - samples/sec: 882.36 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:17:04,978 epoch 9 - iter 3126/5212 - loss 0.01550677 - time (sec): 256.22 - samples/sec: 872.71 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:17:47,321 epoch 9 - iter 3647/5212 - loss 0.01517082 - time (sec): 298.56 - samples/sec: 868.72 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:18:29,546 epoch 9 - iter 4168/5212 - loss 0.01578016 - time (sec): 340.79 - samples/sec: 872.86 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:19:12,396 epoch 9 - iter 4689/5212 - loss 0.01560147 - time (sec): 383.64 - samples/sec: 872.79 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:19:53,494 epoch 9 - iter 5210/5212 - loss 0.01545797 - time (sec): 424.74 - samples/sec: 864.75 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:19:53,648 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:19:53,648 EPOCH 9 done: loss 0.0155 - lr: 0.000003
+2023-10-17 20:20:05,733 DEV : loss 0.5256008505821228 - f1-score (micro avg) 0.3744
+2023-10-17 20:20:05,821 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:20:49,600 epoch 10 - iter 521/5212 - loss 0.00845977 - time (sec): 43.78 - samples/sec: 868.52 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:21:32,317 epoch 10 - iter 1042/5212 - loss 0.00922324 - time (sec): 86.49 - samples/sec: 897.78 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:22:12,750 epoch 10 - iter 1563/5212 - loss 0.00972860 - time (sec): 126.93 - samples/sec: 874.26 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:22:54,760 epoch 10 - iter 2084/5212 - loss 0.00952950 - time (sec): 168.94 - samples/sec: 881.10 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:23:39,004 epoch 10 - iter 2605/5212 - loss 0.00943308 - time (sec): 213.18 - samples/sec: 860.79 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:24:21,919 epoch 10 - iter 3126/5212 - loss 0.00987377 - time (sec): 256.09 - samples/sec: 859.62 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:25:08,498 epoch 10 - iter 3647/5212 - loss 0.00962367 - time (sec): 302.67 - samples/sec: 857.97 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:25:49,520 epoch 10 - iter 4168/5212 - loss 0.00965467 - time (sec): 343.70 - samples/sec: 862.47 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:26:29,257 epoch 10 - iter 4689/5212 - loss 0.00958044 - time (sec): 383.43 - samples/sec: 861.07 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 20:27:09,922 epoch 10 - iter 5210/5212 - loss 0.00919733 - time (sec): 424.10 - samples/sec: 866.15 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 20:27:10,080 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:27:10,080 EPOCH 10 done: loss 0.0092 - lr: 0.000000
+2023-10-17 20:27:22,342 DEV : loss 0.5250148177146912 - f1-score (micro avg) 0.3723
+2023-10-17 20:27:22,998 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:27:23,000 Loading model from best epoch ...
+2023-10-17 20:27:25,469 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+2023-10-17 20:27:44,580
+Results:
+- F-score (micro) 0.4828
+- F-score (macro) 0.3302
+- Accuracy 0.3225
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.4639    0.6293    0.5341      1214
+         PER     0.4487    0.4926    0.4696       808
+         ORG     0.3199    0.3144    0.3171       353
+   HumanProd     0.0000    0.0000    0.0000        15
+
+   micro avg     0.4416    0.5326    0.4828      2390
+   macro avg     0.3081    0.3591    0.3302      2390
+weighted avg     0.4346    0.5326    0.4769      2390
+
+2023-10-17 20:27:44,580 ----------------------------------------------------------------------------------------------------
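The summary scores in the final table can be cross-checked from the per-class rows: micro F1 is the harmonic mean of micro precision and recall, while macro F1 is the unweighted mean of the class F1s, which is why the unsupported HumanProd class (15 instances, F1 0.0) drags the macro average well below the micro average. A small sketch using the values copied from the table above:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0 if both are 0)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Per-class F1 scores copied from the results table above
class_f1 = {"LOC": 0.5341, "PER": 0.4696, "ORG": 0.3171, "HumanProd": 0.0}

# Macro F1: unweighted mean over classes
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Micro F1: harmonic mean of the micro-avg precision and recall
micro_f1 = f1(0.4416, 0.5326)

print(round(macro_f1, 4))  # 0.3302, matching the macro avg row
# micro_f1 is close to the reported 0.4828 (the log computes it from
# unrounded counts, so the last digit can differ slightly)
```

Note the gap between dev F1 at the best epoch (0.3998) and test micro F1 (0.4828); the two splits are scored independently, so this is expected.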