Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697556787.bce904bcef33.2251.1 +3 -0
- test.tsv +0 -0
- training.log +237 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21637249da9545709800d0baf3e5973558e44ddfb1eabfbe5aa7d4e7fd92c126
+size 440941957
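The three added lines are a Git LFS pointer, not the model weights themselves: they record the spec version, the SHA-256 object id, and the size in bytes of the real file. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` is hypothetical and not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The size field is a byte count, so store it as an integer.
    fields["size"] = int(fields["size"])
    return fields


# Pointer content copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:21637249da9545709800d0baf3e5973558e44ddfb1eabfbe5aa7d4e7fd92c126
size 440941957
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["oid"])
print(f"{pointer['size'] / 1e6:.1f} MB")
```

The same helper applies unchanged to the TensorBoard events pointer further down in this commit.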
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      15:34:19   0.0000         0.3344      0.1086    0.7626         0.5775      0.6573  0.4982
+2      15:35:34   0.0000         0.0984      0.1335    0.8687         0.5537      0.6763  0.5174
+3      15:36:48   0.0000         0.0681      0.0901    0.8939         0.8264      0.8588  0.7619
+4      15:38:02   0.0000         0.0505      0.0932    0.8637         0.8512      0.8574  0.7616
+5      15:39:15   0.0000         0.0395      0.1163    0.8821         0.8192      0.8495  0.7488
+6      15:40:29   0.0000         0.0278      0.1344    0.8807         0.8161      0.8472  0.7474
+7      15:41:43   0.0000         0.0193      0.1361    0.8884         0.8140      0.8496  0.7476
+8      15:42:56   0.0000         0.0140      0.1618    0.9119         0.7913      0.8473  0.7422
+9      15:44:10   0.0000         0.0087      0.1401    0.9037         0.8244      0.8622  0.7666
+10     15:45:23   0.0000         0.0050      0.1626    0.9062         0.8089      0.8548  0.7558
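The per-epoch metrics in loss.tsv can be scanned for the epoch with the highest dev F1, which is the criterion the training log gives for saving best-model.pt. The snippet below is a minimal sketch that embeds the rows shown above as whitespace-separated text; the function name `read_loss_table` is hypothetical, and a real loss.tsv is tab-separated, which `str.split()` also handles.

```python
# Rows copied from the loss.tsv diff above.
LOSS_TABLE = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 15:34:19 0.0000 0.3344 0.1086 0.7626 0.5775 0.6573 0.4982
2 15:35:34 0.0000 0.0984 0.1335 0.8687 0.5537 0.6763 0.5174
3 15:36:48 0.0000 0.0681 0.0901 0.8939 0.8264 0.8588 0.7619
4 15:38:02 0.0000 0.0505 0.0932 0.8637 0.8512 0.8574 0.7616
5 15:39:15 0.0000 0.0395 0.1163 0.8821 0.8192 0.8495 0.7488
6 15:40:29 0.0000 0.0278 0.1344 0.8807 0.8161 0.8472 0.7474
7 15:41:43 0.0000 0.0193 0.1361 0.8884 0.8140 0.8496 0.7476
8 15:42:56 0.0000 0.0140 0.1618 0.9119 0.7913 0.8473 0.7422
9 15:44:10 0.0000 0.0087 0.1401 0.9037 0.8244 0.8622 0.7666
10 15:45:23 0.0000 0.0050 0.1626 0.9062 0.8089 0.8548 0.7558
"""


def read_loss_table(text: str) -> list:
    """Turn the header line plus data lines into a list of row dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = read_loss_table(LOSS_TABLE)

# Epoch with the highest dev F1 (the model-selection metric in this run).
best_f1_row = max(rows, key=lambda r: float(r["DEV_F1"]))
# Epoch with the lowest dev loss, for comparison.
lowest_loss_row = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:", best_f1_row["EPOCH"], best_f1_row["DEV_F1"])
print("lowest DEV_LOSS:", lowest_loss_row["EPOCH"], lowest_loss_row["DEV_LOSS"])
```

The two criteria disagree here: dev loss is lowest at epoch 3, while dev F1 peaks at epoch 9. Training loss keeps falling while dev loss rises after epoch 3, so F1 is the more useful selection metric for this run.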
runs/events.out.tfevents.1697556787.bce904bcef33.2251.1
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e3e41e17429f14ea3120abe24a467854c0052be5739ac5249a45e758214bc538
+size 808480
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,237 @@
+2023-10-17 15:33:07,867 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,868 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 15:33:07,868 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,868 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+2023-10-17 15:33:07,868 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,868 Train: 5777 sentences
+2023-10-17 15:33:07,868 (train_with_dev=False, train_with_test=False)
+2023-10-17 15:33:07,868 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,868 Training Params:
+2023-10-17 15:33:07,869 - learning_rate: "5e-05"
+2023-10-17 15:33:07,869 - mini_batch_size: "4"
+2023-10-17 15:33:07,869 - max_epochs: "10"
+2023-10-17 15:33:07,869 - shuffle: "True"
+2023-10-17 15:33:07,869 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,869 Plugins:
+2023-10-17 15:33:07,869 - TensorboardLogger
+2023-10-17 15:33:07,869 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 15:33:07,869 ----------------------------------------------------------------------------------------------------
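The parameters above (5777 training sentences, mini-batch size 4, 10 epochs, peak learning rate 5e-05, warmup fraction 0.1) imply 1445 iterations per epoch and a learning rate that rises linearly during the first epoch and then decays linearly to zero. The following is a minimal sketch of such a schedule, not Flair's actual LinearScheduler code; the function name `linear_schedule_lr` and the step-counting convention (global step = completed iterations) are assumptions.

```python
import math


def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


# Values taken from the "Training Params" and "Plugins" sections of the log.
steps_per_epoch = math.ceil(5777 / 4)   # 1445 mini-batches per epoch
total_steps = steps_per_epoch * 10      # 14450 steps over 10 epochs

for epoch, it in [(1, 144), (1, 1440), (2, 1440), (10, 1440)]:
    step = (epoch - 1) * steps_per_epoch + it
    lr = linear_schedule_lr(step, total_steps, 5e-05, 0.1)
    print(f"epoch {epoch} iter {it}: lr {lr:.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which agrees with the lr values reported in the iteration lines of this log (0.000005 at epoch 1 iter 144, 0.000050 at epoch 1 iter 1440, 0.000044 at epoch 2 iter 1440). It also explains why the LEARNING_RATE column in loss.tsv reads 0.0000: values of this size vanish at four decimal places.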
+2023-10-17 15:33:07,869 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 15:33:07,869 - metric: "('micro avg', 'f1-score')"
+2023-10-17 15:33:07,869 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,869 Computation:
+2023-10-17 15:33:07,869 - compute on device: cuda:0
+2023-10-17 15:33:07,869 - embedding storage: none
+2023-10-17 15:33:07,869 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,869 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
+2023-10-17 15:33:07,869 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,869 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:33:07,869 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-17 15:33:14,818 epoch 1 - iter 144/1445 - loss 1.92593897 - time (sec): 6.95 - samples/sec: 2670.87 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 15:33:21,656 epoch 1 - iter 288/1445 - loss 1.14848659 - time (sec): 13.79 - samples/sec: 2498.28 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 15:33:28,548 epoch 1 - iter 432/1445 - loss 0.81417525 - time (sec): 20.68 - samples/sec: 2519.75 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 15:33:35,383 epoch 1 - iter 576/1445 - loss 0.65607071 - time (sec): 27.51 - samples/sec: 2481.77 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 15:33:42,441 epoch 1 - iter 720/1445 - loss 0.55106262 - time (sec): 34.57 - samples/sec: 2502.18 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 15:33:49,471 epoch 1 - iter 864/1445 - loss 0.47665927 - time (sec): 41.60 - samples/sec: 2536.30 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 15:33:56,060 epoch 1 - iter 1008/1445 - loss 0.42564629 - time (sec): 48.19 - samples/sec: 2555.27 - lr: 0.000035 - momentum: 0.000000
+2023-10-17 15:34:02,950 epoch 1 - iter 1152/1445 - loss 0.38691122 - time (sec): 55.08 - samples/sec: 2562.18 - lr: 0.000040 - momentum: 0.000000
+2023-10-17 15:34:09,702 epoch 1 - iter 1296/1445 - loss 0.35796946 - time (sec): 61.83 - samples/sec: 2563.72 - lr: 0.000045 - momentum: 0.000000
+2023-10-17 15:34:16,350 epoch 1 - iter 1440/1445 - loss 0.33514967 - time (sec): 68.48 - samples/sec: 2565.26 - lr: 0.000050 - momentum: 0.000000
+2023-10-17 15:34:16,569 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:34:16,569 EPOCH 1 done: loss 0.3344 - lr: 0.000050
+2023-10-17 15:34:19,829 DEV : loss 0.10861274600028992 - f1-score (micro avg) 0.6573
+2023-10-17 15:34:19,861 saving best model
+2023-10-17 15:34:20,273 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:34:27,269 epoch 2 - iter 144/1445 - loss 0.11625048 - time (sec): 6.99 - samples/sec: 2379.25 - lr: 0.000049 - momentum: 0.000000
+2023-10-17 15:34:34,119 epoch 2 - iter 288/1445 - loss 0.11444173 - time (sec): 13.84 - samples/sec: 2451.31 - lr: 0.000049 - momentum: 0.000000
+2023-10-17 15:34:41,181 epoch 2 - iter 432/1445 - loss 0.10811992 - time (sec): 20.91 - samples/sec: 2455.20 - lr: 0.000048 - momentum: 0.000000
+2023-10-17 15:34:48,042 epoch 2 - iter 576/1445 - loss 0.10325462 - time (sec): 27.77 - samples/sec: 2465.58 - lr: 0.000048 - momentum: 0.000000
+2023-10-17 15:34:55,055 epoch 2 - iter 720/1445 - loss 0.09805219 - time (sec): 34.78 - samples/sec: 2495.12 - lr: 0.000047 - momentum: 0.000000
+2023-10-17 15:35:02,213 epoch 2 - iter 864/1445 - loss 0.09411125 - time (sec): 41.94 - samples/sec: 2524.85 - lr: 0.000047 - momentum: 0.000000
+2023-10-17 15:35:08,997 epoch 2 - iter 1008/1445 - loss 0.09339206 - time (sec): 48.72 - samples/sec: 2518.97 - lr: 0.000046 - momentum: 0.000000
+2023-10-17 15:35:16,245 epoch 2 - iter 1152/1445 - loss 0.09392903 - time (sec): 55.97 - samples/sec: 2509.82 - lr: 0.000046 - momentum: 0.000000
+2023-10-17 15:35:23,150 epoch 2 - iter 1296/1445 - loss 0.09494132 - time (sec): 62.87 - samples/sec: 2504.18 - lr: 0.000045 - momentum: 0.000000
+2023-10-17 15:35:30,230 epoch 2 - iter 1440/1445 - loss 0.09822877 - time (sec): 69.95 - samples/sec: 2512.31 - lr: 0.000044 - momentum: 0.000000
+2023-10-17 15:35:30,450 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:35:30,450 EPOCH 2 done: loss 0.0984 - lr: 0.000044
+2023-10-17 15:35:34,251 DEV : loss 0.13345900177955627 - f1-score (micro avg) 0.6763
+2023-10-17 15:35:34,270 saving best model
+2023-10-17 15:35:34,739 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:35:41,733 epoch 3 - iter 144/1445 - loss 0.06944847 - time (sec): 6.99 - samples/sec: 2484.40 - lr: 0.000044 - momentum: 0.000000
+2023-10-17 15:35:48,620 epoch 3 - iter 288/1445 - loss 0.06628234 - time (sec): 13.88 - samples/sec: 2494.99 - lr: 0.000043 - momentum: 0.000000
+2023-10-17 15:35:55,938 epoch 3 - iter 432/1445 - loss 0.06588793 - time (sec): 21.20 - samples/sec: 2528.44 - lr: 0.000043 - momentum: 0.000000
+2023-10-17 15:36:02,862 epoch 3 - iter 576/1445 - loss 0.06519428 - time (sec): 28.12 - samples/sec: 2515.02 - lr: 0.000042 - momentum: 0.000000
+2023-10-17 15:36:09,625 epoch 3 - iter 720/1445 - loss 0.06602713 - time (sec): 34.88 - samples/sec: 2507.94 - lr: 0.000042 - momentum: 0.000000
+2023-10-17 15:36:16,739 epoch 3 - iter 864/1445 - loss 0.06637663 - time (sec): 42.00 - samples/sec: 2507.44 - lr: 0.000041 - momentum: 0.000000
+2023-10-17 15:36:23,756 epoch 3 - iter 1008/1445 - loss 0.06655629 - time (sec): 49.02 - samples/sec: 2489.67 - lr: 0.000041 - momentum: 0.000000
+2023-10-17 15:36:30,816 epoch 3 - iter 1152/1445 - loss 0.06748428 - time (sec): 56.08 - samples/sec: 2487.24 - lr: 0.000040 - momentum: 0.000000
+2023-10-17 15:36:38,008 epoch 3 - iter 1296/1445 - loss 0.06799725 - time (sec): 63.27 - samples/sec: 2491.50 - lr: 0.000039 - momentum: 0.000000
+2023-10-17 15:36:44,928 epoch 3 - iter 1440/1445 - loss 0.06778051 - time (sec): 70.19 - samples/sec: 2504.64 - lr: 0.000039 - momentum: 0.000000
+2023-10-17 15:36:45,142 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:36:45,143 EPOCH 3 done: loss 0.0681 - lr: 0.000039
+2023-10-17 15:36:48,337 DEV : loss 0.09013378620147705 - f1-score (micro avg) 0.8588
+2023-10-17 15:36:48,354 saving best model
+2023-10-17 15:36:48,816 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:36:55,898 epoch 4 - iter 144/1445 - loss 0.04676885 - time (sec): 7.08 - samples/sec: 2579.33 - lr: 0.000038 - momentum: 0.000000
+2023-10-17 15:37:02,866 epoch 4 - iter 288/1445 - loss 0.05260260 - time (sec): 14.04 - samples/sec: 2532.21 - lr: 0.000038 - momentum: 0.000000
+2023-10-17 15:37:09,745 epoch 4 - iter 432/1445 - loss 0.04637383 - time (sec): 20.92 - samples/sec: 2529.30 - lr: 0.000037 - momentum: 0.000000
+2023-10-17 15:37:16,682 epoch 4 - iter 576/1445 - loss 0.04944995 - time (sec): 27.86 - samples/sec: 2523.60 - lr: 0.000037 - momentum: 0.000000
+2023-10-17 15:37:23,487 epoch 4 - iter 720/1445 - loss 0.05010050 - time (sec): 34.66 - samples/sec: 2507.18 - lr: 0.000036 - momentum: 0.000000
+2023-10-17 15:37:30,605 epoch 4 - iter 864/1445 - loss 0.05081688 - time (sec): 41.78 - samples/sec: 2501.61 - lr: 0.000036 - momentum: 0.000000
+2023-10-17 15:37:37,605 epoch 4 - iter 1008/1445 - loss 0.05019035 - time (sec): 48.78 - samples/sec: 2507.35 - lr: 0.000035 - momentum: 0.000000
+2023-10-17 15:37:44,603 epoch 4 - iter 1152/1445 - loss 0.04964440 - time (sec): 55.78 - samples/sec: 2508.06 - lr: 0.000034 - momentum: 0.000000
+2023-10-17 15:37:51,644 epoch 4 - iter 1296/1445 - loss 0.04892615 - time (sec): 62.82 - samples/sec: 2509.01 - lr: 0.000034 - momentum: 0.000000
+2023-10-17 15:37:58,723 epoch 4 - iter 1440/1445 - loss 0.05060487 - time (sec): 69.90 - samples/sec: 2514.90 - lr: 0.000033 - momentum: 0.000000
+2023-10-17 15:37:58,946 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:37:58,946 EPOCH 4 done: loss 0.0505 - lr: 0.000033
+2023-10-17 15:38:02,111 DEV : loss 0.09323444962501526 - f1-score (micro avg) 0.8574
+2023-10-17 15:38:02,127 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:38:09,095 epoch 5 - iter 144/1445 - loss 0.03072807 - time (sec): 6.97 - samples/sec: 2539.84 - lr: 0.000033 - momentum: 0.000000
+2023-10-17 15:38:16,057 epoch 5 - iter 288/1445 - loss 0.03358255 - time (sec): 13.93 - samples/sec: 2580.24 - lr: 0.000032 - momentum: 0.000000
+2023-10-17 15:38:23,517 epoch 5 - iter 432/1445 - loss 0.03446332 - time (sec): 21.39 - samples/sec: 2512.59 - lr: 0.000032 - momentum: 0.000000
+2023-10-17 15:38:30,301 epoch 5 - iter 576/1445 - loss 0.03497844 - time (sec): 28.17 - samples/sec: 2501.32 - lr: 0.000031 - momentum: 0.000000
+2023-10-17 15:38:37,245 epoch 5 - iter 720/1445 - loss 0.03494447 - time (sec): 35.12 - samples/sec: 2500.49 - lr: 0.000031 - momentum: 0.000000
+2023-10-17 15:38:44,177 epoch 5 - iter 864/1445 - loss 0.03849124 - time (sec): 42.05 - samples/sec: 2494.59 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 15:38:51,092 epoch 5 - iter 1008/1445 - loss 0.03868840 - time (sec): 48.96 - samples/sec: 2492.81 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 15:38:58,128 epoch 5 - iter 1152/1445 - loss 0.03863802 - time (sec): 56.00 - samples/sec: 2510.90 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 15:39:04,931 epoch 5 - iter 1296/1445 - loss 0.03989823 - time (sec): 62.80 - samples/sec: 2513.11 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 15:39:12,028 epoch 5 - iter 1440/1445 - loss 0.03957638 - time (sec): 69.90 - samples/sec: 2512.94 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 15:39:12,247 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:39:12,248 EPOCH 5 done: loss 0.0395 - lr: 0.000028
+2023-10-17 15:39:15,509 DEV : loss 0.11628924310207367 - f1-score (micro avg) 0.8495
+2023-10-17 15:39:15,527 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:39:22,621 epoch 6 - iter 144/1445 - loss 0.02447880 - time (sec): 7.09 - samples/sec: 2467.72 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 15:39:29,423 epoch 6 - iter 288/1445 - loss 0.03137841 - time (sec): 13.89 - samples/sec: 2449.60 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 15:39:36,131 epoch 6 - iter 432/1445 - loss 0.03035696 - time (sec): 20.60 - samples/sec: 2504.30 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 15:39:43,478 epoch 6 - iter 576/1445 - loss 0.02805603 - time (sec): 27.95 - samples/sec: 2497.07 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 15:39:50,654 epoch 6 - iter 720/1445 - loss 0.02854656 - time (sec): 35.13 - samples/sec: 2505.16 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 15:39:57,367 epoch 6 - iter 864/1445 - loss 0.02796582 - time (sec): 41.84 - samples/sec: 2496.03 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 15:40:04,170 epoch 6 - iter 1008/1445 - loss 0.02775939 - time (sec): 48.64 - samples/sec: 2519.64 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 15:40:11,115 epoch 6 - iter 1152/1445 - loss 0.02855916 - time (sec): 55.59 - samples/sec: 2501.88 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 15:40:18,248 epoch 6 - iter 1296/1445 - loss 0.02841167 - time (sec): 62.72 - samples/sec: 2499.15 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 15:40:25,387 epoch 6 - iter 1440/1445 - loss 0.02790215 - time (sec): 69.86 - samples/sec: 2512.16 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 15:40:25,612 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:40:25,612 EPOCH 6 done: loss 0.0278 - lr: 0.000022
+2023-10-17 15:40:29,051 DEV : loss 0.13435053825378418 - f1-score (micro avg) 0.8472
+2023-10-17 15:40:29,073 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:40:36,627 epoch 7 - iter 144/1445 - loss 0.03621832 - time (sec): 7.55 - samples/sec: 2286.98 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 15:40:43,645 epoch 7 - iter 288/1445 - loss 0.02724953 - time (sec): 14.57 - samples/sec: 2335.39 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 15:40:50,674 epoch 7 - iter 432/1445 - loss 0.02754718 - time (sec): 21.60 - samples/sec: 2405.25 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 15:40:57,976 epoch 7 - iter 576/1445 - loss 0.02528310 - time (sec): 28.90 - samples/sec: 2422.96 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 15:41:04,930 epoch 7 - iter 720/1445 - loss 0.02377881 - time (sec): 35.86 - samples/sec: 2435.71 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 15:41:11,853 epoch 7 - iter 864/1445 - loss 0.02330516 - time (sec): 42.78 - samples/sec: 2478.15 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 15:41:18,745 epoch 7 - iter 1008/1445 - loss 0.02094244 - time (sec): 49.67 - samples/sec: 2479.00 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 15:41:25,574 epoch 7 - iter 1152/1445 - loss 0.01946340 - time (sec): 56.50 - samples/sec: 2481.85 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 15:41:33,047 epoch 7 - iter 1296/1445 - loss 0.01934243 - time (sec): 63.97 - samples/sec: 2471.24 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 15:41:39,929 epoch 7 - iter 1440/1445 - loss 0.01931728 - time (sec): 70.85 - samples/sec: 2480.95 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 15:41:40,152 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:41:40,153 EPOCH 7 done: loss 0.0193 - lr: 0.000017
+2023-10-17 15:41:43,359 DEV : loss 0.13610993325710297 - f1-score (micro avg) 0.8496
+2023-10-17 15:41:43,376 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:41:50,179 epoch 8 - iter 144/1445 - loss 0.01156269 - time (sec): 6.80 - samples/sec: 2373.92 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 15:41:57,334 epoch 8 - iter 288/1445 - loss 0.01424879 - time (sec): 13.96 - samples/sec: 2475.60 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 15:42:04,154 epoch 8 - iter 432/1445 - loss 0.01152694 - time (sec): 20.78 - samples/sec: 2517.25 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 15:42:11,076 epoch 8 - iter 576/1445 - loss 0.01252404 - time (sec): 27.70 - samples/sec: 2492.28 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 15:42:18,115 epoch 8 - iter 720/1445 - loss 0.01307924 - time (sec): 34.74 - samples/sec: 2478.52 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 15:42:25,240 epoch 8 - iter 864/1445 - loss 0.01191157 - time (sec): 41.86 - samples/sec: 2498.03 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 15:42:31,873 epoch 8 - iter 1008/1445 - loss 0.01386504 - time (sec): 48.50 - samples/sec: 2523.71 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 15:42:38,759 epoch 8 - iter 1152/1445 - loss 0.01425968 - time (sec): 55.38 - samples/sec: 2514.92 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 15:42:45,754 epoch 8 - iter 1296/1445 - loss 0.01357515 - time (sec): 62.38 - samples/sec: 2530.49 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 15:42:52,599 epoch 8 - iter 1440/1445 - loss 0.01409541 - time (sec): 69.22 - samples/sec: 2534.97 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 15:42:52,844 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:42:52,844 EPOCH 8 done: loss 0.0140 - lr: 0.000011
+2023-10-17 15:42:56,085 DEV : loss 0.16182565689086914 - f1-score (micro avg) 0.8473
+2023-10-17 15:42:56,101 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:43:03,053 epoch 9 - iter 144/1445 - loss 0.01038750 - time (sec): 6.95 - samples/sec: 2763.07 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 15:43:09,940 epoch 9 - iter 288/1445 - loss 0.00853687 - time (sec): 13.84 - samples/sec: 2536.32 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 15:43:17,344 epoch 9 - iter 432/1445 - loss 0.00758837 - time (sec): 21.24 - samples/sec: 2549.74 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 15:43:24,505 epoch 9 - iter 576/1445 - loss 0.00778959 - time (sec): 28.40 - samples/sec: 2550.27 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 15:43:31,649 epoch 9 - iter 720/1445 - loss 0.00866403 - time (sec): 35.55 - samples/sec: 2519.68 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 15:43:38,340 epoch 9 - iter 864/1445 - loss 0.00829698 - time (sec): 42.24 - samples/sec: 2498.92 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 15:43:45,253 epoch 9 - iter 1008/1445 - loss 0.00824515 - time (sec): 49.15 - samples/sec: 2506.38 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 15:43:52,687 epoch 9 - iter 1152/1445 - loss 0.00842807 - time (sec): 56.58 - samples/sec: 2497.57 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 15:43:59,636 epoch 9 - iter 1296/1445 - loss 0.00853763 - time (sec): 63.53 - samples/sec: 2501.91 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 15:44:06,350 epoch 9 - iter 1440/1445 - loss 0.00875377 - time (sec): 70.25 - samples/sec: 2497.52 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 15:44:06,618 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:44:06,619 EPOCH 9 done: loss 0.0087 - lr: 0.000006
+2023-10-17 15:44:10,219 DEV : loss 0.14013005793094635 - f1-score (micro avg) 0.8622
+2023-10-17 15:44:10,236 saving best model
+2023-10-17 15:44:10,722 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:44:17,455 epoch 10 - iter 144/1445 - loss 0.00579021 - time (sec): 6.73 - samples/sec: 2592.22 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 15:44:24,391 epoch 10 - iter 288/1445 - loss 0.00521433 - time (sec): 13.67 - samples/sec: 2564.98 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 15:44:31,282 epoch 10 - iter 432/1445 - loss 0.00599340 - time (sec): 20.56 - samples/sec: 2514.52 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 15:44:38,028 epoch 10 - iter 576/1445 - loss 0.00550622 - time (sec): 27.30 - samples/sec: 2488.42 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 15:44:45,428 epoch 10 - iter 720/1445 - loss 0.00533641 - time (sec): 34.70 - samples/sec: 2497.69 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 15:44:52,515 epoch 10 - iter 864/1445 - loss 0.00570968 - time (sec): 41.79 - samples/sec: 2521.83 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 15:44:59,374 epoch 10 - iter 1008/1445 - loss 0.00509566 - time (sec): 48.65 - samples/sec: 2508.54 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 15:45:06,472 epoch 10 - iter 1152/1445 - loss 0.00510136 - time (sec): 55.75 - samples/sec: 2513.10 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 15:45:13,388 epoch 10 - iter 1296/1445 - loss 0.00492028 - time (sec): 62.66 - samples/sec: 2523.18 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 15:45:20,396 epoch 10 - iter 1440/1445 - loss 0.00498631 - time (sec): 69.67 - samples/sec: 2522.46 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 15:45:20,612 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:45:20,612 EPOCH 10 done: loss 0.0050 - lr: 0.000000
+2023-10-17 15:45:23,866 DEV : loss 0.16260549426078796 - f1-score (micro avg) 0.8548
+2023-10-17 15:45:24,223 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:45:24,225 Loading model from best epoch ...
+2023-10-17 15:45:25,583 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
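The 13 tags in this dictionary follow a BIOES-style layout: one outside tag `O` plus four position prefixes (S = single token, B = begin, E = end, I = inside) for each of the three entity types, which matches the `out_features=13` of the model's final linear layer. A minimal sketch that rebuilds the list is shown below; the function name `bioes_tags` is hypothetical, and the prefix order is copied from the log line rather than taken from Flair's source.

```python
def bioes_tags(entity_types):
    """Build the outside tag plus S/B/E/I tags for each entity type."""
    tags = ["O"]
    for entity in entity_types:
        # Prefix order mirrors the order printed in the training log.
        tags.extend(f"{prefix}-{entity}" for prefix in ("S", "B", "E", "I"))
    return tags


tags = bioes_tags(["LOC", "PER", "ORG"])
print(len(tags), tags)
```

With three entity types the count is 1 + 3 * 4 = 13.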
+2023-10-17 15:45:28,550
+Results:
+- F-score (micro) 0.8439
+- F-score (macro) 0.7461
+- Accuracy 0.7378
+
+By class:
+              precision    recall  f1-score   support
+
+         PER     0.8556    0.8361    0.8458       482
+         LOC     0.9392    0.8428    0.8884       458
+         ORG     0.6000    0.4348    0.5042        69
+
+   micro avg     0.8788    0.8117    0.8439      1009
+   macro avg     0.7983    0.7046    0.7461      1009
+weighted avg     0.8761    0.8117    0.8417      1009
+
+2023-10-17 15:45:28,551 ----------------------------------------------------------------------------------------------------
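The averaged rows of the test report can be cross-checked against the per-class rows. The sketch below recomputes the macro and support-weighted F1 from the per-class F1 scores, and the micro F1 as the harmonic mean of the reported micro precision and recall. All numbers are copied from the report above; because they are rounded to four decimals, the recomputed values can differ from the reported ones in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the report above.
PER_CLASS = {
    "PER": (0.8556, 0.8361, 0.8458, 482),
    "LOC": (0.9392, 0.8428, 0.8884, 458),
    "ORG": (0.6000, 0.4348, 0.5042, 69),
}

total_support = sum(support for _, _, _, support in PER_CLASS.values())

# Macro average: unweighted mean over classes, so ORG counts as much as PER.
macro_f1 = sum(f1 for _, _, f1, _ in PER_CLASS.values()) / len(PER_CLASS)

# Weighted average: each class weighted by its number of gold entities.
weighted_f1 = sum(f1 * support for _, _, f1, support in PER_CLASS.values()) / total_support

# Micro F1: harmonic mean of the pooled (micro) precision and recall.
micro_precision, micro_recall = 0.8788, 0.8117
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support  {total_support}")
print(f"macro    {macro_f1:.4f}")
print(f"weighted {weighted_f1:.4f}")
print(f"micro    {micro_f1:.4f}")
```

The gap between micro F1 (0.8439) and macro F1 (0.7461) comes from the ORG class, which has only 69 of the 1009 test entities and the lowest F1 (0.5042).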