Upload ./training.log with huggingface_hub
training.log
ADDED
@@ -0,0 +1,247 @@
+2023-10-23 15:49:02,712 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,713 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(64001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=25, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-23 15:49:02,713 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,713 MultiCorpus: 1100 train + 206 dev + 240 test sentences
+ - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
+2023-10-23 15:49:02,713 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 Train: 1100 sentences
+2023-10-23 15:49:02,714 (train_with_dev=False, train_with_test=False)
+2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 Training Params:
+2023-10-23 15:49:02,714 - learning_rate: "3e-05"
+2023-10-23 15:49:02,714 - mini_batch_size: "4"
+2023-10-23 15:49:02,714 - max_epochs: "10"
+2023-10-23 15:49:02,714 - shuffle: "True"
+2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 Plugins:
+2023-10-23 15:49:02,714 - TensorboardLogger
+2023-10-23 15:49:02,714 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 Final evaluation on model from best epoch (best-model.pt)
+2023-10-23 15:49:02,714 - metric: "('micro avg', 'f1-score')"
+2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 Computation:
+2023-10-23 15:49:02,714 - compute on device: cuda:0
+2023-10-23 15:49:02,714 - embedding storage: none
+2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
+2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:02,714 Logging anything other than scalars to TensorBoard is currently not supported.
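The `lr` column in the per-iteration lines of this log is consistent with a linear warmup followed by a linear decay, as the `LinearScheduler | warmup_fraction: '0.1'` entry suggests. The following is a minimal sketch of such a schedule, not Flair's implementation. It assumes the schedule advances once per mini-batch over 275 iterations x 10 epochs (2750 steps) and that the log prints the value with six decimals; the function name `linear_schedule_lr` and the constant `ITERS_PER_EPOCH` are hypothetical.

```python
def linear_schedule_lr(step, total_steps=2750, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative sketch)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


# Assumption: 1100 training sentences / mini_batch_size 4 = 275 iterations per epoch.
ITERS_PER_EPOCH = 275

for epoch, iteration in [(1, 27), (1, 270), (2, 27), (5, 54), (10, 243)]:
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    print(f"epoch {epoch} - iter {iteration}/275 - lr: {linear_schedule_lr(step):.6f}")
```

Under these assumptions the printed values (0.000003, 0.000029, 0.000030, 0.000019, 0.000000) agree with the `lr` values logged at the same iterations.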
+2023-10-23 15:49:04,134 epoch 1 - iter 27/275 - loss 2.83941383 - time (sec): 1.42 - samples/sec: 1260.09 - lr: 0.000003 - momentum: 0.000000
+2023-10-23 15:49:05,543 epoch 1 - iter 54/275 - loss 2.14658630 - time (sec): 2.83 - samples/sec: 1439.75 - lr: 0.000006 - momentum: 0.000000
+2023-10-23 15:49:06,964 epoch 1 - iter 81/275 - loss 1.72389593 - time (sec): 4.25 - samples/sec: 1523.60 - lr: 0.000009 - momentum: 0.000000
+2023-10-23 15:49:08,373 epoch 1 - iter 108/275 - loss 1.46781181 - time (sec): 5.66 - samples/sec: 1569.08 - lr: 0.000012 - momentum: 0.000000
+2023-10-23 15:49:09,768 epoch 1 - iter 135/275 - loss 1.28242063 - time (sec): 7.05 - samples/sec: 1581.18 - lr: 0.000015 - momentum: 0.000000
+2023-10-23 15:49:11,165 epoch 1 - iter 162/275 - loss 1.12735272 - time (sec): 8.45 - samples/sec: 1591.46 - lr: 0.000018 - momentum: 0.000000
+2023-10-23 15:49:12,563 epoch 1 - iter 189/275 - loss 1.02268702 - time (sec): 9.85 - samples/sec: 1582.62 - lr: 0.000021 - momentum: 0.000000
+2023-10-23 15:49:13,960 epoch 1 - iter 216/275 - loss 0.91909206 - time (sec): 11.24 - samples/sec: 1607.03 - lr: 0.000023 - momentum: 0.000000
+2023-10-23 15:49:15,355 epoch 1 - iter 243/275 - loss 0.84777616 - time (sec): 12.64 - samples/sec: 1606.88 - lr: 0.000026 - momentum: 0.000000
+2023-10-23 15:49:16,737 epoch 1 - iter 270/275 - loss 0.78992865 - time (sec): 14.02 - samples/sec: 1597.84 - lr: 0.000029 - momentum: 0.000000
+2023-10-23 15:49:16,993 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:16,994 EPOCH 1 done: loss 0.7815 - lr: 0.000029
+2023-10-23 15:49:17,408 DEV : loss 0.17806066572666168 - f1-score (micro avg) 0.7256
+2023-10-23 15:49:17,414 saving best model
+2023-10-23 15:49:17,807 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:19,194 epoch 2 - iter 27/275 - loss 0.22098647 - time (sec): 1.39 - samples/sec: 1834.03 - lr: 0.000030 - momentum: 0.000000
+2023-10-23 15:49:20,589 epoch 2 - iter 54/275 - loss 0.22583475 - time (sec): 2.78 - samples/sec: 1648.14 - lr: 0.000029 - momentum: 0.000000
+2023-10-23 15:49:21,977 epoch 2 - iter 81/275 - loss 0.19163146 - time (sec): 4.17 - samples/sec: 1597.05 - lr: 0.000029 - momentum: 0.000000
+2023-10-23 15:49:23,396 epoch 2 - iter 108/275 - loss 0.19026888 - time (sec): 5.59 - samples/sec: 1540.17 - lr: 0.000029 - momentum: 0.000000
+2023-10-23 15:49:24,807 epoch 2 - iter 135/275 - loss 0.17483222 - time (sec): 7.00 - samples/sec: 1541.10 - lr: 0.000028 - momentum: 0.000000
+2023-10-23 15:49:26,223 epoch 2 - iter 162/275 - loss 0.17437410 - time (sec): 8.41 - samples/sec: 1531.54 - lr: 0.000028 - momentum: 0.000000
+2023-10-23 15:49:27,666 epoch 2 - iter 189/275 - loss 0.16454035 - time (sec): 9.86 - samples/sec: 1533.95 - lr: 0.000028 - momentum: 0.000000
+2023-10-23 15:49:29,094 epoch 2 - iter 216/275 - loss 0.17001907 - time (sec): 11.29 - samples/sec: 1547.08 - lr: 0.000027 - momentum: 0.000000
+2023-10-23 15:49:30,501 epoch 2 - iter 243/275 - loss 0.16952883 - time (sec): 12.69 - samples/sec: 1556.75 - lr: 0.000027 - momentum: 0.000000
+2023-10-23 15:49:31,915 epoch 2 - iter 270/275 - loss 0.16760966 - time (sec): 14.11 - samples/sec: 1582.65 - lr: 0.000027 - momentum: 0.000000
+2023-10-23 15:49:32,173 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:32,174 EPOCH 2 done: loss 0.1669 - lr: 0.000027
+2023-10-23 15:49:32,710 DEV : loss 0.1282263696193695 - f1-score (micro avg) 0.823
+2023-10-23 15:49:32,715 saving best model
+2023-10-23 15:49:33,267 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:34,603 epoch 3 - iter 27/275 - loss 0.11417363 - time (sec): 1.33 - samples/sec: 1752.69 - lr: 0.000026 - momentum: 0.000000
+2023-10-23 15:49:35,971 epoch 3 - iter 54/275 - loss 0.10292930 - time (sec): 2.70 - samples/sec: 1682.84 - lr: 0.000026 - momentum: 0.000000
+2023-10-23 15:49:37,343 epoch 3 - iter 81/275 - loss 0.11259165 - time (sec): 4.07 - samples/sec: 1658.61 - lr: 0.000026 - momentum: 0.000000
+2023-10-23 15:49:38,693 epoch 3 - iter 108/275 - loss 0.11306199 - time (sec): 5.42 - samples/sec: 1673.09 - lr: 0.000025 - momentum: 0.000000
+2023-10-23 15:49:39,998 epoch 3 - iter 135/275 - loss 0.10315859 - time (sec): 6.73 - samples/sec: 1691.92 - lr: 0.000025 - momentum: 0.000000
+2023-10-23 15:49:41,256 epoch 3 - iter 162/275 - loss 0.09810185 - time (sec): 7.98 - samples/sec: 1703.55 - lr: 0.000025 - momentum: 0.000000
+2023-10-23 15:49:42,535 epoch 3 - iter 189/275 - loss 0.09949418 - time (sec): 9.26 - samples/sec: 1708.34 - lr: 0.000024 - momentum: 0.000000
+2023-10-23 15:49:43,884 epoch 3 - iter 216/275 - loss 0.09940572 - time (sec): 10.61 - samples/sec: 1692.65 - lr: 0.000024 - momentum: 0.000000
+2023-10-23 15:49:45,221 epoch 3 - iter 243/275 - loss 0.09918018 - time (sec): 11.95 - samples/sec: 1700.14 - lr: 0.000024 - momentum: 0.000000
+2023-10-23 15:49:46,548 epoch 3 - iter 270/275 - loss 0.09785407 - time (sec): 13.28 - samples/sec: 1677.37 - lr: 0.000023 - momentum: 0.000000
+2023-10-23 15:49:46,810 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:46,810 EPOCH 3 done: loss 0.0986 - lr: 0.000023
+2023-10-23 15:49:47,349 DEV : loss 0.1556384116411209 - f1-score (micro avg) 0.8397
+2023-10-23 15:49:47,354 saving best model
+2023-10-23 15:49:47,902 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:49:49,268 epoch 4 - iter 27/275 - loss 0.10635465 - time (sec): 1.36 - samples/sec: 1699.34 - lr: 0.000023 - momentum: 0.000000
+2023-10-23 15:49:50,661 epoch 4 - iter 54/275 - loss 0.07787905 - time (sec): 2.76 - samples/sec: 1710.12 - lr: 0.000023 - momentum: 0.000000
+2023-10-23 15:49:52,050 epoch 4 - iter 81/275 - loss 0.07086144 - time (sec): 4.15 - samples/sec: 1619.19 - lr: 0.000022 - momentum: 0.000000
+2023-10-23 15:49:53,470 epoch 4 - iter 108/275 - loss 0.06660226 - time (sec): 5.57 - samples/sec: 1579.83 - lr: 0.000022 - momentum: 0.000000
+2023-10-23 15:49:55,026 epoch 4 - iter 135/275 - loss 0.06133614 - time (sec): 7.12 - samples/sec: 1579.05 - lr: 0.000022 - momentum: 0.000000
+2023-10-23 15:49:56,436 epoch 4 - iter 162/275 - loss 0.07371892 - time (sec): 8.53 - samples/sec: 1600.96 - lr: 0.000021 - momentum: 0.000000
+2023-10-23 15:49:57,837 epoch 4 - iter 189/275 - loss 0.06854694 - time (sec): 9.93 - samples/sec: 1601.60 - lr: 0.000021 - momentum: 0.000000
+2023-10-23 15:49:59,240 epoch 4 - iter 216/275 - loss 0.07000006 - time (sec): 11.34 - samples/sec: 1590.97 - lr: 0.000021 - momentum: 0.000000
+2023-10-23 15:50:00,651 epoch 4 - iter 243/275 - loss 0.06874057 - time (sec): 12.75 - samples/sec: 1562.29 - lr: 0.000020 - momentum: 0.000000
+2023-10-23 15:50:02,081 epoch 4 - iter 270/275 - loss 0.06913971 - time (sec): 14.18 - samples/sec: 1577.52 - lr: 0.000020 - momentum: 0.000000
+2023-10-23 15:50:02,344 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:02,344 EPOCH 4 done: loss 0.0702 - lr: 0.000020
+2023-10-23 15:50:02,885 DEV : loss 0.1502317488193512 - f1-score (micro avg) 0.8561
+2023-10-23 15:50:02,890 saving best model
+2023-10-23 15:50:03,412 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:04,753 epoch 5 - iter 27/275 - loss 0.05181714 - time (sec): 1.34 - samples/sec: 1740.38 - lr: 0.000020 - momentum: 0.000000
+2023-10-23 15:50:06,005 epoch 5 - iter 54/275 - loss 0.07653701 - time (sec): 2.59 - samples/sec: 1739.60 - lr: 0.000019 - momentum: 0.000000
+2023-10-23 15:50:07,269 epoch 5 - iter 81/275 - loss 0.06745093 - time (sec): 3.86 - samples/sec: 1753.54 - lr: 0.000019 - momentum: 0.000000
+2023-10-23 15:50:08,532 epoch 5 - iter 108/275 - loss 0.05451561 - time (sec): 5.12 - samples/sec: 1742.89 - lr: 0.000019 - momentum: 0.000000
+2023-10-23 15:50:09,799 epoch 5 - iter 135/275 - loss 0.05158637 - time (sec): 6.39 - samples/sec: 1769.32 - lr: 0.000018 - momentum: 0.000000
+2023-10-23 15:50:11,070 epoch 5 - iter 162/275 - loss 0.05025736 - time (sec): 7.66 - samples/sec: 1749.73 - lr: 0.000018 - momentum: 0.000000
+2023-10-23 15:50:12,346 epoch 5 - iter 189/275 - loss 0.04870088 - time (sec): 8.93 - samples/sec: 1752.00 - lr: 0.000018 - momentum: 0.000000
+2023-10-23 15:50:13,649 epoch 5 - iter 216/275 - loss 0.04861125 - time (sec): 10.24 - samples/sec: 1747.96 - lr: 0.000017 - momentum: 0.000000
+2023-10-23 15:50:14,972 epoch 5 - iter 243/275 - loss 0.05333777 - time (sec): 11.56 - samples/sec: 1721.85 - lr: 0.000017 - momentum: 0.000000
+2023-10-23 15:50:16,328 epoch 5 - iter 270/275 - loss 0.05110952 - time (sec): 12.92 - samples/sec: 1723.24 - lr: 0.000017 - momentum: 0.000000
+2023-10-23 15:50:16,568 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:16,568 EPOCH 5 done: loss 0.0516 - lr: 0.000017
+2023-10-23 15:50:17,100 DEV : loss 0.14082755148410797 - f1-score (micro avg) 0.891
+2023-10-23 15:50:17,105 saving best model
+2023-10-23 15:50:17,625 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:19,040 epoch 6 - iter 27/275 - loss 0.00386518 - time (sec): 1.41 - samples/sec: 1784.05 - lr: 0.000016 - momentum: 0.000000
+2023-10-23 15:50:20,440 epoch 6 - iter 54/275 - loss 0.02921163 - time (sec): 2.81 - samples/sec: 1583.87 - lr: 0.000016 - momentum: 0.000000
+2023-10-23 15:50:21,835 epoch 6 - iter 81/275 - loss 0.02939935 - time (sec): 4.21 - samples/sec: 1575.72 - lr: 0.000016 - momentum: 0.000000
+2023-10-23 15:50:23,184 epoch 6 - iter 108/275 - loss 0.02973268 - time (sec): 5.56 - samples/sec: 1623.15 - lr: 0.000015 - momentum: 0.000000
+2023-10-23 15:50:24,544 epoch 6 - iter 135/275 - loss 0.02759517 - time (sec): 6.92 - samples/sec: 1619.60 - lr: 0.000015 - momentum: 0.000000
+2023-10-23 15:50:25,902 epoch 6 - iter 162/275 - loss 0.03106721 - time (sec): 8.27 - samples/sec: 1602.30 - lr: 0.000015 - momentum: 0.000000
+2023-10-23 15:50:27,254 epoch 6 - iter 189/275 - loss 0.03200975 - time (sec): 9.63 - samples/sec: 1600.57 - lr: 0.000014 - momentum: 0.000000
+2023-10-23 15:50:28,594 epoch 6 - iter 216/275 - loss 0.03413447 - time (sec): 10.97 - samples/sec: 1606.80 - lr: 0.000014 - momentum: 0.000000
+2023-10-23 15:50:29,960 epoch 6 - iter 243/275 - loss 0.03567708 - time (sec): 12.33 - samples/sec: 1618.86 - lr: 0.000014 - momentum: 0.000000
+2023-10-23 15:50:31,317 epoch 6 - iter 270/275 - loss 0.03741332 - time (sec): 13.69 - samples/sec: 1631.75 - lr: 0.000013 - momentum: 0.000000
+2023-10-23 15:50:31,570 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:31,571 EPOCH 6 done: loss 0.0385 - lr: 0.000013
+2023-10-23 15:50:32,119 DEV : loss 0.16913354396820068 - f1-score (micro avg) 0.8636
+2023-10-23 15:50:32,125 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:33,534 epoch 7 - iter 27/275 - loss 0.04239827 - time (sec): 1.41 - samples/sec: 1661.75 - lr: 0.000013 - momentum: 0.000000
+2023-10-23 15:50:34,955 epoch 7 - iter 54/275 - loss 0.03679797 - time (sec): 2.83 - samples/sec: 1551.80 - lr: 0.000013 - momentum: 0.000000
+2023-10-23 15:50:36,351 epoch 7 - iter 81/275 - loss 0.03544132 - time (sec): 4.23 - samples/sec: 1533.62 - lr: 0.000012 - momentum: 0.000000
+2023-10-23 15:50:37,751 epoch 7 - iter 108/275 - loss 0.03506543 - time (sec): 5.63 - samples/sec: 1577.68 - lr: 0.000012 - momentum: 0.000000
+2023-10-23 15:50:39,184 epoch 7 - iter 135/275 - loss 0.03111449 - time (sec): 7.06 - samples/sec: 1574.98 - lr: 0.000012 - momentum: 0.000000
+2023-10-23 15:50:40,590 epoch 7 - iter 162/275 - loss 0.02965772 - time (sec): 8.46 - samples/sec: 1558.89 - lr: 0.000011 - momentum: 0.000000
+2023-10-23 15:50:42,017 epoch 7 - iter 189/275 - loss 0.02765179 - time (sec): 9.89 - samples/sec: 1570.22 - lr: 0.000011 - momentum: 0.000000
+2023-10-23 15:50:43,434 epoch 7 - iter 216/275 - loss 0.02591692 - time (sec): 11.31 - samples/sec: 1581.26 - lr: 0.000011 - momentum: 0.000000
+2023-10-23 15:50:44,860 epoch 7 - iter 243/275 - loss 0.02934153 - time (sec): 12.73 - samples/sec: 1586.85 - lr: 0.000010 - momentum: 0.000000
+2023-10-23 15:50:46,288 epoch 7 - iter 270/275 - loss 0.02777065 - time (sec): 14.16 - samples/sec: 1576.34 - lr: 0.000010 - momentum: 0.000000
+2023-10-23 15:50:46,565 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:46,566 EPOCH 7 done: loss 0.0278 - lr: 0.000010
+2023-10-23 15:50:47,104 DEV : loss 0.16255125403404236 - f1-score (micro avg) 0.8921
+2023-10-23 15:50:47,109 saving best model
+2023-10-23 15:50:47,650 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:50:49,094 epoch 8 - iter 27/275 - loss 0.04335307 - time (sec): 1.44 - samples/sec: 1509.48 - lr: 0.000010 - momentum: 0.000000
+2023-10-23 15:50:50,577 epoch 8 - iter 54/275 - loss 0.02995593 - time (sec): 2.92 - samples/sec: 1564.81 - lr: 0.000009 - momentum: 0.000000
+2023-10-23 15:50:52,039 epoch 8 - iter 81/275 - loss 0.03529654 - time (sec): 4.39 - samples/sec: 1527.29 - lr: 0.000009 - momentum: 0.000000
+2023-10-23 15:50:53,509 epoch 8 - iter 108/275 - loss 0.02984522 - time (sec): 5.86 - samples/sec: 1565.65 - lr: 0.000009 - momentum: 0.000000
+2023-10-23 15:50:54,955 epoch 8 - iter 135/275 - loss 0.02572520 - time (sec): 7.30 - samples/sec: 1575.37 - lr: 0.000008 - momentum: 0.000000
+2023-10-23 15:50:56,432 epoch 8 - iter 162/275 - loss 0.02300740 - time (sec): 8.78 - samples/sec: 1591.54 - lr: 0.000008 - momentum: 0.000000
+2023-10-23 15:50:57,902 epoch 8 - iter 189/275 - loss 0.02124124 - time (sec): 10.25 - samples/sec: 1563.76 - lr: 0.000008 - momentum: 0.000000
+2023-10-23 15:50:59,316 epoch 8 - iter 216/275 - loss 0.01934416 - time (sec): 11.66 - samples/sec: 1547.46 - lr: 0.000007 - momentum: 0.000000
+2023-10-23 15:51:00,830 epoch 8 - iter 243/275 - loss 0.02046265 - time (sec): 13.18 - samples/sec: 1530.95 - lr: 0.000007 - momentum: 0.000000
+2023-10-23 15:51:02,342 epoch 8 - iter 270/275 - loss 0.02074013 - time (sec): 14.69 - samples/sec: 1518.47 - lr: 0.000007 - momentum: 0.000000
+2023-10-23 15:51:02,628 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:51:02,628 EPOCH 8 done: loss 0.0226 - lr: 0.000007
+2023-10-23 15:51:03,168 DEV : loss 0.15772368013858795 - f1-score (micro avg) 0.8953
+2023-10-23 15:51:03,174 saving best model
+2023-10-23 15:51:03,723 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:51:05,148 epoch 9 - iter 27/275 - loss 0.01504040 - time (sec): 1.42 - samples/sec: 1600.13 - lr: 0.000006 - momentum: 0.000000
+2023-10-23 15:51:06,569 epoch 9 - iter 54/275 - loss 0.01703108 - time (sec): 2.84 - samples/sec: 1611.83 - lr: 0.000006 - momentum: 0.000000
+2023-10-23 15:51:07,979 epoch 9 - iter 81/275 - loss 0.02222312 - time (sec): 4.25 - samples/sec: 1597.43 - lr: 0.000006 - momentum: 0.000000
+2023-10-23 15:51:09,366 epoch 9 - iter 108/275 - loss 0.02072152 - time (sec): 5.64 - samples/sec: 1592.87 - lr: 0.000005 - momentum: 0.000000
+2023-10-23 15:51:10,792 epoch 9 - iter 135/275 - loss 0.01794282 - time (sec): 7.06 - samples/sec: 1600.44 - lr: 0.000005 - momentum: 0.000000
+2023-10-23 15:51:12,232 epoch 9 - iter 162/275 - loss 0.01549132 - time (sec): 8.50 - samples/sec: 1609.76 - lr: 0.000005 - momentum: 0.000000
+2023-10-23 15:51:13,672 epoch 9 - iter 189/275 - loss 0.01519766 - time (sec): 9.95 - samples/sec: 1589.81 - lr: 0.000004 - momentum: 0.000000
+2023-10-23 15:51:15,129 epoch 9 - iter 216/275 - loss 0.01494770 - time (sec): 11.40 - samples/sec: 1587.79 - lr: 0.000004 - momentum: 0.000000
+2023-10-23 15:51:16,584 epoch 9 - iter 243/275 - loss 0.01368241 - time (sec): 12.86 - samples/sec: 1576.48 - lr: 0.000004 - momentum: 0.000000
+2023-10-23 15:51:18,013 epoch 9 - iter 270/275 - loss 0.01433322 - time (sec): 14.29 - samples/sec: 1572.91 - lr: 0.000003 - momentum: 0.000000
+2023-10-23 15:51:18,281 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:51:18,281 EPOCH 9 done: loss 0.0141 - lr: 0.000003
+2023-10-23 15:51:18,817 DEV : loss 0.16109435260295868 - f1-score (micro avg) 0.8953
+2023-10-23 15:51:18,822 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:51:20,254 epoch 10 - iter 27/275 - loss 0.01426170 - time (sec): 1.43 - samples/sec: 1472.72 - lr: 0.000003 - momentum: 0.000000
+2023-10-23 15:51:21,704 epoch 10 - iter 54/275 - loss 0.00849233 - time (sec): 2.88 - samples/sec: 1510.58 - lr: 0.000003 - momentum: 0.000000
+2023-10-23 15:51:23,155 epoch 10 - iter 81/275 - loss 0.00932013 - time (sec): 4.33 - samples/sec: 1551.58 - lr: 0.000002 - momentum: 0.000000
+2023-10-23 15:51:24,587 epoch 10 - iter 108/275 - loss 0.00701482 - time (sec): 5.76 - samples/sec: 1570.09 - lr: 0.000002 - momentum: 0.000000
+2023-10-23 15:51:26,045 epoch 10 - iter 135/275 - loss 0.00720699 - time (sec): 7.22 - samples/sec: 1540.00 - lr: 0.000002 - momentum: 0.000000
+2023-10-23 15:51:27,472 epoch 10 - iter 162/275 - loss 0.00872732 - time (sec): 8.65 - samples/sec: 1518.11 - lr: 0.000001 - momentum: 0.000000
+2023-10-23 15:51:28,940 epoch 10 - iter 189/275 - loss 0.00824989 - time (sec): 10.12 - samples/sec: 1509.48 - lr: 0.000001 - momentum: 0.000000
+2023-10-23 15:51:30,385 epoch 10 - iter 216/275 - loss 0.01055276 - time (sec): 11.56 - samples/sec: 1525.31 - lr: 0.000001 - momentum: 0.000000
+2023-10-23 15:51:31,861 epoch 10 - iter 243/275 - loss 0.01278568 - time (sec): 13.04 - samples/sec: 1530.53 - lr: 0.000000 - momentum: 0.000000
+2023-10-23 15:51:33,313 epoch 10 - iter 270/275 - loss 0.01209491 - time (sec): 14.49 - samples/sec: 1541.25 - lr: 0.000000 - momentum: 0.000000
+2023-10-23 15:51:33,576 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:51:33,577 EPOCH 10 done: loss 0.0119 - lr: 0.000000
+2023-10-23 15:51:34,109 DEV : loss 0.16141371428966522 - f1-score (micro avg) 0.8994
+2023-10-23 15:51:34,114 saving best model
+2023-10-23 15:51:35,051 ----------------------------------------------------------------------------------------------------
+2023-10-23 15:51:35,052 Loading model from best epoch ...
+2023-10-23 15:51:36,783 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
+2023-10-23 15:51:37,444
+Results:
+- F-score (micro) 0.9041
+- F-score (macro) 0.7376
+- Accuracy 0.8411
+
+By class:
+              precision    recall  f1-score   support
+
+       scope     0.8883    0.9034    0.8958       176
+        pers     0.9685    0.9609    0.9647       128
+        work     0.8451    0.8108    0.8276        74
+      object     1.0000    1.0000    1.0000         2
+         loc     0.0000    0.0000    0.0000         2
+
+   micro avg     0.9077    0.9005    0.9041       382
+   macro avg     0.7404    0.7350    0.7376       382
+weighted avg     0.9027    0.9005    0.9015       382
+
+2023-10-23 15:51:37,444 ----------------------------------------------------------------------------------------------------
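The averaged rows of the "By class" table can be cross-checked against the per-class rows. The sketch below is an illustration only: it recomputes the macro and weighted averages from the rounded per-class values copied from the log, and the micro F1 from the logged micro precision and recall. The names `per_class`, `macro_avg`, and `weighted_avg` are hypothetical helpers, and because the inputs are already rounded to four decimals, agreement is only expected to about that precision.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
per_class = {
    "scope": (0.8883, 0.9034, 0.8958, 176),
    "pers": (0.9685, 0.9609, 0.9647, 128),
    "work": (0.8451, 0.8108, 0.8276, 74),
    "object": (1.0000, 1.0000, 1.0000, 2),
    "loc": (0.0000, 0.0000, 0.0000, 2),
}


def macro_avg(column):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in per_class.values()) / len(per_class)


def weighted_avg(column):
    """Support-weighted mean of one metric column over all classes."""
    total_support = sum(row[3] for row in per_class.values())
    return sum(row[column] * row[3] for row in per_class.values()) / total_support


# Micro F1 is the harmonic mean of the logged micro precision and recall.
micro_precision, micro_recall = 0.9077, 0.9005
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"macro avg    P={macro_avg(0):.4f} R={macro_avg(1):.4f} F1={macro_avg(2):.4f}")
print(f"weighted avg P={weighted_avg(0):.4f} R={weighted_avg(1):.4f} F1={weighted_avg(2):.4f}")
print(f"micro F1     {micro_f1:.4f}")
```

The macro F1 of 0.7376 sits well below the micro F1 of 0.9041 because the `loc` class, with only 2 test entities and an F1 of 0.0000, counts as much as `scope` in the unweighted mean.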