stefan-it committed
Commit: ec91270
1 Parent(s): 19ebe8d

Upload ./training.log with huggingface_hub

Files changed (1):
  1. training.log +247 -0
training.log ADDED
@@ -0,0 +1,247 @@
2023-10-23 15:49:02,712 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,713 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-23 15:49:02,713 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,713 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-23 15:49:02,713 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 Train:  1100 sentences
2023-10-23 15:49:02,714         (train_with_dev=False, train_with_test=False)
2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 Training Params:
2023-10-23 15:49:02,714  - learning_rate: "3e-05"
2023-10-23 15:49:02,714  - mini_batch_size: "4"
2023-10-23 15:49:02,714  - max_epochs: "10"
2023-10-23 15:49:02,714  - shuffle: "True"
2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 Plugins:
2023-10-23 15:49:02,714  - TensorboardLogger
2023-10-23 15:49:02,714  - LinearScheduler | warmup_fraction: '0.1'
2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 Final evaluation on model from best epoch (best-model.pt)
2023-10-23 15:49:02,714  - metric: "('micro avg', 'f1-score')"
2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 Computation:
2023-10-23 15:49:02,714  - compute on device: cuda:0
2023-10-23 15:49:02,714  - embedding storage: none
2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:02,714 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-23 15:49:04,134 epoch 1 - iter 27/275 - loss 2.83941383 - time (sec): 1.42 - samples/sec: 1260.09 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:49:05,543 epoch 1 - iter 54/275 - loss 2.14658630 - time (sec): 2.83 - samples/sec: 1439.75 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:49:06,964 epoch 1 - iter 81/275 - loss 1.72389593 - time (sec): 4.25 - samples/sec: 1523.60 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:49:08,373 epoch 1 - iter 108/275 - loss 1.46781181 - time (sec): 5.66 - samples/sec: 1569.08 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:49:09,768 epoch 1 - iter 135/275 - loss 1.28242063 - time (sec): 7.05 - samples/sec: 1581.18 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:49:11,165 epoch 1 - iter 162/275 - loss 1.12735272 - time (sec): 8.45 - samples/sec: 1591.46 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:49:12,563 epoch 1 - iter 189/275 - loss 1.02268702 - time (sec): 9.85 - samples/sec: 1582.62 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:49:13,960 epoch 1 - iter 216/275 - loss 0.91909206 - time (sec): 11.24 - samples/sec: 1607.03 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:49:15,355 epoch 1 - iter 243/275 - loss 0.84777616 - time (sec): 12.64 - samples/sec: 1606.88 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:49:16,737 epoch 1 - iter 270/275 - loss 0.78992865 - time (sec): 14.02 - samples/sec: 1597.84 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:49:16,993 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:16,994 EPOCH 1 done: loss 0.7815 - lr: 0.000029
2023-10-23 15:49:17,408 DEV : loss 0.17806066572666168 - f1-score (micro avg)  0.7256
2023-10-23 15:49:17,414 saving best model
2023-10-23 15:49:17,807 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:19,194 epoch 2 - iter 27/275 - loss 0.22098647 - time (sec): 1.39 - samples/sec: 1834.03 - lr: 0.000030 - momentum: 0.000000
2023-10-23 15:49:20,589 epoch 2 - iter 54/275 - loss 0.22583475 - time (sec): 2.78 - samples/sec: 1648.14 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:49:21,977 epoch 2 - iter 81/275 - loss 0.19163146 - time (sec): 4.17 - samples/sec: 1597.05 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:49:23,396 epoch 2 - iter 108/275 - loss 0.19026888 - time (sec): 5.59 - samples/sec: 1540.17 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:49:24,807 epoch 2 - iter 135/275 - loss 0.17483222 - time (sec): 7.00 - samples/sec: 1541.10 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:49:26,223 epoch 2 - iter 162/275 - loss 0.17437410 - time (sec): 8.41 - samples/sec: 1531.54 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:49:27,666 epoch 2 - iter 189/275 - loss 0.16454035 - time (sec): 9.86 - samples/sec: 1533.95 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:49:29,094 epoch 2 - iter 216/275 - loss 0.17001907 - time (sec): 11.29 - samples/sec: 1547.08 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:49:30,501 epoch 2 - iter 243/275 - loss 0.16952883 - time (sec): 12.69 - samples/sec: 1556.75 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:49:31,915 epoch 2 - iter 270/275 - loss 0.16760966 - time (sec): 14.11 - samples/sec: 1582.65 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:49:32,173 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:32,174 EPOCH 2 done: loss 0.1669 - lr: 0.000027
2023-10-23 15:49:32,710 DEV : loss 0.1282263696193695 - f1-score (micro avg)  0.823
2023-10-23 15:49:32,715 saving best model
2023-10-23 15:49:33,267 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:34,603 epoch 3 - iter 27/275 - loss 0.11417363 - time (sec): 1.33 - samples/sec: 1752.69 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:49:35,971 epoch 3 - iter 54/275 - loss 0.10292930 - time (sec): 2.70 - samples/sec: 1682.84 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:49:37,343 epoch 3 - iter 81/275 - loss 0.11259165 - time (sec): 4.07 - samples/sec: 1658.61 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:49:38,693 epoch 3 - iter 108/275 - loss 0.11306199 - time (sec): 5.42 - samples/sec: 1673.09 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:49:39,998 epoch 3 - iter 135/275 - loss 0.10315859 - time (sec): 6.73 - samples/sec: 1691.92 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:49:41,256 epoch 3 - iter 162/275 - loss 0.09810185 - time (sec): 7.98 - samples/sec: 1703.55 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:49:42,535 epoch 3 - iter 189/275 - loss 0.09949418 - time (sec): 9.26 - samples/sec: 1708.34 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:49:43,884 epoch 3 - iter 216/275 - loss 0.09940572 - time (sec): 10.61 - samples/sec: 1692.65 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:49:45,221 epoch 3 - iter 243/275 - loss 0.09918018 - time (sec): 11.95 - samples/sec: 1700.14 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:49:46,548 epoch 3 - iter 270/275 - loss 0.09785407 - time (sec): 13.28 - samples/sec: 1677.37 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:49:46,810 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:46,810 EPOCH 3 done: loss 0.0986 - lr: 0.000023
2023-10-23 15:49:47,349 DEV : loss 0.1556384116411209 - f1-score (micro avg)  0.8397
2023-10-23 15:49:47,354 saving best model
2023-10-23 15:49:47,902 ----------------------------------------------------------------------------------------------------
2023-10-23 15:49:49,268 epoch 4 - iter 27/275 - loss 0.10635465 - time (sec): 1.36 - samples/sec: 1699.34 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:49:50,661 epoch 4 - iter 54/275 - loss 0.07787905 - time (sec): 2.76 - samples/sec: 1710.12 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:49:52,050 epoch 4 - iter 81/275 - loss 0.07086144 - time (sec): 4.15 - samples/sec: 1619.19 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:49:53,470 epoch 4 - iter 108/275 - loss 0.06660226 - time (sec): 5.57 - samples/sec: 1579.83 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:49:55,026 epoch 4 - iter 135/275 - loss 0.06133614 - time (sec): 7.12 - samples/sec: 1579.05 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:49:56,436 epoch 4 - iter 162/275 - loss 0.07371892 - time (sec): 8.53 - samples/sec: 1600.96 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:49:57,837 epoch 4 - iter 189/275 - loss 0.06854694 - time (sec): 9.93 - samples/sec: 1601.60 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:49:59,240 epoch 4 - iter 216/275 - loss 0.07000006 - time (sec): 11.34 - samples/sec: 1590.97 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:50:00,651 epoch 4 - iter 243/275 - loss 0.06874057 - time (sec): 12.75 - samples/sec: 1562.29 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:50:02,081 epoch 4 - iter 270/275 - loss 0.06913971 - time (sec): 14.18 - samples/sec: 1577.52 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:50:02,344 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:02,344 EPOCH 4 done: loss 0.0702 - lr: 0.000020
2023-10-23 15:50:02,885 DEV : loss 0.1502317488193512 - f1-score (micro avg)  0.8561
2023-10-23 15:50:02,890 saving best model
2023-10-23 15:50:03,412 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:04,753 epoch 5 - iter 27/275 - loss 0.05181714 - time (sec): 1.34 - samples/sec: 1740.38 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:50:06,005 epoch 5 - iter 54/275 - loss 0.07653701 - time (sec): 2.59 - samples/sec: 1739.60 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:50:07,269 epoch 5 - iter 81/275 - loss 0.06745093 - time (sec): 3.86 - samples/sec: 1753.54 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:50:08,532 epoch 5 - iter 108/275 - loss 0.05451561 - time (sec): 5.12 - samples/sec: 1742.89 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:50:09,799 epoch 5 - iter 135/275 - loss 0.05158637 - time (sec): 6.39 - samples/sec: 1769.32 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:50:11,070 epoch 5 - iter 162/275 - loss 0.05025736 - time (sec): 7.66 - samples/sec: 1749.73 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:50:12,346 epoch 5 - iter 189/275 - loss 0.04870088 - time (sec): 8.93 - samples/sec: 1752.00 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:50:13,649 epoch 5 - iter 216/275 - loss 0.04861125 - time (sec): 10.24 - samples/sec: 1747.96 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:50:14,972 epoch 5 - iter 243/275 - loss 0.05333777 - time (sec): 11.56 - samples/sec: 1721.85 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:50:16,328 epoch 5 - iter 270/275 - loss 0.05110952 - time (sec): 12.92 - samples/sec: 1723.24 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:50:16,568 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:16,568 EPOCH 5 done: loss 0.0516 - lr: 0.000017
2023-10-23 15:50:17,100 DEV : loss 0.14082755148410797 - f1-score (micro avg)  0.891
2023-10-23 15:50:17,105 saving best model
2023-10-23 15:50:17,625 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:19,040 epoch 6 - iter 27/275 - loss 0.00386518 - time (sec): 1.41 - samples/sec: 1784.05 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:50:20,440 epoch 6 - iter 54/275 - loss 0.02921163 - time (sec): 2.81 - samples/sec: 1583.87 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:50:21,835 epoch 6 - iter 81/275 - loss 0.02939935 - time (sec): 4.21 - samples/sec: 1575.72 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:50:23,184 epoch 6 - iter 108/275 - loss 0.02973268 - time (sec): 5.56 - samples/sec: 1623.15 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:50:24,544 epoch 6 - iter 135/275 - loss 0.02759517 - time (sec): 6.92 - samples/sec: 1619.60 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:50:25,902 epoch 6 - iter 162/275 - loss 0.03106721 - time (sec): 8.27 - samples/sec: 1602.30 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:50:27,254 epoch 6 - iter 189/275 - loss 0.03200975 - time (sec): 9.63 - samples/sec: 1600.57 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:50:28,594 epoch 6 - iter 216/275 - loss 0.03413447 - time (sec): 10.97 - samples/sec: 1606.80 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:50:29,960 epoch 6 - iter 243/275 - loss 0.03567708 - time (sec): 12.33 - samples/sec: 1618.86 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:50:31,317 epoch 6 - iter 270/275 - loss 0.03741332 - time (sec): 13.69 - samples/sec: 1631.75 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:50:31,570 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:31,571 EPOCH 6 done: loss 0.0385 - lr: 0.000013
2023-10-23 15:50:32,119 DEV : loss 0.16913354396820068 - f1-score (micro avg)  0.8636
2023-10-23 15:50:32,125 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:33,534 epoch 7 - iter 27/275 - loss 0.04239827 - time (sec): 1.41 - samples/sec: 1661.75 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:50:34,955 epoch 7 - iter 54/275 - loss 0.03679797 - time (sec): 2.83 - samples/sec: 1551.80 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:50:36,351 epoch 7 - iter 81/275 - loss 0.03544132 - time (sec): 4.23 - samples/sec: 1533.62 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:50:37,751 epoch 7 - iter 108/275 - loss 0.03506543 - time (sec): 5.63 - samples/sec: 1577.68 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:50:39,184 epoch 7 - iter 135/275 - loss 0.03111449 - time (sec): 7.06 - samples/sec: 1574.98 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:50:40,590 epoch 7 - iter 162/275 - loss 0.02965772 - time (sec): 8.46 - samples/sec: 1558.89 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:50:42,017 epoch 7 - iter 189/275 - loss 0.02765179 - time (sec): 9.89 - samples/sec: 1570.22 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:50:43,434 epoch 7 - iter 216/275 - loss 0.02591692 - time (sec): 11.31 - samples/sec: 1581.26 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:50:44,860 epoch 7 - iter 243/275 - loss 0.02934153 - time (sec): 12.73 - samples/sec: 1586.85 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:50:46,288 epoch 7 - iter 270/275 - loss 0.02777065 - time (sec): 14.16 - samples/sec: 1576.34 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:50:46,565 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:46,566 EPOCH 7 done: loss 0.0278 - lr: 0.000010
2023-10-23 15:50:47,104 DEV : loss 0.16255125403404236 - f1-score (micro avg)  0.8921
2023-10-23 15:50:47,109 saving best model
2023-10-23 15:50:47,650 ----------------------------------------------------------------------------------------------------
2023-10-23 15:50:49,094 epoch 8 - iter 27/275 - loss 0.04335307 - time (sec): 1.44 - samples/sec: 1509.48 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:50:50,577 epoch 8 - iter 54/275 - loss 0.02995593 - time (sec): 2.92 - samples/sec: 1564.81 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:50:52,039 epoch 8 - iter 81/275 - loss 0.03529654 - time (sec): 4.39 - samples/sec: 1527.29 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:50:53,509 epoch 8 - iter 108/275 - loss 0.02984522 - time (sec): 5.86 - samples/sec: 1565.65 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:50:54,955 epoch 8 - iter 135/275 - loss 0.02572520 - time (sec): 7.30 - samples/sec: 1575.37 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:50:56,432 epoch 8 - iter 162/275 - loss 0.02300740 - time (sec): 8.78 - samples/sec: 1591.54 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:50:57,902 epoch 8 - iter 189/275 - loss 0.02124124 - time (sec): 10.25 - samples/sec: 1563.76 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:50:59,316 epoch 8 - iter 216/275 - loss 0.01934416 - time (sec): 11.66 - samples/sec: 1547.46 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:51:00,830 epoch 8 - iter 243/275 - loss 0.02046265 - time (sec): 13.18 - samples/sec: 1530.95 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:51:02,342 epoch 8 - iter 270/275 - loss 0.02074013 - time (sec): 14.69 - samples/sec: 1518.47 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:51:02,628 ----------------------------------------------------------------------------------------------------
2023-10-23 15:51:02,628 EPOCH 8 done: loss 0.0226 - lr: 0.000007
2023-10-23 15:51:03,168 DEV : loss 0.15772368013858795 - f1-score (micro avg)  0.8953
2023-10-23 15:51:03,174 saving best model
2023-10-23 15:51:03,723 ----------------------------------------------------------------------------------------------------
2023-10-23 15:51:05,148 epoch 9 - iter 27/275 - loss 0.01504040 - time (sec): 1.42 - samples/sec: 1600.13 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:51:06,569 epoch 9 - iter 54/275 - loss 0.01703108 - time (sec): 2.84 - samples/sec: 1611.83 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:51:07,979 epoch 9 - iter 81/275 - loss 0.02222312 - time (sec): 4.25 - samples/sec: 1597.43 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:51:09,366 epoch 9 - iter 108/275 - loss 0.02072152 - time (sec): 5.64 - samples/sec: 1592.87 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:51:10,792 epoch 9 - iter 135/275 - loss 0.01794282 - time (sec): 7.06 - samples/sec: 1600.44 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:51:12,232 epoch 9 - iter 162/275 - loss 0.01549132 - time (sec): 8.50 - samples/sec: 1609.76 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:51:13,672 epoch 9 - iter 189/275 - loss 0.01519766 - time (sec): 9.95 - samples/sec: 1589.81 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:51:15,129 epoch 9 - iter 216/275 - loss 0.01494770 - time (sec): 11.40 - samples/sec: 1587.79 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:51:16,584 epoch 9 - iter 243/275 - loss 0.01368241 - time (sec): 12.86 - samples/sec: 1576.48 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:51:18,013 epoch 9 - iter 270/275 - loss 0.01433322 - time (sec): 14.29 - samples/sec: 1572.91 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:51:18,281 ----------------------------------------------------------------------------------------------------
2023-10-23 15:51:18,281 EPOCH 9 done: loss 0.0141 - lr: 0.000003
2023-10-23 15:51:18,817 DEV : loss 0.16109435260295868 - f1-score (micro avg)  0.8953
2023-10-23 15:51:18,822 ----------------------------------------------------------------------------------------------------
2023-10-23 15:51:20,254 epoch 10 - iter 27/275 - loss 0.01426170 - time (sec): 1.43 - samples/sec: 1472.72 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:51:21,704 epoch 10 - iter 54/275 - loss 0.00849233 - time (sec): 2.88 - samples/sec: 1510.58 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:51:23,155 epoch 10 - iter 81/275 - loss 0.00932013 - time (sec): 4.33 - samples/sec: 1551.58 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:51:24,587 epoch 10 - iter 108/275 - loss 0.00701482 - time (sec): 5.76 - samples/sec: 1570.09 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:51:26,045 epoch 10 - iter 135/275 - loss 0.00720699 - time (sec): 7.22 - samples/sec: 1540.00 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:51:27,472 epoch 10 - iter 162/275 - loss 0.00872732 - time (sec): 8.65 - samples/sec: 1518.11 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:51:28,940 epoch 10 - iter 189/275 - loss 0.00824989 - time (sec): 10.12 - samples/sec: 1509.48 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:51:30,385 epoch 10 - iter 216/275 - loss 0.01055276 - time (sec): 11.56 - samples/sec: 1525.31 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:51:31,861 epoch 10 - iter 243/275 - loss 0.01278568 - time (sec): 13.04 - samples/sec: 1530.53 - lr: 0.000000 - momentum: 0.000000
2023-10-23 15:51:33,313 epoch 10 - iter 270/275 - loss 0.01209491 - time (sec): 14.49 - samples/sec: 1541.25 - lr: 0.000000 - momentum: 0.000000
2023-10-23 15:51:33,576 ----------------------------------------------------------------------------------------------------
2023-10-23 15:51:33,577 EPOCH 10 done: loss 0.0119 - lr: 0.000000
2023-10-23 15:51:34,109 DEV : loss 0.16141371428966522 - f1-score (micro avg)  0.8994
2023-10-23 15:51:34,114 saving best model
2023-10-23 15:51:35,051 ----------------------------------------------------------------------------------------------------
2023-10-23 15:51:35,052 Loading model from best epoch ...
2023-10-23 15:51:36,783 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-23 15:51:37,444
Results:
- F-score (micro) 0.9041
- F-score (macro) 0.7376
- Accuracy 0.8411

By class:
              precision    recall  f1-score   support

       scope     0.8883    0.9034    0.8958       176
        pers     0.9685    0.9609    0.9647       128
        work     0.8451    0.8108    0.8276        74
      object     1.0000    1.0000    1.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.9077    0.9005    0.9041       382
   macro avg     0.7404    0.7350    0.7376       382
weighted avg     0.9027    0.9005    0.9015       382

2023-10-23 15:51:37,444 ----------------------------------------------------------------------------------------------------
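A note on reading this log: each epoch ends with a `DEV : loss … - f1-score (micro avg) …` line, and the best dev micro-F1 decides which checkpoint becomes `best-model.pt`. If you want to pull those per-epoch dev scores out of a log like this programmatically, a minimal sketch (the `dev_scores` helper is our own illustration, not part of Flair):

```python
import re

# Matches Flair dev-evaluation lines of the form seen above, e.g.
# "2023-10-23 15:49:17,408 DEV : loss 0.178... - f1-score (micro avg)  0.7256"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\)\s+(?P<f1>[\d.]+)"
)

def dev_scores(log_text: str) -> list:
    """Return (dev_loss, dev_micro_f1) pairs in epoch order."""
    return [
        (float(m.group("loss")), float(m.group("f1")))
        for m in DEV_LINE.finditer(log_text)
    ]

# Two lines taken verbatim from the log above:
sample = (
    "2023-10-23 15:49:17,408 DEV : loss 0.17806066572666168 - f1-score (micro avg)  0.7256\n"
    "2023-10-23 15:49:32,710 DEV : loss 0.1282263696193695 - f1-score (micro avg)  0.823\n"
)
print(dev_scores(sample))
# [(0.17806066572666168, 0.7256), (0.1282263696193695, 0.823)]
```

Running this over the full log yields ten pairs, with the best dev F1 (0.8994) at epoch 10, matching the final "saving best model" entry.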