stefan-it committed on
Commit
7579959
1 Parent(s): 3b46e62

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +247 -0
training.log ADDED
@@ -0,0 +1,247 @@
+ 2023-11-16 06:11:33,784 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:11:33,786 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): XLMRobertaModel(
+       (embeddings): XLMRobertaEmbeddings(
+         (word_embeddings): Embedding(250003, 1024)
+         (position_embeddings): Embedding(514, 1024, padding_idx=1)
+         (token_type_embeddings): Embedding(1, 1024)
+         (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): XLMRobertaEncoder(
+         (layer): ModuleList(
+           (0-23): 24 x XLMRobertaLayer(
+             (attention): XLMRobertaAttention(
+               (self): XLMRobertaSelfAttention(
+                 (query): Linear(in_features=1024, out_features=1024, bias=True)
+                 (key): Linear(in_features=1024, out_features=1024, bias=True)
+                 (value): Linear(in_features=1024, out_features=1024, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): XLMRobertaSelfOutput(
+                 (dense): Linear(in_features=1024, out_features=1024, bias=True)
+                 (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): XLMRobertaIntermediate(
+               (dense): Linear(in_features=1024, out_features=4096, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): XLMRobertaOutput(
+               (dense): Linear(in_features=4096, out_features=1024, bias=True)
+               (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): XLMRobertaPooler(
+         (dense): Linear(in_features=1024, out_features=1024, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=1024, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
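
The dump above is a Flair SequenceTagger with a fine-tunable xlm-roberta-large backbone and a plain linear head: no RNN, no CRF (only locked dropout, a 1024-to-13 linear layer, and CrossEntropyLoss). A minimal sketch of how such a model could be built with the public Flair API; the 13-tag dictionary is copied from the tag list printed near the end of this log, while the exact constructor arguments of this run are an assumption:

# Sketch only: reconstructs a model matching the dump above, not the exact training script.
from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# The 13 BIOES tags listed later in this log ("SequenceTagger predicts: ...").
label_dict = Dictionary(add_unk=False)
for tag in ["O",
            "S-LOC", "B-LOC", "E-LOC", "I-LOC",
            "S-ORG", "B-ORG", "E-ORG", "I-ORG",
            "S-PER", "B-PER", "E-PER", "I-PER"]:
    label_dict.add_item(tag)

embeddings = TransformerWordEmbeddings(
    model="xlm-roberta-large",  # 24 layers, hidden size 1024, as in the dump
    fine_tune=True,             # backbone weights are updated during training
)

tagger = SequenceTagger(
    hidden_size=256,             # ignored when use_rnn=False, but required
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_rnn=False,               # dump shows no LSTM between embeddings and head
    use_crf=False,               # dump shows CrossEntropyLoss rather than a CRF
    reproject_embeddings=False,  # linear head maps 1024 -> 13 directly
)
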
+ 2023-11-16 06:11:33,786 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
+ - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
+ - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
+ 2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
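
The corpus combines the English and Georgian splits of the XTREME/WikiANN NER data that Flair caches under ner_multi_xtreme. A hedged sketch of one way to build such a MultiCorpus with Flair's dataset loaders; how the English dev/test sentences were folded away for this run (20000 train + 0 dev + 0 test) is not visible in the log and is left out here:

# Sketch only: assumed corpus setup, using Flair's XTREME NER loader.
from flair.data import MultiCorpus
from flair.datasets import NER_MULTI_XTREME

corpus_en = NER_MULTI_XTREME(languages="en")  # cached under ~/.flair/datasets/ner_multi_xtreme/en
corpus_ka = NER_MULTI_XTREME(languages="ka")  # cached under ~/.flair/datasets/ner_multi_xtreme/ka
multi_corpus = MultiCorpus([corpus_en, corpus_ka])
print(multi_corpus)  # e.g. "MultiCorpus: ... train + ... dev + ... test sentences"
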
+ 2023-11-16 06:11:33,786 Train: 30000 sentences
+ 2023-11-16 06:11:33,786 (train_with_dev=False, train_with_test=False)
+ 2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:11:33,786 Training Params:
+ 2023-11-16 06:11:33,786 - learning_rate: "5e-06"
+ 2023-11-16 06:11:33,786 - mini_batch_size: "4"
+ 2023-11-16 06:11:33,786 - max_epochs: "10"
+ 2023-11-16 06:11:33,786 - shuffle: "True"
+ 2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:11:33,786 Plugins:
+ 2023-11-16 06:11:33,786 - TensorboardLogger
+ 2023-11-16 06:11:33,786 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:11:33,786 Final evaluation on model from best epoch (best-model.pt)
+ 2023-11-16 06:11:33,787 - metric: "('micro avg', 'f1-score')"
+ 2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:11:33,787 Computation:
+ 2023-11-16 06:11:33,787 - compute on device: cuda:0
+ 2023-11-16 06:11:33,787 - embedding storage: none
+ 2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:11:33,787 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-4"
+ 2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
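
These parameters (learning rate 5e-06, mini-batch size 4, 10 epochs, shuffling, a linear schedule with warmup fraction 0.1, TensorBoard logging) line up with Flair's fine-tuning entry point. A sketch of the corresponding trainer call, assuming a Flair version with the 0.13-style plugin API and reusing tagger and multi_corpus from the sketches above:

# Sketch only: assumed invocation reproducing the logged hyperparameters.
from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger

trainer = ModelTrainer(tagger, multi_corpus)
trainer.fine_tune(
    "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-4",  # base path from the log
    learning_rate=5e-06,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    plugins=[TensorboardLogger()],  # scalars only, as the warning below notes
)
# fine_tune() applies a linear LR schedule with warmup by default
# (warmup_fraction=0.1), matching the "LinearScheduler" plugin line above.
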
+ 2023-11-16 06:11:33,787 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-11-16 06:13:08,349 epoch 1 - iter 750/7500 - loss 2.53865900 - time (sec): 94.56 - samples/sec: 253.75 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 06:14:42,568 epoch 1 - iter 1500/7500 - loss 2.13967550 - time (sec): 188.78 - samples/sec: 256.47 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 06:16:16,468 epoch 1 - iter 2250/7500 - loss 1.90406353 - time (sec): 282.68 - samples/sec: 256.63 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 06:17:48,884 epoch 1 - iter 3000/7500 - loss 1.67899229 - time (sec): 375.10 - samples/sec: 256.80 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 06:19:19,876 epoch 1 - iter 3750/7500 - loss 1.48518547 - time (sec): 466.09 - samples/sec: 258.11 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 06:20:52,111 epoch 1 - iter 4500/7500 - loss 1.33429739 - time (sec): 558.32 - samples/sec: 259.20 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 06:22:25,071 epoch 1 - iter 5250/7500 - loss 1.22009996 - time (sec): 651.28 - samples/sec: 258.72 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 06:23:57,896 epoch 1 - iter 6000/7500 - loss 1.13315230 - time (sec): 744.11 - samples/sec: 258.55 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:25:28,930 epoch 1 - iter 6750/7500 - loss 1.06203012 - time (sec): 835.14 - samples/sec: 259.35 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:27:03,002 epoch 1 - iter 7500/7500 - loss 1.00081782 - time (sec): 929.21 - samples/sec: 259.14 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:27:03,005 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:27:03,005 EPOCH 1 done: loss 1.0008 - lr: 0.000005
+ 2023-11-16 06:27:30,760 DEV : loss 0.3205418884754181 - f1-score (micro avg)  0.7957
+ 2023-11-16 06:27:33,296 saving best model
+ 2023-11-16 06:27:35,260 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:29:08,221 epoch 2 - iter 750/7500 - loss 0.40584352 - time (sec): 92.96 - samples/sec: 259.85 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:30:41,767 epoch 2 - iter 1500/7500 - loss 0.41800053 - time (sec): 186.50 - samples/sec: 258.62 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:32:13,981 epoch 2 - iter 2250/7500 - loss 0.40515032 - time (sec): 278.72 - samples/sec: 260.55 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:33:44,785 epoch 2 - iter 3000/7500 - loss 0.40416870 - time (sec): 369.52 - samples/sec: 261.05 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:35:15,515 epoch 2 - iter 3750/7500 - loss 0.40544240 - time (sec): 460.25 - samples/sec: 263.21 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:36:48,219 epoch 2 - iter 4500/7500 - loss 0.40263197 - time (sec): 552.96 - samples/sec: 262.37 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:38:23,766 epoch 2 - iter 5250/7500 - loss 0.39942117 - time (sec): 648.50 - samples/sec: 260.48 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:39:57,620 epoch 2 - iter 6000/7500 - loss 0.40065088 - time (sec): 742.36 - samples/sec: 259.79 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:41:29,077 epoch 2 - iter 6750/7500 - loss 0.39965016 - time (sec): 833.81 - samples/sec: 260.34 - lr: 0.000005 - momentum: 0.000000
+ 2023-11-16 06:43:02,733 epoch 2 - iter 7500/7500 - loss 0.39861413 - time (sec): 927.47 - samples/sec: 259.63 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:43:02,736 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:43:02,736 EPOCH 2 done: loss 0.3986 - lr: 0.000004
+ 2023-11-16 06:43:29,322 DEV : loss 0.2607610523700714 - f1-score (micro avg)  0.8643
+ 2023-11-16 06:43:31,142 saving best model
+ 2023-11-16 06:43:33,553 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:45:08,034 epoch 3 - iter 750/7500 - loss 0.37315879 - time (sec): 94.48 - samples/sec: 253.04 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:46:39,743 epoch 3 - iter 1500/7500 - loss 0.35743568 - time (sec): 186.18 - samples/sec: 256.62 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:48:12,901 epoch 3 - iter 2250/7500 - loss 0.35305153 - time (sec): 279.34 - samples/sec: 259.77 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:49:45,353 epoch 3 - iter 3000/7500 - loss 0.35234824 - time (sec): 371.79 - samples/sec: 259.85 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:51:20,378 epoch 3 - iter 3750/7500 - loss 0.35046792 - time (sec): 466.82 - samples/sec: 258.57 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:52:53,238 epoch 3 - iter 4500/7500 - loss 0.35142197 - time (sec): 559.68 - samples/sec: 259.89 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:54:23,949 epoch 3 - iter 5250/7500 - loss 0.34665555 - time (sec): 650.39 - samples/sec: 260.41 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:55:56,860 epoch 3 - iter 6000/7500 - loss 0.35003084 - time (sec): 743.30 - samples/sec: 259.59 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:57:30,078 epoch 3 - iter 6750/7500 - loss 0.34700719 - time (sec): 836.52 - samples/sec: 259.35 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:59:02,988 epoch 3 - iter 7500/7500 - loss 0.34834444 - time (sec): 929.43 - samples/sec: 259.08 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 06:59:02,990 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 06:59:02,990 EPOCH 3 done: loss 0.3483 - lr: 0.000004
+ 2023-11-16 06:59:30,217 DEV : loss 0.2834814190864563 - f1-score (micro avg)  0.881
+ 2023-11-16 06:59:32,866 saving best model
+ 2023-11-16 06:59:35,803 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 07:01:09,960 epoch 4 - iter 750/7500 - loss 0.29042774 - time (sec): 94.15 - samples/sec: 256.86 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 07:02:45,034 epoch 4 - iter 1500/7500 - loss 0.28875226 - time (sec): 189.23 - samples/sec: 258.73 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 07:04:20,800 epoch 4 - iter 2250/7500 - loss 0.30241778 - time (sec): 284.99 - samples/sec: 255.97 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 07:05:55,249 epoch 4 - iter 3000/7500 - loss 0.30810931 - time (sec): 379.44 - samples/sec: 254.41 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 07:07:28,778 epoch 4 - iter 3750/7500 - loss 0.30459660 - time (sec): 472.97 - samples/sec: 255.40 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 07:08:59,582 epoch 4 - iter 4500/7500 - loss 0.30550384 - time (sec): 563.77 - samples/sec: 257.73 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 07:10:31,165 epoch 4 - iter 5250/7500 - loss 0.30595152 - time (sec): 655.36 - samples/sec: 258.24 - lr: 0.000004 - momentum: 0.000000
+ 2023-11-16 07:12:04,192 epoch 4 - iter 6000/7500 - loss 0.30648476 - time (sec): 748.38 - samples/sec: 258.00 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:13:38,216 epoch 4 - iter 6750/7500 - loss 0.30712803 - time (sec): 842.41 - samples/sec: 257.62 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:15:12,386 epoch 4 - iter 7500/7500 - loss 0.30384345 - time (sec): 936.58 - samples/sec: 257.10 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:15:12,389 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 07:15:12,389 EPOCH 4 done: loss 0.3038 - lr: 0.000003
+ 2023-11-16 07:15:39,642 DEV : loss 0.2750042676925659 - f1-score (micro avg)  0.8871
+ 2023-11-16 07:15:41,637 saving best model
+ 2023-11-16 07:15:44,075 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 07:17:17,606 epoch 5 - iter 750/7500 - loss 0.22837945 - time (sec): 93.53 - samples/sec: 253.79 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:18:50,674 epoch 5 - iter 1500/7500 - loss 0.24801582 - time (sec): 186.59 - samples/sec: 255.33 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:20:23,354 epoch 5 - iter 2250/7500 - loss 0.24364625 - time (sec): 279.27 - samples/sec: 258.70 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:21:53,572 epoch 5 - iter 3000/7500 - loss 0.25086533 - time (sec): 369.49 - samples/sec: 261.28 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:23:27,878 epoch 5 - iter 3750/7500 - loss 0.25125342 - time (sec): 463.80 - samples/sec: 260.45 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:24:59,899 epoch 5 - iter 4500/7500 - loss 0.25211752 - time (sec): 555.82 - samples/sec: 259.74 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:26:31,650 epoch 5 - iter 5250/7500 - loss 0.25096563 - time (sec): 647.57 - samples/sec: 259.86 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:28:06,382 epoch 5 - iter 6000/7500 - loss 0.25437307 - time (sec): 742.30 - samples/sec: 258.90 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:29:38,647 epoch 5 - iter 6750/7500 - loss 0.25716650 - time (sec): 834.57 - samples/sec: 259.17 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:31:13,423 epoch 5 - iter 7500/7500 - loss 0.25526851 - time (sec): 929.34 - samples/sec: 259.10 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:31:13,426 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 07:31:13,427 EPOCH 5 done: loss 0.2553 - lr: 0.000003
+ 2023-11-16 07:31:40,891 DEV : loss 0.2662450671195984 - f1-score (micro avg)  0.8974
+ 2023-11-16 07:31:43,349 saving best model
+ 2023-11-16 07:31:46,083 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 07:33:16,627 epoch 6 - iter 750/7500 - loss 0.19587155 - time (sec): 90.54 - samples/sec: 263.71 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:34:47,310 epoch 6 - iter 1500/7500 - loss 0.20788294 - time (sec): 181.22 - samples/sec: 265.84 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:36:21,324 epoch 6 - iter 2250/7500 - loss 0.20608536 - time (sec): 275.24 - samples/sec: 264.05 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:37:54,486 epoch 6 - iter 3000/7500 - loss 0.21411200 - time (sec): 368.40 - samples/sec: 261.45 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:39:27,386 epoch 6 - iter 3750/7500 - loss 0.21815036 - time (sec): 461.30 - samples/sec: 260.18 - lr: 0.000003 - momentum: 0.000000
+ 2023-11-16 07:41:00,912 epoch 6 - iter 4500/7500 - loss 0.21725635 - time (sec): 554.83 - samples/sec: 260.18 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:42:34,526 epoch 6 - iter 5250/7500 - loss 0.21942273 - time (sec): 648.44 - samples/sec: 259.04 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:44:07,951 epoch 6 - iter 6000/7500 - loss 0.22107059 - time (sec): 741.87 - samples/sec: 258.58 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:45:41,437 epoch 6 - iter 6750/7500 - loss 0.22258724 - time (sec): 835.35 - samples/sec: 258.83 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:47:12,427 epoch 6 - iter 7500/7500 - loss 0.22153847 - time (sec): 926.34 - samples/sec: 259.94 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:47:12,430 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 07:47:12,430 EPOCH 6 done: loss 0.2215 - lr: 0.000002
+ 2023-11-16 07:47:39,790 DEV : loss 0.2961623966693878 - f1-score (micro avg)  0.9003
+ 2023-11-16 07:47:42,072 saving best model
+ 2023-11-16 07:47:44,511 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 07:49:18,880 epoch 7 - iter 750/7500 - loss 0.16803306 - time (sec): 94.36 - samples/sec: 255.20 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:50:54,973 epoch 7 - iter 1500/7500 - loss 0.17324952 - time (sec): 190.46 - samples/sec: 254.75 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:52:32,192 epoch 7 - iter 2250/7500 - loss 0.17809510 - time (sec): 287.68 - samples/sec: 251.96 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:54:06,050 epoch 7 - iter 3000/7500 - loss 0.18157709 - time (sec): 381.53 - samples/sec: 252.63 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:55:39,777 epoch 7 - iter 3750/7500 - loss 0.18010115 - time (sec): 475.26 - samples/sec: 252.98 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:57:12,748 epoch 7 - iter 4500/7500 - loss 0.18253655 - time (sec): 568.23 - samples/sec: 253.84 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 07:58:45,045 epoch 7 - iter 5250/7500 - loss 0.18478993 - time (sec): 660.53 - samples/sec: 254.99 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 08:00:17,924 epoch 7 - iter 6000/7500 - loss 0.18257351 - time (sec): 753.41 - samples/sec: 255.75 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 08:01:49,343 epoch 7 - iter 6750/7500 - loss 0.18422323 - time (sec): 844.83 - samples/sec: 256.40 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 08:03:22,527 epoch 7 - iter 7500/7500 - loss 0.18484974 - time (sec): 938.01 - samples/sec: 256.71 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 08:03:22,531 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:03:22,531 EPOCH 7 done: loss 0.1848 - lr: 0.000002
+ 2023-11-16 08:03:48,970 DEV : loss 0.305960088968277 - f1-score (micro avg)  0.9028
+ 2023-11-16 08:03:51,887 saving best model
+ 2023-11-16 08:03:53,942 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:05:28,806 epoch 8 - iter 750/7500 - loss 0.14648152 - time (sec): 94.86 - samples/sec: 244.77 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 08:07:04,823 epoch 8 - iter 1500/7500 - loss 0.15989226 - time (sec): 190.88 - samples/sec: 250.15 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 08:08:37,702 epoch 8 - iter 2250/7500 - loss 0.16196706 - time (sec): 283.76 - samples/sec: 254.57 - lr: 0.000002 - momentum: 0.000000
+ 2023-11-16 08:10:09,006 epoch 8 - iter 3000/7500 - loss 0.16121972 - time (sec): 375.06 - samples/sec: 257.26 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:11:41,390 epoch 8 - iter 3750/7500 - loss 0.15974733 - time (sec): 467.45 - samples/sec: 257.59 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:13:14,066 epoch 8 - iter 4500/7500 - loss 0.15727904 - time (sec): 560.12 - samples/sec: 258.26 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:14:47,177 epoch 8 - iter 5250/7500 - loss 0.15597106 - time (sec): 653.23 - samples/sec: 257.92 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:16:17,918 epoch 8 - iter 6000/7500 - loss 0.15441827 - time (sec): 743.97 - samples/sec: 258.11 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:17:52,222 epoch 8 - iter 6750/7500 - loss 0.15283100 - time (sec): 838.28 - samples/sec: 258.17 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:19:25,488 epoch 8 - iter 7500/7500 - loss 0.15507668 - time (sec): 931.54 - samples/sec: 258.49 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:19:25,491 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:19:25,491 EPOCH 8 done: loss 0.1551 - lr: 0.000001
+ 2023-11-16 08:19:53,344 DEV : loss 0.3231204152107239 - f1-score (micro avg)  0.9014
+ 2023-11-16 08:19:55,450 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:21:28,400 epoch 9 - iter 750/7500 - loss 0.12523890 - time (sec): 92.95 - samples/sec: 258.76 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:23:00,060 epoch 9 - iter 1500/7500 - loss 0.12801485 - time (sec): 184.61 - samples/sec: 263.03 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:24:36,577 epoch 9 - iter 2250/7500 - loss 0.13158450 - time (sec): 281.12 - samples/sec: 255.88 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:26:08,372 epoch 9 - iter 3000/7500 - loss 0.12955430 - time (sec): 372.92 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:27:42,811 epoch 9 - iter 3750/7500 - loss 0.13110177 - time (sec): 467.36 - samples/sec: 256.70 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:29:15,844 epoch 9 - iter 4500/7500 - loss 0.13696235 - time (sec): 560.39 - samples/sec: 256.51 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:30:48,381 epoch 9 - iter 5250/7500 - loss 0.13444283 - time (sec): 652.93 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:32:21,725 epoch 9 - iter 6000/7500 - loss 0.13580845 - time (sec): 746.27 - samples/sec: 258.30 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:33:54,750 epoch 9 - iter 6750/7500 - loss 0.13419816 - time (sec): 839.30 - samples/sec: 258.02 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:35:28,480 epoch 9 - iter 7500/7500 - loss 0.13459907 - time (sec): 933.03 - samples/sec: 258.08 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:35:28,482 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:35:28,483 EPOCH 9 done: loss 0.1346 - lr: 0.000001
+ 2023-11-16 08:35:55,875 DEV : loss 0.3105945885181427 - f1-score (micro avg)  0.9036
+ 2023-11-16 08:35:58,080 saving best model
+ 2023-11-16 08:36:01,035 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:37:36,044 epoch 10 - iter 750/7500 - loss 0.10551800 - time (sec): 95.01 - samples/sec: 248.68 - lr: 0.000001 - momentum: 0.000000
+ 2023-11-16 08:39:09,059 epoch 10 - iter 1500/7500 - loss 0.11970928 - time (sec): 188.02 - samples/sec: 251.27 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:40:42,311 epoch 10 - iter 2250/7500 - loss 0.12199666 - time (sec): 281.27 - samples/sec: 256.11 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:42:13,894 epoch 10 - iter 3000/7500 - loss 0.12112190 - time (sec): 372.86 - samples/sec: 257.41 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:43:44,971 epoch 10 - iter 3750/7500 - loss 0.12198423 - time (sec): 463.93 - samples/sec: 259.71 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:45:19,480 epoch 10 - iter 4500/7500 - loss 0.11644070 - time (sec): 558.44 - samples/sec: 259.28 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:46:52,528 epoch 10 - iter 5250/7500 - loss 0.12094725 - time (sec): 651.49 - samples/sec: 259.32 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:48:29,025 epoch 10 - iter 6000/7500 - loss 0.11921992 - time (sec): 747.99 - samples/sec: 257.59 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:50:06,215 epoch 10 - iter 6750/7500 - loss 0.11723856 - time (sec): 845.18 - samples/sec: 256.47 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:51:43,737 epoch 10 - iter 7500/7500 - loss 0.11691516 - time (sec): 942.70 - samples/sec: 255.43 - lr: 0.000000 - momentum: 0.000000
+ 2023-11-16 08:51:43,740 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:51:43,740 EPOCH 10 done: loss 0.1169 - lr: 0.000000
+ 2023-11-16 08:52:11,462 DEV : loss 0.3263167440891266 - f1-score (micro avg)  0.905
+ 2023-11-16 08:52:14,084 saving best model
+ 2023-11-16 08:52:19,334 ----------------------------------------------------------------------------------------------------
+ 2023-11-16 08:52:19,337 Loading model from best epoch ...
+ 2023-11-16 08:52:29,363 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
+ 2023-11-16 08:52:58,009
+ Results:
+ - F-score (micro) 0.9038
+ - F-score (macro) 0.9028
+ - Accuracy 0.8536
+
+ By class:
+                precision    recall  f1-score   support
+
+          LOC     0.9015    0.9153    0.9083      5288
+          PER     0.9219    0.9417    0.9317      3962
+          ORG     0.8674    0.8692    0.8683      3807
+
+    micro avg     0.8979    0.9099    0.9038     13057
+    macro avg     0.8969    0.9087    0.9028     13057
+ weighted avg     0.8977    0.9099    0.9037     13057
+
+ 2023-11-16 08:52:58,009 ----------------------------------------------------------------------------------------------------
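
Once training finishes, the best checkpoint saved above (best-model.pt under the training base path) can be used for tagging with the standard Flair inference API. A short usage sketch; the Georgian example sentence is purely illustrative:

# Sketch only: load the saved best model and tag a sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-4/best-model.pt"
)

# "Ilia Chavchavadze was born in Kvareli."
sentence = Sentence("ილია ჭავჭავაძე დაიბადა ყვარელში.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span)  # expected: PER and LOC spans with confidence scores
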