stefan-it committed on
Commit 8d6afac
1 Parent(s): 84c74f2

Upload folder using huggingface_hub
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c484c00140ec2bf40498189122b25f3dc729d8c9ef81a0dfbe7a5e539f1f19d3
+ size 440966725
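The three added lines are a Git LFS pointer: the actual 440 MB checkpoint lives in LFS storage, and the repo only tracks its `sha256` digest and size. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any library.

```python
# Sketch: parse a Git LFS pointer file (key/value lines per the
# git-lfs.github.com/spec/v1 format) into a dict. `parse_lfs_pointer`
# is an illustrative helper, not an official API.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content shown in the diff above:
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:c484c00140ec2bf40498189122b25f3dc729d8c9ef81a0dfbe7a5e539f1f19d3
size 440966725"""

info = parse_lfs_pointer(pointer)
print(info["oid"])        # sha256:c484...
print(int(info["size"]))  # 440966725
```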
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 20:07:17 0.0000 0.5825 0.1282 0.6854 0.7612 0.7213 0.6041
+ 2 20:08:43 0.0000 0.1302 0.1191 0.8099 0.8225 0.8161 0.7105
+ 3 20:10:08 0.0000 0.0838 0.1379 0.8026 0.8431 0.8223 0.7219
+ 4 20:11:33 0.0000 0.0559 0.1617 0.8275 0.8654 0.8460 0.7532
+ 5 20:12:57 0.0000 0.0377 0.1790 0.8331 0.8522 0.8426 0.7523
+ 6 20:14:21 0.0000 0.0256 0.1823 0.8361 0.8471 0.8415 0.7519
+ 7 20:15:45 0.0000 0.0180 0.1940 0.8337 0.8528 0.8431 0.7593
+ 8 20:17:09 0.0000 0.0141 0.2001 0.8375 0.8620 0.8496 0.7636
+ 9 20:18:34 0.0000 0.0078 0.2041 0.8567 0.8625 0.8596 0.7767
+ 10 20:19:59 0.0000 0.0057 0.2036 0.8534 0.8671 0.8602 0.7772
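The table above tracks dev metrics per epoch; the best dev F1 (0.8602) lands in epoch 10. A small sketch of selecting the best epoch from such a file, assuming it is tab-separated on disk (the diff view collapses the tabs to spaces); a few rows from the table are inlined for illustration.

```python
import csv
import io

# Sketch: pick the epoch with the best DEV_F1 from a loss.tsv-style
# table. Assumes tab-separated columns; three rows from the table
# above are inlined so the example is self-contained.
loss_tsv = (
    "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\t"
    "DEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY\n"
    "1\t20:07:17\t0.0000\t0.5825\t0.1282\t0.6854\t0.7612\t0.7213\t0.6041\n"
    "9\t20:18:34\t0.0000\t0.0078\t0.2041\t0.8567\t0.8625\t0.8596\t0.7767\n"
    "10\t20:19:59\t0.0000\t0.0057\t0.2036\t0.8534\t0.8671\t0.8602\t0.7772\n"
)

rows = list(csv.DictReader(io.StringIO(loss_tsv), delimiter="\t"))
best = max(rows, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # 10 0.8602
```

Note that dev loss keeps rising after epoch 2 while dev F1 improves; model selection here goes by F1, not loss.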
runs/events.out.tfevents.1697573158.bce904bcef33.2482.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:75891cb61b7645d5b64aa9b2fe835ecae4a71d485ee79b33b98f2b3bedb4c59c
+ size 825716
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,242 @@
+ 2023-10-17 20:05:58,315 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,316 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): ElectraModel(
+ (embeddings): ElectraEmbeddings(
+ (word_embeddings): Embedding(32001, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): ElectraEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x ElectraLayer(
+ (attention): ElectraAttention(
+ (self): ElectraSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): ElectraSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): ElectraIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): ElectraOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=21, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
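The final `Linear(in_features=768, out_features=21)` head maps each 768-dim ELECTRA embedding to 21 tag logits. That 21 is consistent with a BIOES scheme over the five entity types listed in the tag dictionary printed at the end of training (4 positional tags per type, plus `O`). A quick arithmetic check, with the tag strings generated for illustration:

```python
# Sketch: the 21-way output of the linear head equals
# 4 BIOES positional tags x 5 entity types + 1 "O" tag.
entity_types = ["loc", "pers", "org", "prod", "time"]
positional = ["B", "I", "E", "S"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in positional]
print(len(tags))  # 21
```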
+ 2023-10-17 20:05:58,316 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,316 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+ - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 Train: 5901 sentences
+ 2023-10-17 20:05:58,317 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 Training Params:
+ 2023-10-17 20:05:58,317 - learning_rate: "3e-05"
+ 2023-10-17 20:05:58,317 - mini_batch_size: "4"
+ 2023-10-17 20:05:58,317 - max_epochs: "10"
+ 2023-10-17 20:05:58,317 - shuffle: "True"
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 Plugins:
+ 2023-10-17 20:05:58,317 - TensorboardLogger
+ 2023-10-17 20:05:58,317 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 20:05:58,317 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 Computation:
+ 2023-10-17 20:05:58,317 - compute on device: cuda:0
+ 2023-10-17 20:05:58,317 - embedding storage: none
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:58,317 Logging anything other than scalars to TensorBoard is currently not supported.
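The `LinearScheduler | warmup_fraction: '0.1'` plugin explains the lr column in the iteration logs below: it ramps linearly from 0 to the peak 3e-05 over the first 10% of steps (exactly the first epoch here, since there are 1476 batches per epoch over 10 epochs), then decays linearly to 0. A sketch of that schedule; `linear_schedule_lr` is illustrative, not Flair's implementation, and the formula is inferred from the logged lr values.

```python
# Sketch of a linear warmup + linear decay LR schedule, as suggested by
# the lr values in the training log (rises to 3e-05 during epoch 1,
# then decays to 0 by the last step). Illustrative only.
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # warmup ramp
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1476 * 10  # 1476 mini-batches/epoch x 10 epochs
print(linear_schedule_lr(1476, total))   # peak 3e-05 right after epoch 1
print(linear_schedule_lr(total, total))  # 0.0 at the final step
```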
+ 2023-10-17 20:06:05,703 epoch 1 - iter 147/1476 - loss 2.88376740 - time (sec): 7.38 - samples/sec: 2399.29 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:06:12,547 epoch 1 - iter 294/1476 - loss 1.83405663 - time (sec): 14.23 - samples/sec: 2329.66 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:06:20,094 epoch 1 - iter 441/1476 - loss 1.34685714 - time (sec): 21.78 - samples/sec: 2364.86 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:06:27,578 epoch 1 - iter 588/1476 - loss 1.08468691 - time (sec): 29.26 - samples/sec: 2373.41 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:06:34,521 epoch 1 - iter 735/1476 - loss 0.93275820 - time (sec): 36.20 - samples/sec: 2362.49 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:06:41,355 epoch 1 - iter 882/1476 - loss 0.83450367 - time (sec): 43.04 - samples/sec: 2330.69 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:06:48,377 epoch 1 - iter 1029/1476 - loss 0.75458015 - time (sec): 50.06 - samples/sec: 2317.34 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:06:55,693 epoch 1 - iter 1176/1476 - loss 0.68725226 - time (sec): 57.37 - samples/sec: 2303.13 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 20:07:03,442 epoch 1 - iter 1323/1476 - loss 0.63285062 - time (sec): 65.12 - samples/sec: 2281.72 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:07:10,621 epoch 1 - iter 1470/1476 - loss 0.58387807 - time (sec): 72.30 - samples/sec: 2294.27 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 20:07:10,883 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:07:10,883 EPOCH 1 done: loss 0.5825 - lr: 0.000030
+ 2023-10-17 20:07:17,234 DEV : loss 0.12818463146686554 - f1-score (micro avg)  0.7213
+ 2023-10-17 20:07:17,263 saving best model
+ 2023-10-17 20:07:17,633 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:07:24,643 epoch 2 - iter 147/1476 - loss 0.13820773 - time (sec): 7.01 - samples/sec: 2385.67 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 20:07:32,026 epoch 2 - iter 294/1476 - loss 0.13981224 - time (sec): 14.39 - samples/sec: 2427.68 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 20:07:39,398 epoch 2 - iter 441/1476 - loss 0.13892246 - time (sec): 21.76 - samples/sec: 2406.79 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 20:07:46,926 epoch 2 - iter 588/1476 - loss 0.13506201 - time (sec): 29.29 - samples/sec: 2319.51 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 20:07:54,318 epoch 2 - iter 735/1476 - loss 0.13369013 - time (sec): 36.68 - samples/sec: 2242.62 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 20:08:01,477 epoch 2 - iter 882/1476 - loss 0.13371218 - time (sec): 43.84 - samples/sec: 2227.85 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 20:08:09,013 epoch 2 - iter 1029/1476 - loss 0.13126252 - time (sec): 51.38 - samples/sec: 2221.33 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 20:08:16,524 epoch 2 - iter 1176/1476 - loss 0.13158893 - time (sec): 58.89 - samples/sec: 2215.15 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:08:24,525 epoch 2 - iter 1323/1476 - loss 0.13111484 - time (sec): 66.89 - samples/sec: 2220.09 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:08:31,884 epoch 2 - iter 1470/1476 - loss 0.13044166 - time (sec): 74.25 - samples/sec: 2233.50 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:08:32,152 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:08:32,153 EPOCH 2 done: loss 0.1302 - lr: 0.000027
+ 2023-10-17 20:08:43,623 DEV : loss 0.11906815320253372 - f1-score (micro avg)  0.8161
+ 2023-10-17 20:08:43,656 saving best model
+ 2023-10-17 20:08:44,148 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:08:51,602 epoch 3 - iter 147/1476 - loss 0.06487959 - time (sec): 7.45 - samples/sec: 2371.36 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 20:08:58,751 epoch 3 - iter 294/1476 - loss 0.07490643 - time (sec): 14.60 - samples/sec: 2407.47 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 20:09:05,781 epoch 3 - iter 441/1476 - loss 0.07115129 - time (sec): 21.63 - samples/sec: 2400.27 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 20:09:12,619 epoch 3 - iter 588/1476 - loss 0.07499880 - time (sec): 28.47 - samples/sec: 2387.22 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:09:19,700 epoch 3 - iter 735/1476 - loss 0.08049723 - time (sec): 35.55 - samples/sec: 2374.58 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:09:26,769 epoch 3 - iter 882/1476 - loss 0.08104214 - time (sec): 42.62 - samples/sec: 2339.21 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:09:34,338 epoch 3 - iter 1029/1476 - loss 0.08301973 - time (sec): 50.19 - samples/sec: 2344.05 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 20:09:41,499 epoch 3 - iter 1176/1476 - loss 0.08364485 - time (sec): 57.35 - samples/sec: 2332.93 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 20:09:48,739 epoch 3 - iter 1323/1476 - loss 0.08295442 - time (sec): 64.59 - samples/sec: 2321.23 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 20:09:56,409 epoch 3 - iter 1470/1476 - loss 0.08380446 - time (sec): 72.26 - samples/sec: 2296.93 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 20:09:56,681 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:09:56,681 EPOCH 3 done: loss 0.0838 - lr: 0.000023
+ 2023-10-17 20:10:08,037 DEV : loss 0.1379304975271225 - f1-score (micro avg)  0.8223
+ 2023-10-17 20:10:08,071 saving best model
+ 2023-10-17 20:10:08,547 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:10:15,658 epoch 4 - iter 147/1476 - loss 0.05625078 - time (sec): 7.11 - samples/sec: 2241.51 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 20:10:23,083 epoch 4 - iter 294/1476 - loss 0.05457959 - time (sec): 14.53 - samples/sec: 2321.35 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 20:10:30,071 epoch 4 - iter 441/1476 - loss 0.05927497 - time (sec): 21.52 - samples/sec: 2279.64 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 20:10:37,495 epoch 4 - iter 588/1476 - loss 0.05943946 - time (sec): 28.94 - samples/sec: 2265.71 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 20:10:45,002 epoch 4 - iter 735/1476 - loss 0.06174984 - time (sec): 36.45 - samples/sec: 2202.49 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 20:10:52,447 epoch 4 - iter 882/1476 - loss 0.05984732 - time (sec): 43.90 - samples/sec: 2211.74 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:10:59,329 epoch 4 - iter 1029/1476 - loss 0.05718144 - time (sec): 50.78 - samples/sec: 2222.57 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:11:06,848 epoch 4 - iter 1176/1476 - loss 0.05612735 - time (sec): 58.30 - samples/sec: 2247.18 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:11:13,792 epoch 4 - iter 1323/1476 - loss 0.05566318 - time (sec): 65.24 - samples/sec: 2254.84 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:11:21,672 epoch 4 - iter 1470/1476 - loss 0.05600539 - time (sec): 73.12 - samples/sec: 2266.68 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:11:21,955 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:11:21,955 EPOCH 4 done: loss 0.0559 - lr: 0.000020
+ 2023-10-17 20:11:33,291 DEV : loss 0.16167429089546204 - f1-score (micro avg)  0.846
+ 2023-10-17 20:11:33,323 saving best model
+ 2023-10-17 20:11:33,784 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:11:41,069 epoch 5 - iter 147/1476 - loss 0.03312893 - time (sec): 7.28 - samples/sec: 2445.55 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:11:47,781 epoch 5 - iter 294/1476 - loss 0.03390059 - time (sec): 13.99 - samples/sec: 2407.91 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 20:11:54,966 epoch 5 - iter 441/1476 - loss 0.03294051 - time (sec): 21.18 - samples/sec: 2384.13 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 20:12:02,147 epoch 5 - iter 588/1476 - loss 0.03900188 - time (sec): 28.36 - samples/sec: 2358.13 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 20:12:09,336 epoch 5 - iter 735/1476 - loss 0.03809314 - time (sec): 35.55 - samples/sec: 2359.36 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:12:16,735 epoch 5 - iter 882/1476 - loss 0.03709148 - time (sec): 42.95 - samples/sec: 2337.85 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:12:23,992 epoch 5 - iter 1029/1476 - loss 0.03763883 - time (sec): 50.21 - samples/sec: 2316.28 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:12:30,787 epoch 5 - iter 1176/1476 - loss 0.03885216 - time (sec): 57.00 - samples/sec: 2309.32 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 20:12:38,354 epoch 5 - iter 1323/1476 - loss 0.03762999 - time (sec): 64.57 - samples/sec: 2325.78 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 20:12:45,409 epoch 5 - iter 1470/1476 - loss 0.03723655 - time (sec): 71.62 - samples/sec: 2317.05 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 20:12:45,675 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:12:45,675 EPOCH 5 done: loss 0.0377 - lr: 0.000017
+ 2023-10-17 20:12:57,231 DEV : loss 0.1790267527103424 - f1-score (micro avg)  0.8426
+ 2023-10-17 20:12:57,261 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:13:04,516 epoch 6 - iter 147/1476 - loss 0.02474197 - time (sec): 7.25 - samples/sec: 2182.35 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 20:13:11,743 epoch 6 - iter 294/1476 - loss 0.02176561 - time (sec): 14.48 - samples/sec: 2281.17 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 20:13:19,002 epoch 6 - iter 441/1476 - loss 0.02055555 - time (sec): 21.74 - samples/sec: 2289.19 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 20:13:26,628 epoch 6 - iter 588/1476 - loss 0.02265555 - time (sec): 29.37 - samples/sec: 2240.42 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:13:33,937 epoch 6 - iter 735/1476 - loss 0.02383975 - time (sec): 36.68 - samples/sec: 2233.14 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:13:41,057 epoch 6 - iter 882/1476 - loss 0.02491593 - time (sec): 43.80 - samples/sec: 2228.19 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:13:48,543 epoch 6 - iter 1029/1476 - loss 0.02373826 - time (sec): 51.28 - samples/sec: 2242.09 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 20:13:55,584 epoch 6 - iter 1176/1476 - loss 0.02425560 - time (sec): 58.32 - samples/sec: 2255.37 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 20:14:02,711 epoch 6 - iter 1323/1476 - loss 0.02426721 - time (sec): 65.45 - samples/sec: 2257.42 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 20:14:10,120 epoch 6 - iter 1470/1476 - loss 0.02573153 - time (sec): 72.86 - samples/sec: 2275.89 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:14:10,406 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:14:10,406 EPOCH 6 done: loss 0.0256 - lr: 0.000013
+ 2023-10-17 20:14:21,960 DEV : loss 0.1823473572731018 - f1-score (micro avg)  0.8415
+ 2023-10-17 20:14:21,993 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:14:29,433 epoch 7 - iter 147/1476 - loss 0.01049509 - time (sec): 7.44 - samples/sec: 2270.28 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:14:36,187 epoch 7 - iter 294/1476 - loss 0.01510363 - time (sec): 14.19 - samples/sec: 2341.53 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:14:43,513 epoch 7 - iter 441/1476 - loss 0.01732408 - time (sec): 21.52 - samples/sec: 2376.45 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:14:50,969 epoch 7 - iter 588/1476 - loss 0.01690878 - time (sec): 28.97 - samples/sec: 2369.98 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:14:58,196 epoch 7 - iter 735/1476 - loss 0.02010782 - time (sec): 36.20 - samples/sec: 2330.85 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:15:05,260 epoch 7 - iter 882/1476 - loss 0.02004143 - time (sec): 43.27 - samples/sec: 2338.44 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:15:12,552 epoch 7 - iter 1029/1476 - loss 0.01839535 - time (sec): 50.56 - samples/sec: 2313.93 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:15:19,971 epoch 7 - iter 1176/1476 - loss 0.01888196 - time (sec): 57.98 - samples/sec: 2313.87 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:15:27,136 epoch 7 - iter 1323/1476 - loss 0.01831645 - time (sec): 65.14 - samples/sec: 2320.63 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:15:33,941 epoch 7 - iter 1470/1476 - loss 0.01784839 - time (sec): 71.95 - samples/sec: 2305.36 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:15:34,199 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:15:34,200 EPOCH 7 done: loss 0.0180 - lr: 0.000010
+ 2023-10-17 20:15:45,643 DEV : loss 0.19402551651000977 - f1-score (micro avg)  0.8431
+ 2023-10-17 20:15:45,677 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:15:52,842 epoch 8 - iter 147/1476 - loss 0.01320226 - time (sec): 7.16 - samples/sec: 2278.37 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:16:00,302 epoch 8 - iter 294/1476 - loss 0.01657591 - time (sec): 14.62 - samples/sec: 2333.78 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:16:07,287 epoch 8 - iter 441/1476 - loss 0.01541718 - time (sec): 21.61 - samples/sec: 2302.97 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:16:14,178 epoch 8 - iter 588/1476 - loss 0.01364886 - time (sec): 28.50 - samples/sec: 2306.16 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:16:21,411 epoch 8 - iter 735/1476 - loss 0.01449721 - time (sec): 35.73 - samples/sec: 2315.68 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:16:28,411 epoch 8 - iter 882/1476 - loss 0.01320813 - time (sec): 42.73 - samples/sec: 2299.05 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:16:36,244 epoch 8 - iter 1029/1476 - loss 0.01499181 - time (sec): 50.57 - samples/sec: 2327.41 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:16:43,224 epoch 8 - iter 1176/1476 - loss 0.01478780 - time (sec): 57.55 - samples/sec: 2319.55 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:16:50,451 epoch 8 - iter 1323/1476 - loss 0.01429358 - time (sec): 64.77 - samples/sec: 2320.02 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:16:57,506 epoch 8 - iter 1470/1476 - loss 0.01418919 - time (sec): 71.83 - samples/sec: 2303.65 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:16:57,853 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:16:57,853 EPOCH 8 done: loss 0.0141 - lr: 0.000007
+ 2023-10-17 20:17:09,346 DEV : loss 0.20007802546024323 - f1-score (micro avg)  0.8496
+ 2023-10-17 20:17:09,379 saving best model
+ 2023-10-17 20:17:09,862 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:17:17,475 epoch 9 - iter 147/1476 - loss 0.00436533 - time (sec): 7.61 - samples/sec: 2367.42 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:17:24,847 epoch 9 - iter 294/1476 - loss 0.00510740 - time (sec): 14.98 - samples/sec: 2429.53 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:17:32,428 epoch 9 - iter 441/1476 - loss 0.00763822 - time (sec): 22.56 - samples/sec: 2423.43 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:17:39,465 epoch 9 - iter 588/1476 - loss 0.00743456 - time (sec): 29.60 - samples/sec: 2361.44 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:17:47,117 epoch 9 - iter 735/1476 - loss 0.00689757 - time (sec): 37.25 - samples/sec: 2308.36 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:17:54,005 epoch 9 - iter 882/1476 - loss 0.00620989 - time (sec): 44.14 - samples/sec: 2316.95 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:18:00,853 epoch 9 - iter 1029/1476 - loss 0.00730072 - time (sec): 50.99 - samples/sec: 2301.79 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:18:07,914 epoch 9 - iter 1176/1476 - loss 0.00714572 - time (sec): 58.05 - samples/sec: 2289.46 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:18:15,625 epoch 9 - iter 1323/1476 - loss 0.00707158 - time (sec): 65.76 - samples/sec: 2299.61 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:18:22,423 epoch 9 - iter 1470/1476 - loss 0.00773284 - time (sec): 72.56 - samples/sec: 2283.54 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:18:22,726 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:18:22,726 EPOCH 9 done: loss 0.0078 - lr: 0.000003
+ 2023-10-17 20:18:34,319 DEV : loss 0.2041151374578476 - f1-score (micro avg)  0.8596
+ 2023-10-17 20:18:34,349 saving best model
+ 2023-10-17 20:18:34,828 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:18:42,621 epoch 10 - iter 147/1476 - loss 0.00709689 - time (sec): 7.79 - samples/sec: 2535.00 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:18:50,003 epoch 10 - iter 294/1476 - loss 0.00608595 - time (sec): 15.17 - samples/sec: 2439.34 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:18:57,062 epoch 10 - iter 441/1476 - loss 0.00472843 - time (sec): 22.23 - samples/sec: 2396.12 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:19:04,224 epoch 10 - iter 588/1476 - loss 0.00501675 - time (sec): 29.39 - samples/sec: 2314.37 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:19:11,140 epoch 10 - iter 735/1476 - loss 0.00493818 - time (sec): 36.31 - samples/sec: 2306.06 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:19:18,274 epoch 10 - iter 882/1476 - loss 0.00479554 - time (sec): 43.44 - samples/sec: 2297.04 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:19:25,625 epoch 10 - iter 1029/1476 - loss 0.00501958 - time (sec): 50.79 - samples/sec: 2305.35 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:19:32,682 epoch 10 - iter 1176/1476 - loss 0.00595793 - time (sec): 57.85 - samples/sec: 2291.26 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:19:39,780 epoch 10 - iter 1323/1476 - loss 0.00561353 - time (sec): 64.95 - samples/sec: 2284.04 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 20:19:47,478 epoch 10 - iter 1470/1476 - loss 0.00566830 - time (sec): 72.65 - samples/sec: 2283.76 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 20:19:47,747 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:19:47,748 EPOCH 10 done: loss 0.0057 - lr: 0.000000
+ 2023-10-17 20:19:59,001 DEV : loss 0.2035822868347168 - f1-score (micro avg)  0.8602
+ 2023-10-17 20:19:59,031 saving best model
+ 2023-10-17 20:19:59,889 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:19:59,891 Loading model from best epoch ...
+ 2023-10-17 20:20:01,237 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
+ 2023-10-17 20:20:07,284
+ Results:
+ - F-score (micro) 0.7934
+ - F-score (macro) 0.7114
+ - Accuracy 0.6758
+
+ By class:
+               precision    recall  f1-score   support
+
+         loc      0.8474    0.8671    0.8571       858
+        pers      0.7487    0.8045    0.7756       537
+         org      0.5329    0.6136    0.5704       132
+        prod      0.7500    0.7377    0.7438        61
+        time      0.5625    0.6667    0.6102        54
+
+   micro avg      0.7730    0.8149    0.7934      1642
+   macro avg      0.6883    0.7379    0.7114      1642
+ weighted avg     0.7768    0.8149    0.7951      1642
+
+ 2023-10-17 20:20:07,284 ----------------------------------------------------------------------------------------------------
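The micro averages in the final test report can be cross-checked from the per-class rows: micro recall is total true positives over total support, micro precision is total true positives over total predictions, and micro F1 follows from those. The sketch below back-calculates integer TP and prediction counts from the rounded per-class precision/recall and support, so it is a consistency check on the table, not the evaluator itself.

```python
# Sketch: reconstruct the micro-averaged test scores from per-class
# counts. The (tp, predicted, support) triples are back-derived from
# the rounded per-class precision/recall in the report above, so this
# is an illustrative consistency check only.
classes = {
    "loc":  (744, 878, 858),
    "pers": (432, 577, 537),
    "org":  (81, 152, 132),
    "prod": (45, 60, 61),
    "time": (36, 64, 54),
}
tp = sum(v[0] for v in classes.values())
pred = sum(v[1] for v in classes.values())
gold = sum(v[2] for v in classes.values())

precision = tp / pred
recall = tp / gold
f1 = 2 * tp / (pred + gold)  # equals 2PR/(P+R) for micro averages
print(f"{precision:.4f} {recall:.4f} {f1:.4f}")  # 0.7730 0.8149 0.7934
```

Note the gap between dev F1 (0.8602) and test micro F1 (0.7934), driven largely by the weaker org and time classes.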