Upload folder using huggingface_hub

- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697573158.bce904bcef33.2482.2 +3 -0
- test.tsv +0 -0
- training.log +242 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c484c00140ec2bf40498189122b25f3dc729d8c9ef81a0dfbe7a5e539f1f19d3
+size 440966725
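best-model.pt is tracked with Git LFS, so the diff above shows only the three-line pointer file, not the ~441 MB checkpoint itself. As an illustration (not part of the commit), a spec-v1 pointer, which is just a series of `<key> <value>` lines, can be parsed into its fields like this:

```python
# Parse a Git LFS pointer file: spec v1 is a series of "<key> <value>" lines.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer committed as best-model.pt above.
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c484c00140ec2bf40498189122b25f3dc729d8c9ef81a0dfbe7a5e539f1f19d3\n"
    "size 440966725\n"
)

info = parse_lfs_pointer(pointer)
algo, _, digest = info["oid"].partition(":")
print(algo, len(digest), info["size"])  # → sha256 64 440966725
```

The `oid` digest is what LFS uses to fetch the real object; the `size` field (440,966,725 bytes) is the size of the stored checkpoint.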
dev.tsv ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv ADDED
@@ -0,0 +1,11 @@
+EPOCH	TIMESTAMP	LEARNING_RATE	TRAIN_LOSS	DEV_LOSS	DEV_PRECISION	DEV_RECALL	DEV_F1	DEV_ACCURACY
+1	20:07:17	0.0000	0.5825	0.1282	0.6854	0.7612	0.7213	0.6041
+2	20:08:43	0.0000	0.1302	0.1191	0.8099	0.8225	0.8161	0.7105
+3	20:10:08	0.0000	0.0838	0.1379	0.8026	0.8431	0.8223	0.7219
+4	20:11:33	0.0000	0.0559	0.1617	0.8275	0.8654	0.8460	0.7532
+5	20:12:57	0.0000	0.0377	0.1790	0.8331	0.8522	0.8426	0.7523
+6	20:14:21	0.0000	0.0256	0.1823	0.8361	0.8471	0.8415	0.7519
+7	20:15:45	0.0000	0.0180	0.1940	0.8337	0.8528	0.8431	0.7593
+8	20:17:09	0.0000	0.0141	0.2001	0.8375	0.8620	0.8496	0.7636
+9	20:18:34	0.0000	0.0078	0.2041	0.8567	0.8625	0.8596	0.7767
+10	20:19:59	0.0000	0.0057	0.2036	0.8534	0.8671	0.8602	0.7772
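The table above is Flair's per-epoch loss.tsv. As a quick illustration (not part of the commit), assuming the file is tab-separated as Flair writes it, the rows can be loaded with the standard csv module and the best epoch selected by DEV_F1:

```python
import csv
import io

# loss.tsv rows as committed above (reproduced inline; the real file is tab-separated).
LOSS_TSV = "\n".join([
    "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\tDEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY",
    "1\t20:07:17\t0.0000\t0.5825\t0.1282\t0.6854\t0.7612\t0.7213\t0.6041",
    "2\t20:08:43\t0.0000\t0.1302\t0.1191\t0.8099\t0.8225\t0.8161\t0.7105",
    "3\t20:10:08\t0.0000\t0.0838\t0.1379\t0.8026\t0.8431\t0.8223\t0.7219",
    "4\t20:11:33\t0.0000\t0.0559\t0.1617\t0.8275\t0.8654\t0.8460\t0.7532",
    "5\t20:12:57\t0.0000\t0.0377\t0.1790\t0.8331\t0.8522\t0.8426\t0.7523",
    "6\t20:14:21\t0.0000\t0.0256\t0.1823\t0.8361\t0.8471\t0.8415\t0.7519",
    "7\t20:15:45\t0.0000\t0.0180\t0.1940\t0.8337\t0.8528\t0.8431\t0.7593",
    "8\t20:17:09\t0.0000\t0.0141\t0.2001\t0.8375\t0.8620\t0.8496\t0.7636",
    "9\t20:18:34\t0.0000\t0.0078\t0.2041\t0.8567\t0.8625\t0.8596\t0.7767",
    "10\t20:19:59\t0.0000\t0.0057\t0.2036\t0.8534\t0.8671\t0.8602\t0.7772",
])

rows = list(csv.DictReader(io.StringIO(LOSS_TSV), delimiter="\t"))
best = max(rows, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # → 10 0.8602
```

Note the pattern in the data: TRAIN_LOSS falls monotonically while DEV_LOSS rises after epoch 2 (mild overfitting), yet DEV_F1 keeps improving through the final epoch, which is why epoch 10 is saved as the best model.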
runs/events.out.tfevents.1697573158.bce904bcef33.2482.2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75891cb61b7645d5b64aa9b2fe835ecae4a71d485ee79b33b98f2b3bedb4c59c
+size 825716
test.tsv ADDED
The diff for this file is too large to render. See raw diff.
training.log ADDED
@@ -0,0 +1,242 @@
+2023-10-17 20:05:58,315 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,316 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=21, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 20:05:58,316 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,316 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+ - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 Train: 5901 sentences
+2023-10-17 20:05:58,317 (train_with_dev=False, train_with_test=False)
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 Training Params:
+2023-10-17 20:05:58,317 - learning_rate: "3e-05"
+2023-10-17 20:05:58,317 - mini_batch_size: "4"
+2023-10-17 20:05:58,317 - max_epochs: "10"
+2023-10-17 20:05:58,317 - shuffle: "True"
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 Plugins:
+2023-10-17 20:05:58,317 - TensorboardLogger
+2023-10-17 20:05:58,317 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 20:05:58,317 - metric: "('micro avg', 'f1-score')"
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 Computation:
+2023-10-17 20:05:58,317 - compute on device: cuda:0
+2023-10-17 20:05:58,317 - embedding storage: none
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:58,317 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-17 20:06:05,703 epoch 1 - iter 147/1476 - loss 2.88376740 - time (sec): 7.38 - samples/sec: 2399.29 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:06:12,547 epoch 1 - iter 294/1476 - loss 1.83405663 - time (sec): 14.23 - samples/sec: 2329.66 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:06:20,094 epoch 1 - iter 441/1476 - loss 1.34685714 - time (sec): 21.78 - samples/sec: 2364.86 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:06:27,578 epoch 1 - iter 588/1476 - loss 1.08468691 - time (sec): 29.26 - samples/sec: 2373.41 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:06:34,521 epoch 1 - iter 735/1476 - loss 0.93275820 - time (sec): 36.20 - samples/sec: 2362.49 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:06:41,355 epoch 1 - iter 882/1476 - loss 0.83450367 - time (sec): 43.04 - samples/sec: 2330.69 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:06:48,377 epoch 1 - iter 1029/1476 - loss 0.75458015 - time (sec): 50.06 - samples/sec: 2317.34 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:06:55,693 epoch 1 - iter 1176/1476 - loss 0.68725226 - time (sec): 57.37 - samples/sec: 2303.13 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:07:03,442 epoch 1 - iter 1323/1476 - loss 0.63285062 - time (sec): 65.12 - samples/sec: 2281.72 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:07:10,621 epoch 1 - iter 1470/1476 - loss 0.58387807 - time (sec): 72.30 - samples/sec: 2294.27 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 20:07:10,883 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:07:10,883 EPOCH 1 done: loss 0.5825 - lr: 0.000030
+2023-10-17 20:07:17,234 DEV : loss 0.12818463146686554 - f1-score (micro avg) 0.7213
+2023-10-17 20:07:17,263 saving best model
+2023-10-17 20:07:17,633 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:07:24,643 epoch 2 - iter 147/1476 - loss 0.13820773 - time (sec): 7.01 - samples/sec: 2385.67 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 20:07:32,026 epoch 2 - iter 294/1476 - loss 0.13981224 - time (sec): 14.39 - samples/sec: 2427.68 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 20:07:39,398 epoch 2 - iter 441/1476 - loss 0.13892246 - time (sec): 21.76 - samples/sec: 2406.79 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 20:07:46,926 epoch 2 - iter 588/1476 - loss 0.13506201 - time (sec): 29.29 - samples/sec: 2319.51 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 20:07:54,318 epoch 2 - iter 735/1476 - loss 0.13369013 - time (sec): 36.68 - samples/sec: 2242.62 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 20:08:01,477 epoch 2 - iter 882/1476 - loss 0.13371218 - time (sec): 43.84 - samples/sec: 2227.85 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 20:08:09,013 epoch 2 - iter 1029/1476 - loss 0.13126252 - time (sec): 51.38 - samples/sec: 2221.33 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 20:08:16,524 epoch 2 - iter 1176/1476 - loss 0.13158893 - time (sec): 58.89 - samples/sec: 2215.15 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:08:24,525 epoch 2 - iter 1323/1476 - loss 0.13111484 - time (sec): 66.89 - samples/sec: 2220.09 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:08:31,884 epoch 2 - iter 1470/1476 - loss 0.13044166 - time (sec): 74.25 - samples/sec: 2233.50 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:08:32,152 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:08:32,153 EPOCH 2 done: loss 0.1302 - lr: 0.000027
+2023-10-17 20:08:43,623 DEV : loss 0.11906815320253372 - f1-score (micro avg) 0.8161
+2023-10-17 20:08:43,656 saving best model
+2023-10-17 20:08:44,148 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:08:51,602 epoch 3 - iter 147/1476 - loss 0.06487959 - time (sec): 7.45 - samples/sec: 2371.36 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:08:58,751 epoch 3 - iter 294/1476 - loss 0.07490643 - time (sec): 14.60 - samples/sec: 2407.47 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:09:05,781 epoch 3 - iter 441/1476 - loss 0.07115129 - time (sec): 21.63 - samples/sec: 2400.27 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:09:12,619 epoch 3 - iter 588/1476 - loss 0.07499880 - time (sec): 28.47 - samples/sec: 2387.22 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:09:19,700 epoch 3 - iter 735/1476 - loss 0.08049723 - time (sec): 35.55 - samples/sec: 2374.58 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:09:26,769 epoch 3 - iter 882/1476 - loss 0.08104214 - time (sec): 42.62 - samples/sec: 2339.21 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:09:34,338 epoch 3 - iter 1029/1476 - loss 0.08301973 - time (sec): 50.19 - samples/sec: 2344.05 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:09:41,499 epoch 3 - iter 1176/1476 - loss 0.08364485 - time (sec): 57.35 - samples/sec: 2332.93 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:09:48,739 epoch 3 - iter 1323/1476 - loss 0.08295442 - time (sec): 64.59 - samples/sec: 2321.23 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:09:56,409 epoch 3 - iter 1470/1476 - loss 0.08380446 - time (sec): 72.26 - samples/sec: 2296.93 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:09:56,681 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:09:56,681 EPOCH 3 done: loss 0.0838 - lr: 0.000023
+2023-10-17 20:10:08,037 DEV : loss 0.1379304975271225 - f1-score (micro avg) 0.8223
+2023-10-17 20:10:08,071 saving best model
+2023-10-17 20:10:08,547 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:10:15,658 epoch 4 - iter 147/1476 - loss 0.05625078 - time (sec): 7.11 - samples/sec: 2241.51 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:10:23,083 epoch 4 - iter 294/1476 - loss 0.05457959 - time (sec): 14.53 - samples/sec: 2321.35 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:10:30,071 epoch 4 - iter 441/1476 - loss 0.05927497 - time (sec): 21.52 - samples/sec: 2279.64 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:10:37,495 epoch 4 - iter 588/1476 - loss 0.05943946 - time (sec): 28.94 - samples/sec: 2265.71 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:10:45,002 epoch 4 - iter 735/1476 - loss 0.06174984 - time (sec): 36.45 - samples/sec: 2202.49 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:10:52,447 epoch 4 - iter 882/1476 - loss 0.05984732 - time (sec): 43.90 - samples/sec: 2211.74 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:10:59,329 epoch 4 - iter 1029/1476 - loss 0.05718144 - time (sec): 50.78 - samples/sec: 2222.57 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:11:06,848 epoch 4 - iter 1176/1476 - loss 0.05612735 - time (sec): 58.30 - samples/sec: 2247.18 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:11:13,792 epoch 4 - iter 1323/1476 - loss 0.05566318 - time (sec): 65.24 - samples/sec: 2254.84 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:11:21,672 epoch 4 - iter 1470/1476 - loss 0.05600539 - time (sec): 73.12 - samples/sec: 2266.68 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:11:21,955 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:11:21,955 EPOCH 4 done: loss 0.0559 - lr: 0.000020
+2023-10-17 20:11:33,291 DEV : loss 0.16167429089546204 - f1-score (micro avg) 0.846
+2023-10-17 20:11:33,323 saving best model
+2023-10-17 20:11:33,784 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:11:41,069 epoch 5 - iter 147/1476 - loss 0.03312893 - time (sec): 7.28 - samples/sec: 2445.55 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:11:47,781 epoch 5 - iter 294/1476 - loss 0.03390059 - time (sec): 13.99 - samples/sec: 2407.91 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 20:11:54,966 epoch 5 - iter 441/1476 - loss 0.03294051 - time (sec): 21.18 - samples/sec: 2384.13 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 20:12:02,147 epoch 5 - iter 588/1476 - loss 0.03900188 - time (sec): 28.36 - samples/sec: 2358.13 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 20:12:09,336 epoch 5 - iter 735/1476 - loss 0.03809314 - time (sec): 35.55 - samples/sec: 2359.36 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:12:16,735 epoch 5 - iter 882/1476 - loss 0.03709148 - time (sec): 42.95 - samples/sec: 2337.85 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:12:23,992 epoch 5 - iter 1029/1476 - loss 0.03763883 - time (sec): 50.21 - samples/sec: 2316.28 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:12:30,787 epoch 5 - iter 1176/1476 - loss 0.03885216 - time (sec): 57.00 - samples/sec: 2309.32 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:12:38,354 epoch 5 - iter 1323/1476 - loss 0.03762999 - time (sec): 64.57 - samples/sec: 2325.78 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:12:45,409 epoch 5 - iter 1470/1476 - loss 0.03723655 - time (sec): 71.62 - samples/sec: 2317.05 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:12:45,675 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:12:45,675 EPOCH 5 done: loss 0.0377 - lr: 0.000017
+2023-10-17 20:12:57,231 DEV : loss 0.1790267527103424 - f1-score (micro avg) 0.8426
+2023-10-17 20:12:57,261 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:13:04,516 epoch 6 - iter 147/1476 - loss 0.02474197 - time (sec): 7.25 - samples/sec: 2182.35 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:13:11,743 epoch 6 - iter 294/1476 - loss 0.02176561 - time (sec): 14.48 - samples/sec: 2281.17 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:13:19,002 epoch 6 - iter 441/1476 - loss 0.02055555 - time (sec): 21.74 - samples/sec: 2289.19 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:13:26,628 epoch 6 - iter 588/1476 - loss 0.02265555 - time (sec): 29.37 - samples/sec: 2240.42 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:13:33,937 epoch 6 - iter 735/1476 - loss 0.02383975 - time (sec): 36.68 - samples/sec: 2233.14 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:13:41,057 epoch 6 - iter 882/1476 - loss 0.02491593 - time (sec): 43.80 - samples/sec: 2228.19 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:13:48,543 epoch 6 - iter 1029/1476 - loss 0.02373826 - time (sec): 51.28 - samples/sec: 2242.09 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 20:13:55,584 epoch 6 - iter 1176/1476 - loss 0.02425560 - time (sec): 58.32 - samples/sec: 2255.37 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 20:14:02,711 epoch 6 - iter 1323/1476 - loss 0.02426721 - time (sec): 65.45 - samples/sec: 2257.42 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 20:14:10,120 epoch 6 - iter 1470/1476 - loss 0.02573153 - time (sec): 72.86 - samples/sec: 2275.89 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:14:10,406 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:14:10,406 EPOCH 6 done: loss 0.0256 - lr: 0.000013
+2023-10-17 20:14:21,960 DEV : loss 0.1823473572731018 - f1-score (micro avg) 0.8415
+2023-10-17 20:14:21,993 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:14:29,433 epoch 7 - iter 147/1476 - loss 0.01049509 - time (sec): 7.44 - samples/sec: 2270.28 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:14:36,187 epoch 7 - iter 294/1476 - loss 0.01510363 - time (sec): 14.19 - samples/sec: 2341.53 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:14:43,513 epoch 7 - iter 441/1476 - loss 0.01732408 - time (sec): 21.52 - samples/sec: 2376.45 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:14:50,969 epoch 7 - iter 588/1476 - loss 0.01690878 - time (sec): 28.97 - samples/sec: 2369.98 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:14:58,196 epoch 7 - iter 735/1476 - loss 0.02010782 - time (sec): 36.20 - samples/sec: 2330.85 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:15:05,260 epoch 7 - iter 882/1476 - loss 0.02004143 - time (sec): 43.27 - samples/sec: 2338.44 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:15:12,552 epoch 7 - iter 1029/1476 - loss 0.01839535 - time (sec): 50.56 - samples/sec: 2313.93 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:15:19,971 epoch 7 - iter 1176/1476 - loss 0.01888196 - time (sec): 57.98 - samples/sec: 2313.87 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:15:27,136 epoch 7 - iter 1323/1476 - loss 0.01831645 - time (sec): 65.14 - samples/sec: 2320.63 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:15:33,941 epoch 7 - iter 1470/1476 - loss 0.01784839 - time (sec): 71.95 - samples/sec: 2305.36 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:15:34,199 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:15:34,200 EPOCH 7 done: loss 0.0180 - lr: 0.000010
+2023-10-17 20:15:45,643 DEV : loss 0.19402551651000977 - f1-score (micro avg) 0.8431
+2023-10-17 20:15:45,677 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:15:52,842 epoch 8 - iter 147/1476 - loss 0.01320226 - time (sec): 7.16 - samples/sec: 2278.37 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:16:00,302 epoch 8 - iter 294/1476 - loss 0.01657591 - time (sec): 14.62 - samples/sec: 2333.78 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:16:07,287 epoch 8 - iter 441/1476 - loss 0.01541718 - time (sec): 21.61 - samples/sec: 2302.97 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:16:14,178 epoch 8 - iter 588/1476 - loss 0.01364886 - time (sec): 28.50 - samples/sec: 2306.16 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:16:21,411 epoch 8 - iter 735/1476 - loss 0.01449721 - time (sec): 35.73 - samples/sec: 2315.68 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:16:28,411 epoch 8 - iter 882/1476 - loss 0.01320813 - time (sec): 42.73 - samples/sec: 2299.05 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:16:36,244 epoch 8 - iter 1029/1476 - loss 0.01499181 - time (sec): 50.57 - samples/sec: 2327.41 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:16:43,224 epoch 8 - iter 1176/1476 - loss 0.01478780 - time (sec): 57.55 - samples/sec: 2319.55 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:16:50,451 epoch 8 - iter 1323/1476 - loss 0.01429358 - time (sec): 64.77 - samples/sec: 2320.02 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:16:57,506 epoch 8 - iter 1470/1476 - loss 0.01418919 - time (sec): 71.83 - samples/sec: 2303.65 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:16:57,853 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:16:57,853 EPOCH 8 done: loss 0.0141 - lr: 0.000007
+2023-10-17 20:17:09,346 DEV : loss 0.20007802546024323 - f1-score (micro avg) 0.8496
+2023-10-17 20:17:09,379 saving best model
+2023-10-17 20:17:09,862 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:17:17,475 epoch 9 - iter 147/1476 - loss 0.00436533 - time (sec): 7.61 - samples/sec: 2367.42 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:17:24,847 epoch 9 - iter 294/1476 - loss 0.00510740 - time (sec): 14.98 - samples/sec: 2429.53 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:17:32,428 epoch 9 - iter 441/1476 - loss 0.00763822 - time (sec): 22.56 - samples/sec: 2423.43 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:17:39,465 epoch 9 - iter 588/1476 - loss 0.00743456 - time (sec): 29.60 - samples/sec: 2361.44 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:17:47,117 epoch 9 - iter 735/1476 - loss 0.00689757 - time (sec): 37.25 - samples/sec: 2308.36 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:17:54,005 epoch 9 - iter 882/1476 - loss 0.00620989 - time (sec): 44.14 - samples/sec: 2316.95 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:18:00,853 epoch 9 - iter 1029/1476 - loss 0.00730072 - time (sec): 50.99 - samples/sec: 2301.79 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:18:07,914 epoch 9 - iter 1176/1476 - loss 0.00714572 - time (sec): 58.05 - samples/sec: 2289.46 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:18:15,625 epoch 9 - iter 1323/1476 - loss 0.00707158 - time (sec): 65.76 - samples/sec: 2299.61 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:18:22,423 epoch 9 - iter 1470/1476 - loss 0.00773284 - time (sec): 72.56 - samples/sec: 2283.54 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:18:22,726 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:18:22,726 EPOCH 9 done: loss 0.0078 - lr: 0.000003
+2023-10-17 20:18:34,319 DEV : loss 0.2041151374578476 - f1-score (micro avg) 0.8596
+2023-10-17 20:18:34,349 saving best model
+2023-10-17 20:18:34,828 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:18:42,621 epoch 10 - iter 147/1476 - loss 0.00709689 - time (sec): 7.79 - samples/sec: 2535.00 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:18:50,003 epoch 10 - iter 294/1476 - loss 0.00608595 - time (sec): 15.17 - samples/sec: 2439.34 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:18:57,062 epoch 10 - iter 441/1476 - loss 0.00472843 - time (sec): 22.23 - samples/sec: 2396.12 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:19:04,224 epoch 10 - iter 588/1476 - loss 0.00501675 - time (sec): 29.39 - samples/sec: 2314.37 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:19:11,140 epoch 10 - iter 735/1476 - loss 0.00493818 - time (sec): 36.31 - samples/sec: 2306.06 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:19:18,274 epoch 10 - iter 882/1476 - loss 0.00479554 - time (sec): 43.44 - samples/sec: 2297.04 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:19:25,625 epoch 10 - iter 1029/1476 - loss 0.00501958 - time (sec): 50.79 - samples/sec: 2305.35 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:19:32,682 epoch 10 - iter 1176/1476 - loss 0.00595793 - time (sec): 57.85 - samples/sec: 2291.26 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:19:39,780 epoch 10 - iter 1323/1476 - loss 0.00561353 - time (sec): 64.95 - samples/sec: 2284.04 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 20:19:47,478 epoch 10 - iter 1470/1476 - loss 0.00566830 - time (sec): 72.65 - samples/sec: 2283.76 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 20:19:47,747 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:19:47,748 EPOCH 10 done: loss 0.0057 - lr: 0.000000
+2023-10-17 20:19:59,001 DEV : loss 0.2035822868347168 - f1-score (micro avg) 0.8602
+2023-10-17 20:19:59,031 saving best model
+2023-10-17 20:19:59,889 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:19:59,891 Loading model from best epoch ...
+2023-10-17 20:20:01,237 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
+2023-10-17 20:20:07,284
+Results:
+- F-score (micro) 0.7934
+- F-score (macro) 0.7114
+- Accuracy 0.6758
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.8474    0.8671    0.8571       858
+        pers     0.7487    0.8045    0.7756       537
+         org     0.5329    0.6136    0.5704       132
+        prod     0.7500    0.7377    0.7438        61
+        time     0.5625    0.6667    0.6102        54
+
+   micro avg     0.7730    0.8149    0.7934      1642
+   macro avg     0.6883    0.7379    0.7114      1642
+weighted avg     0.7768    0.8149    0.7951      1642
+
+2023-10-17 20:20:07,284 ----------------------------------------------------------------------------------------------------
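As a sanity check (not part of the original log), the three summary rows of the final evaluation follow from the per-class table: macro F1 is the unweighted mean of the per-class F1 scores, weighted F1 weights them by support, and micro F1 is the harmonic mean of the reported micro-average precision and recall:

```python
# Per-class rows from the final test evaluation above: (precision, recall, f1, support).
classes = {
    "loc":  (0.8474, 0.8671, 0.8571, 858),
    "pers": (0.7487, 0.8045, 0.7756, 537),
    "org":  (0.5329, 0.6136, 0.5704, 132),
    "prod": (0.7500, 0.7377, 0.7438, 61),
    "time": (0.5625, 0.6667, 0.6102, 54),
}

# Macro F1: unweighted mean of per-class F1.
macro_f1 = sum(v[2] for v in classes.values()) / len(classes)

# Weighted F1: per-class F1 weighted by support.
total = sum(v[3] for v in classes.values())
weighted_f1 = sum(v[2] * v[3] for v in classes.values()) / total

# Micro F1: harmonic mean of the reported micro-average precision and recall.
p, r = 0.7730, 0.8149
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# → 0.7114 0.7951 0.7934
```

All three reproduce the log's summary values (0.7114 macro, 0.7951 weighted, 0.7934 micro) over the 1642 test entities.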