Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697574939.bce904bcef33.2482.4 +3 -0
- test.tsv +0 -0
- training.log +243 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a9f81ee51bf7dc6a1ea05e8ce24d545e73678fbd2f781ae419d3442371541f3e
+size 440966725
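The three added lines for best-model.pt are a Git LFS pointer, not the checkpoint itself. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` is hypothetical and is not part of huggingface_hub or git-lfs.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a9f81ee51bf7dc6a1ea05e8ce24d545e73678fbd2f781ae419d3442371541f3e
size 440966725
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / 2**20  # bytes to MiB
print(f"{info['hash_algo']} digest, {size_mib:.1f} MiB")
```

The `size` field is the byte length of the real file stored in LFS, so the checkpoint behind this pointer is roughly 420 MiB.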
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 20:36:38 0.0000 0.7104 0.1236 0.7087 0.7108 0.7098 0.5756
+2 20:37:41 0.0000 0.1272 0.1132 0.7695 0.8070 0.7878 0.6751
+3 20:38:44 0.0000 0.0731 0.1058 0.8026 0.8385 0.8202 0.7215
+4 20:39:47 0.0000 0.0481 0.1444 0.8217 0.8448 0.8331 0.7397
+5 20:40:50 0.0000 0.0376 0.1691 0.8353 0.8597 0.8473 0.7562
+6 20:41:53 0.0000 0.0248 0.1866 0.8350 0.8465 0.8407 0.7476
+7 20:42:55 0.0000 0.0186 0.1866 0.8443 0.8574 0.8508 0.7591
+8 20:43:58 0.0000 0.0134 0.1891 0.8397 0.8551 0.8473 0.7590
+9 20:45:00 0.0000 0.0096 0.1901 0.8435 0.8614 0.8524 0.7677
+10 20:46:05 0.0000 0.0060 0.1948 0.8496 0.8608 0.8552 0.7720
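The loss.tsv rows above can be read back to see which epoch scored best on the dev set. This is a minimal sketch that embeds the rows as a string and splits on whitespace; the real file is presumably tab-separated, which `str.split()` also handles. `read_rows` is a hypothetical helper. The LEARNING_RATE column shows 0.0000, presumably because values around 3e-05 round away at four decimals.

```python
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 20:36:38 0.0000 0.7104 0.1236 0.7087 0.7108 0.7098 0.5756
2 20:37:41 0.0000 0.1272 0.1132 0.7695 0.8070 0.7878 0.6751
3 20:38:44 0.0000 0.0731 0.1058 0.8026 0.8385 0.8202 0.7215
4 20:39:47 0.0000 0.0481 0.1444 0.8217 0.8448 0.8331 0.7397
5 20:40:50 0.0000 0.0376 0.1691 0.8353 0.8597 0.8473 0.7562
6 20:41:53 0.0000 0.0248 0.1866 0.8350 0.8465 0.8407 0.7476
7 20:42:55 0.0000 0.0186 0.1866 0.8443 0.8574 0.8508 0.7591
8 20:43:58 0.0000 0.0134 0.1891 0.8397 0.8551 0.8473 0.7590
9 20:45:00 0.0000 0.0096 0.1901 0.8435 0.8614 0.8524 0.7677
10 20:46:05 0.0000 0.0060 0.1948 0.8496 0.8608 0.8552 0.7720
"""


def read_rows(text: str) -> list[dict]:
    """Turn the header line plus data lines into a list of column->value dicts."""
    header, *lines = text.strip().splitlines()
    names = header.split()
    return [dict(zip(names, line.split())) for line in lines]


rows = read_rows(LOSS_TSV)
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
lowest_dev_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest DEV_LOSS:", lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
```

On these rows the dev loss is lowest at epoch 3, while dev F1 keeps improving until epoch 10. That matches the log's note that the final evaluation uses the checkpoint chosen by micro-average F1, not by loss.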
runs/events.out.tfevents.1697574939.bce904bcef33.2482.4
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4886ecd53a6b6dfc6609af91164f4c8a9485699020e7cc143c15cd29f96d4175
+size 415388
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,243 @@
+2023-10-17 20:35:39,058 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,058 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=21, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+ - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 Train: 5901 sentences
+2023-10-17 20:35:39,059 (train_with_dev=False, train_with_test=False)
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 Training Params:
+2023-10-17 20:35:39,059 - learning_rate: "3e-05"
+2023-10-17 20:35:39,059 - mini_batch_size: "8"
+2023-10-17 20:35:39,059 - max_epochs: "10"
+2023-10-17 20:35:39,059 - shuffle: "True"
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 Plugins:
+2023-10-17 20:35:39,059 - TensorboardLogger
+2023-10-17 20:35:39,059 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 20:35:39,059 - metric: "('micro avg', 'f1-score')"
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 Computation:
+2023-10-17 20:35:39,059 - compute on device: cuda:0
+2023-10-17 20:35:39,059 - embedding storage: none
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,059 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:35:39,060 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-17 20:35:44,132 epoch 1 - iter 73/738 - loss 3.64116980 - time (sec): 5.07 - samples/sec: 3162.96 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:35:49,571 epoch 1 - iter 146/738 - loss 2.18983601 - time (sec): 10.51 - samples/sec: 3329.36 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:35:54,419 epoch 1 - iter 219/738 - loss 1.69219914 - time (sec): 15.36 - samples/sec: 3248.03 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:35:59,996 epoch 1 - iter 292/738 - loss 1.37862460 - time (sec): 20.94 - samples/sec: 3185.56 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:36:05,134 epoch 1 - iter 365/738 - loss 1.18355123 - time (sec): 26.07 - samples/sec: 3175.50 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:36:10,456 epoch 1 - iter 438/738 - loss 1.04194700 - time (sec): 31.39 - samples/sec: 3148.63 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:36:15,936 epoch 1 - iter 511/738 - loss 0.93151159 - time (sec): 36.88 - samples/sec: 3129.66 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:36:20,907 epoch 1 - iter 584/738 - loss 0.84481528 - time (sec): 41.85 - samples/sec: 3135.43 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:36:25,940 epoch 1 - iter 657/738 - loss 0.77413535 - time (sec): 46.88 - samples/sec: 3158.04 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:36:31,086 epoch 1 - iter 730/738 - loss 0.71881625 - time (sec): 52.03 - samples/sec: 3151.08 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 20:36:31,913 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:36:31,914 EPOCH 1 done: loss 0.7104 - lr: 0.000030
+2023-10-17 20:36:38,260 DEV : loss 0.12364602833986282 - f1-score (micro avg) 0.7098
+2023-10-17 20:36:38,292 saving best model
+2023-10-17 20:36:38,694 ----------------------------------------------------------------------------------------------------
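The lr values logged during epoch 1 rise from 0.000003 to 0.000030, which fits the `LinearScheduler | warmup_fraction: '0.1'` plugin listed in the log header. The sketch below is a generic linear-warmup, linear-decay schedule and is not Flair's actual implementation. It assumes one scheduler step per mini-batch and 738 iterations per epoch (5901 sentences at batch size 8, rounded up). The function name `linear_warmup_lr` is hypothetical.

```python
import math


def linear_warmup_lr(step: int, total_steps: int, peak_lr: float,
                     warmup_fraction: float) -> float:
    """Linear ramp from 0 to peak_lr, then linear decay back to 0 (assumed form)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: 5901 train sentences, mini_batch_size 8,
# max_epochs 10, learning_rate 3e-05, warmup_fraction 0.1.
steps_per_epoch = math.ceil(5901 / 8)
total_steps = steps_per_epoch * 10


def lr_at(epoch: int, iteration: int) -> str:
    """Format the scheduled lr the way the log prints it (six decimals)."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_warmup_lr(step, total_steps, 3e-05, 0.1):.6f}"


for epoch, iteration in [(1, 73), (1, 730), (2, 146), (5, 365), (10, 730)]:
    print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration)}")
```

Under these assumptions the warmup lasts exactly one epoch (738 of 7380 steps), and the learning rate decays to zero by the end of epoch 10.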
+2023-10-17 20:36:43,332 epoch 2 - iter 73/738 - loss 0.13875612 - time (sec): 4.64 - samples/sec: 3216.42 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 20:36:48,882 epoch 2 - iter 146/738 - loss 0.15187587 - time (sec): 10.19 - samples/sec: 3250.23 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 20:36:54,024 epoch 2 - iter 219/738 - loss 0.15074889 - time (sec): 15.33 - samples/sec: 3273.00 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 20:36:59,600 epoch 2 - iter 292/738 - loss 0.14401873 - time (sec): 20.91 - samples/sec: 3264.88 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 20:37:05,153 epoch 2 - iter 365/738 - loss 0.14152811 - time (sec): 26.46 - samples/sec: 3253.39 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 20:37:10,261 epoch 2 - iter 438/738 - loss 0.13657078 - time (sec): 31.57 - samples/sec: 3243.60 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 20:37:15,103 epoch 2 - iter 511/738 - loss 0.13404559 - time (sec): 36.41 - samples/sec: 3230.17 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 20:37:19,610 epoch 2 - iter 584/738 - loss 0.13244950 - time (sec): 40.92 - samples/sec: 3243.80 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:37:24,161 epoch 2 - iter 657/738 - loss 0.12901636 - time (sec): 45.47 - samples/sec: 3265.18 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:37:29,190 epoch 2 - iter 730/738 - loss 0.12733023 - time (sec): 50.49 - samples/sec: 3266.90 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:37:29,645 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:37:29,646 EPOCH 2 done: loss 0.1272 - lr: 0.000027
+2023-10-17 20:37:41,241 DEV : loss 0.11324623972177505 - f1-score (micro avg) 0.7878
+2023-10-17 20:37:41,272 saving best model
+2023-10-17 20:37:41,764 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:37:46,421 epoch 3 - iter 73/738 - loss 0.08078576 - time (sec): 4.66 - samples/sec: 3186.13 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:37:51,451 epoch 3 - iter 146/738 - loss 0.08098219 - time (sec): 9.69 - samples/sec: 3153.82 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:37:56,715 epoch 3 - iter 219/738 - loss 0.07988876 - time (sec): 14.95 - samples/sec: 3177.16 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:38:01,548 epoch 3 - iter 292/738 - loss 0.08049920 - time (sec): 19.78 - samples/sec: 3235.71 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:38:06,420 epoch 3 - iter 365/738 - loss 0.07427187 - time (sec): 24.65 - samples/sec: 3258.39 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:38:11,210 epoch 3 - iter 438/738 - loss 0.07767398 - time (sec): 29.44 - samples/sec: 3263.15 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:38:17,124 epoch 3 - iter 511/738 - loss 0.07609348 - time (sec): 35.36 - samples/sec: 3250.97 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:38:22,355 epoch 3 - iter 584/738 - loss 0.07449298 - time (sec): 40.59 - samples/sec: 3244.97 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:38:27,736 epoch 3 - iter 657/738 - loss 0.07310394 - time (sec): 45.97 - samples/sec: 3231.28 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:38:32,682 epoch 3 - iter 730/738 - loss 0.07307713 - time (sec): 50.92 - samples/sec: 3234.62 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:38:33,210 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:38:33,211 EPOCH 3 done: loss 0.0731 - lr: 0.000023
+2023-10-17 20:38:44,573 DEV : loss 0.10576911270618439 - f1-score (micro avg) 0.8202
+2023-10-17 20:38:44,603 saving best model
+2023-10-17 20:38:45,099 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:38:50,418 epoch 4 - iter 73/738 - loss 0.04149513 - time (sec): 5.31 - samples/sec: 3190.23 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:38:56,360 epoch 4 - iter 146/738 - loss 0.04069771 - time (sec): 11.26 - samples/sec: 3121.34 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:39:01,363 epoch 4 - iter 219/738 - loss 0.04207686 - time (sec): 16.26 - samples/sec: 3158.45 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:39:06,635 epoch 4 - iter 292/738 - loss 0.04550866 - time (sec): 21.53 - samples/sec: 3158.71 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:39:11,345 epoch 4 - iter 365/738 - loss 0.04746098 - time (sec): 26.24 - samples/sec: 3176.94 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:39:15,967 epoch 4 - iter 438/738 - loss 0.04785900 - time (sec): 30.86 - samples/sec: 3208.66 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:39:20,773 epoch 4 - iter 511/738 - loss 0.04796136 - time (sec): 35.67 - samples/sec: 3231.21 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:39:25,854 epoch 4 - iter 584/738 - loss 0.04916438 - time (sec): 40.75 - samples/sec: 3241.96 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:39:31,026 epoch 4 - iter 657/738 - loss 0.04802511 - time (sec): 45.92 - samples/sec: 3255.52 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:39:35,741 epoch 4 - iter 730/738 - loss 0.04827589 - time (sec): 50.64 - samples/sec: 3252.29 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:39:36,300 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:39:36,301 EPOCH 4 done: loss 0.0481 - lr: 0.000020
+2023-10-17 20:39:47,755 DEV : loss 0.14439940452575684 - f1-score (micro avg) 0.8331
+2023-10-17 20:39:47,786 saving best model
+2023-10-17 20:39:48,346 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:39:53,447 epoch 5 - iter 73/738 - loss 0.03820328 - time (sec): 5.10 - samples/sec: 3420.65 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:39:58,482 epoch 5 - iter 146/738 - loss 0.03746306 - time (sec): 10.14 - samples/sec: 3361.46 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 20:40:03,991 epoch 5 - iter 219/738 - loss 0.03939664 - time (sec): 15.64 - samples/sec: 3326.58 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 20:40:09,113 epoch 5 - iter 292/738 - loss 0.03908194 - time (sec): 20.77 - samples/sec: 3301.73 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 20:40:14,194 epoch 5 - iter 365/738 - loss 0.03717953 - time (sec): 25.85 - samples/sec: 3302.44 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:40:19,366 epoch 5 - iter 438/738 - loss 0.03476013 - time (sec): 31.02 - samples/sec: 3294.32 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:40:23,875 epoch 5 - iter 511/738 - loss 0.03534706 - time (sec): 35.53 - samples/sec: 3291.70 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:40:28,537 epoch 5 - iter 584/738 - loss 0.03692831 - time (sec): 40.19 - samples/sec: 3287.11 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:40:33,247 epoch 5 - iter 657/738 - loss 0.03670076 - time (sec): 44.90 - samples/sec: 3298.03 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:40:38,478 epoch 5 - iter 730/738 - loss 0.03777783 - time (sec): 50.13 - samples/sec: 3291.89 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:40:38,944 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:40:38,944 EPOCH 5 done: loss 0.0376 - lr: 0.000017
+2023-10-17 20:40:50,539 DEV : loss 0.1691051870584488 - f1-score (micro avg) 0.8473
+2023-10-17 20:40:50,572 saving best model
+2023-10-17 20:40:51,123 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:40:56,085 epoch 6 - iter 73/738 - loss 0.02494294 - time (sec): 4.96 - samples/sec: 3274.56 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:41:01,232 epoch 6 - iter 146/738 - loss 0.02742836 - time (sec): 10.11 - samples/sec: 3184.58 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:41:05,758 epoch 6 - iter 219/738 - loss 0.02446918 - time (sec): 14.63 - samples/sec: 3219.58 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:41:10,892 epoch 6 - iter 292/738 - loss 0.02160983 - time (sec): 19.77 - samples/sec: 3227.62 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:41:16,184 epoch 6 - iter 365/738 - loss 0.02527468 - time (sec): 25.06 - samples/sec: 3203.28 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:41:20,766 epoch 6 - iter 438/738 - loss 0.02394053 - time (sec): 29.64 - samples/sec: 3234.50 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:41:25,447 epoch 6 - iter 511/738 - loss 0.02404647 - time (sec): 34.32 - samples/sec: 3250.82 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 20:41:30,296 epoch 6 - iter 584/738 - loss 0.02483385 - time (sec): 39.17 - samples/sec: 3245.50 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 20:41:36,103 epoch 6 - iter 657/738 - loss 0.02533529 - time (sec): 44.98 - samples/sec: 3276.03 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 20:41:41,227 epoch 6 - iter 730/738 - loss 0.02453929 - time (sec): 50.10 - samples/sec: 3274.64 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:41:41,964 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:41:41,964 EPOCH 6 done: loss 0.0248 - lr: 0.000013
+2023-10-17 20:41:53,305 DEV : loss 0.18655835092067719 - f1-score (micro avg) 0.8407
+2023-10-17 20:41:53,335 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:41:58,205 epoch 7 - iter 73/738 - loss 0.01138158 - time (sec): 4.87 - samples/sec: 3127.58 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:42:02,805 epoch 7 - iter 146/738 - loss 0.01083901 - time (sec): 9.47 - samples/sec: 3295.87 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:42:07,435 epoch 7 - iter 219/738 - loss 0.01471937 - time (sec): 14.10 - samples/sec: 3243.52 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:42:12,317 epoch 7 - iter 292/738 - loss 0.01463951 - time (sec): 18.98 - samples/sec: 3268.16 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:42:17,142 epoch 7 - iter 365/738 - loss 0.01650812 - time (sec): 23.81 - samples/sec: 3279.92 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:42:22,200 epoch 7 - iter 438/738 - loss 0.01641925 - time (sec): 28.86 - samples/sec: 3325.52 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:42:28,377 epoch 7 - iter 511/738 - loss 0.01890794 - time (sec): 35.04 - samples/sec: 3327.95 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:42:33,160 epoch 7 - iter 584/738 - loss 0.01836577 - time (sec): 39.82 - samples/sec: 3322.87 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:42:38,103 epoch 7 - iter 657/738 - loss 0.01813755 - time (sec): 44.77 - samples/sec: 3319.80 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:42:43,114 epoch 7 - iter 730/738 - loss 0.01811864 - time (sec): 49.78 - samples/sec: 3308.21 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:42:43,689 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:42:43,690 EPOCH 7 done: loss 0.0186 - lr: 0.000010
+2023-10-17 20:42:55,131 DEV : loss 0.1865713894367218 - f1-score (micro avg) 0.8508
+2023-10-17 20:42:55,161 saving best model
+2023-10-17 20:42:55,722 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:43:00,521 epoch 8 - iter 73/738 - loss 0.01395816 - time (sec): 4.79 - samples/sec: 3243.82 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:43:05,444 epoch 8 - iter 146/738 - loss 0.01071471 - time (sec): 9.72 - samples/sec: 3238.91 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:43:10,235 epoch 8 - iter 219/738 - loss 0.01056576 - time (sec): 14.51 - samples/sec: 3260.54 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:43:14,954 epoch 8 - iter 292/738 - loss 0.01129541 - time (sec): 19.23 - samples/sec: 3257.05 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:43:21,213 epoch 8 - iter 365/738 - loss 0.01372072 - time (sec): 25.49 - samples/sec: 3246.49 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:43:27,166 epoch 8 - iter 438/738 - loss 0.01307143 - time (sec): 31.44 - samples/sec: 3231.89 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:43:32,257 epoch 8 - iter 511/738 - loss 0.01235288 - time (sec): 36.53 - samples/sec: 3215.28 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:43:37,396 epoch 8 - iter 584/738 - loss 0.01288183 - time (sec): 41.67 - samples/sec: 3220.07 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:43:42,270 epoch 8 - iter 657/738 - loss 0.01324443 - time (sec): 46.54 - samples/sec: 3209.11 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:43:46,637 epoch 8 - iter 730/738 - loss 0.01347051 - time (sec): 50.91 - samples/sec: 3231.77 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:43:47,173 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:43:47,173 EPOCH 8 done: loss 0.0134 - lr: 0.000007
+2023-10-17 20:43:58,642 DEV : loss 0.18905526399612427 - f1-score (micro avg) 0.8473
+2023-10-17 20:43:58,674 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:44:03,908 epoch 9 - iter 73/738 - loss 0.01099258 - time (sec): 5.23 - samples/sec: 3427.72 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:44:09,018 epoch 9 - iter 146/738 - loss 0.00781828 - time (sec): 10.34 - samples/sec: 3329.22 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:44:13,566 epoch 9 - iter 219/738 - loss 0.00703626 - time (sec): 14.89 - samples/sec: 3350.47 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:44:18,952 epoch 9 - iter 292/738 - loss 0.00747490 - time (sec): 20.28 - samples/sec: 3299.06 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:44:23,717 epoch 9 - iter 365/738 - loss 0.00926992 - time (sec): 25.04 - samples/sec: 3298.05 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:44:28,988 epoch 9 - iter 438/738 - loss 0.00910148 - time (sec): 30.31 - samples/sec: 3298.16 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:44:34,231 epoch 9 - iter 511/738 - loss 0.01035846 - time (sec): 35.56 - samples/sec: 3282.09 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:44:39,225 epoch 9 - iter 584/738 - loss 0.01077101 - time (sec): 40.55 - samples/sec: 3279.83 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:44:44,390 epoch 9 - iter 657/738 - loss 0.01029712 - time (sec): 45.71 - samples/sec: 3280.07 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:44:48,891 epoch 9 - iter 730/738 - loss 0.00965010 - time (sec): 50.22 - samples/sec: 3284.93 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:44:49,402 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:44:49,403 EPOCH 9 done: loss 0.0096 - lr: 0.000003
+2023-10-17 20:45:00,841 DEV : loss 0.19005419313907623 - f1-score (micro avg) 0.8524
+2023-10-17 20:45:00,871 saving best model
+2023-10-17 20:45:01,430 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:45:06,530 epoch 10 - iter 73/738 - loss 0.00555099 - time (sec): 5.09 - samples/sec: 3200.00 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:45:12,684 epoch 10 - iter 146/738 - loss 0.00688196 - time (sec): 11.25 - samples/sec: 3108.59 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:45:17,624 epoch 10 - iter 219/738 - loss 0.00781010 - time (sec): 16.19 - samples/sec: 3112.06 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:45:22,889 epoch 10 - iter 292/738 - loss 0.00705490 - time (sec): 21.45 - samples/sec: 3145.34 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:45:27,707 epoch 10 - iter 365/738 - loss 0.00643071 - time (sec): 26.27 - samples/sec: 3163.51 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:45:32,265 epoch 10 - iter 438/738 - loss 0.00712227 - time (sec): 30.83 - samples/sec: 3201.88 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:45:37,454 epoch 10 - iter 511/738 - loss 0.00670318 - time (sec): 36.02 - samples/sec: 3182.86 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:45:42,386 epoch 10 - iter 584/738 - loss 0.00651760 - time (sec): 40.95 - samples/sec: 3202.32 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:45:47,406 epoch 10 - iter 657/738 - loss 0.00642097 - time (sec): 45.97 - samples/sec: 3209.18 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 20:45:52,833 epoch 10 - iter 730/738 - loss 0.00609764 - time (sec): 51.40 - samples/sec: 3207.79 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 20:45:53,346 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:45:53,346 EPOCH 10 done: loss 0.0060 - lr: 0.000000
+2023-10-17 20:46:05,101 DEV : loss 0.19482703506946564 - f1-score (micro avg) 0.8552
+2023-10-17 20:46:05,137 saving best model
+2023-10-17 20:46:06,066 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:46:06,068 Loading model from best epoch ...
+2023-10-17 20:46:07,473 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
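The 21-tag dictionary above follows a BIOES-style layout: one `O` tag plus four positional prefixes (S, B, E, I) for each of the five entity types. That count matches the tagger's final `Linear(..., out_features=21)` layer. The sketch below rebuilds the list and decodes a tag sequence into spans. Both helpers are hypothetical simplifications and are not Flair's own decoding code; for instance, the decoder does not check that the B and E labels agree.

```python
ENTITY_TYPES = ["loc", "pers", "org", "time", "prod"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside


def build_tags(entity_types: list[str]) -> list[str]:
    """Rebuild the tag dictionary in the order printed in the log."""
    return ["O"] + [f"{p}-{t}" for t in entity_types for p in PREFIXES]


def bioes_to_spans(tags: list[str]) -> list[tuple[int, int, str]]:
    """Decode tags into (start, end, label) spans, end index inclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":
            spans.append((i, i, label))
            start = None
        elif prefix == "B":
            start = i
        elif prefix == "E" and start is not None:
            spans.append((start, i, label))
            start = None
        # "I" tags simply continue an open span.
    return spans


tags = build_tags(ENTITY_TYPES)
print(len(tags), tags[:5])
print(bioes_to_spans(["O", "B-pers", "E-pers", "S-loc"]))
```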
+2023-10-17 20:46:13,965
+Results:
+- F-score (micro) 0.8082
+- F-score (macro) 0.7155
+- Accuracy 0.6981
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.8568    0.8788    0.8677       858
+        pers     0.7727    0.8231    0.7971       537
+         org     0.5970    0.6061    0.6015       132
+        time     0.5625    0.6667    0.6102        54
+        prod     0.7321    0.6721    0.7009        61
+
+   micro avg     0.7931    0.8240    0.8082      1642
+   macro avg     0.7042    0.7293    0.7155      1642
+weighted avg     0.7941    0.8240    0.8085      1642
+
+2023-10-17 20:46:13,965 ----------------------------------------------------------------------------------------------------
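The macro and weighted F1 averages in the report can be re-derived from the per-class rows. This is a minimal sketch that uses the four-decimal values printed in the log, so a recomputed average can differ from the logged one in the last digit. The variable names are illustrative only.

```python
# (class, f1-score, support) copied from the "By class" table in the log.
PER_CLASS = [
    ("loc", 0.8677, 858),
    ("pers", 0.7971, 537),
    ("org", 0.6015, 132),
    ("time", 0.6102, 54),
    ("prod", 0.7009, 61),
]

total_support = sum(support for _, _, support in PER_CLASS)

# Macro average: unweighted mean over classes, so rare classes count equally.
macro_f1 = sum(f1 for _, f1, _ in PER_CLASS) / len(PER_CLASS)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, f1, support in PER_CLASS) / total_support

print(f"support={total_support} macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The macro average (0.7155) sits well below the micro average (0.8082) because the low-support `org` and `time` classes score around 0.60 and carry equal weight in the macro mean.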