Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697560366.4c6324b99746.1390.7 +3 -0
- test.tsv +0 -0
- training.log +243 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5531f0fddf13e81c074c4b1a14ae587ca658eb5215a9bde20d7aa936c3f40965
size 440966725
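The three lines above are a Git LFS pointer rather than the model weights themselves: they record the pointer spec version, the hash of the real file, and its size in bytes. A minimal sketch of reading such a pointer, assuming the plain `key value` layout shown here (the helper name `parse_lfs_pointer` is hypothetical and not part of any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer text copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5531f0fddf13e81c074c4b1a14ae587ca658eb5215a9bde20d7aa936c3f40965
size 440966725
"""

pointer = parse_lfs_pointer(POINTER)
size_mib = pointer["size_bytes"] / 2**20
print(f"{pointer['hash_algorithm']} digest, {size_mib:.1f} MiB")
```

For this pointer the reported size works out to roughly 420 MiB, which is the size of the stored checkpoint and not of the three-line pointer file.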
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
1      16:34:04   0.0000         0.6287      0.2032    0.7117         0.5848      0.6421  0.4816
2      16:35:28   0.0000         0.1522      0.1864    0.6792         0.6638      0.6714  0.5296
3      16:36:53   0.0000         0.0937      0.1743    0.7055         0.7756      0.7389  0.6067
4      16:38:16   0.0000         0.0612      0.1783    0.7409         0.7756      0.7578  0.6367
5      16:39:43   0.0000         0.0403      0.2714    0.7964         0.7553      0.7753  0.6483
6      16:41:08   0.0000         0.0247      0.2380    0.7791         0.7803      0.7797  0.6574
7      16:42:32   0.0000         0.0155      0.2671    0.7842         0.7615      0.7727  0.6446
8      16:43:53   0.0000         0.0115      0.2598    0.7769         0.7897      0.7832  0.6614
9      16:45:18   0.0000         0.0066      0.2585    0.7961         0.7936      0.7948  0.6798
10     16:46:42   0.0000         0.0034      0.2764    0.8093         0.7795      0.7941  0.6741
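The per-epoch table above can be read programmatically to see which epoch scored best on the dev set. The following is a minimal sketch, assuming the rows are whitespace- or tab-separated exactly as shown; the helper `read_loss_table` is a hypothetical name, and the data is copied from the table rather than loaded from disk:

```python
# Rows copied from loss.tsv above; str.split() handles tabs and spaces alike.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 16:34:04 0.0000 0.6287 0.2032 0.7117 0.5848 0.6421 0.4816
2 16:35:28 0.0000 0.1522 0.1864 0.6792 0.6638 0.6714 0.5296
3 16:36:53 0.0000 0.0937 0.1743 0.7055 0.7756 0.7389 0.6067
4 16:38:16 0.0000 0.0612 0.1783 0.7409 0.7756 0.7578 0.6367
5 16:39:43 0.0000 0.0403 0.2714 0.7964 0.7553 0.7753 0.6483
6 16:41:08 0.0000 0.0247 0.2380 0.7791 0.7803 0.7797 0.6574
7 16:42:32 0.0000 0.0155 0.2671 0.7842 0.7615 0.7727 0.6446
8 16:43:53 0.0000 0.0115 0.2598 0.7769 0.7897 0.7832 0.6614
9 16:45:18 0.0000 0.0066 0.2585 0.7961 0.7936 0.7948 0.6798
10 16:46:42 0.0000 0.0034 0.2764 0.8093 0.7795 0.7941 0.6741
"""


def read_loss_table(text: str) -> list[dict]:
    """Turn the header line plus data lines into a list of row dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = read_loss_table(LOSS_TSV)

# Epoch with the highest dev F1 and epoch with the lowest dev loss.
best_f1_row = max(rows, key=lambda r: float(r["DEV_F1"]))
lowest_loss_row = min(rows, key=lambda r: float(r["DEV_LOSS"]))
best_f1_epoch = int(best_f1_row["EPOCH"])
lowest_dev_loss_epoch = int(lowest_loss_row["EPOCH"])

print(f"best DEV_F1 {best_f1_row['DEV_F1']} at epoch {best_f1_epoch}")
print(f"lowest DEV_LOSS {lowest_loss_row['DEV_LOSS']} at epoch {lowest_dev_loss_epoch}")
```

The two criteria pick different epochs here: dev loss bottoms out early while dev F1 keeps improving, which is why selecting the checkpoint by micro F1 (as this run does) matters.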
runs/events.out.tfevents.1697560366.4c6324b99746.1390.7
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ead97e29e9a97f2076b815aaaaabe03fb481953667f4a12968cd4c91a0bd77ed
size 502124
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,243 @@
2023-10-17 16:32:46,824 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,826 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:32:46,826 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,826 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 16:32:46,826 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,826 Train: 3575 sentences
2023-10-17 16:32:46,827 (train_with_dev=False, train_with_test=False)
2023-10-17 16:32:46,827 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,827 Training Params:
2023-10-17 16:32:46,827 - learning_rate: "5e-05"
2023-10-17 16:32:46,827 - mini_batch_size: "4"
2023-10-17 16:32:46,827 - max_epochs: "10"
2023-10-17 16:32:46,827 - shuffle: "True"
2023-10-17 16:32:46,827 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,827 Plugins:
2023-10-17 16:32:46,827 - TensorboardLogger
2023-10-17 16:32:46,827 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:32:46,827 ----------------------------------------------------------------------------------------------------
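The parameters logged above (3575 training sentences, mini-batch size 4, 10 epochs, peak learning rate 5e-05, linear scheduler with a warmup fraction of 0.1) determine the `iter .../894` counter and the `lr` column in the epoch lines of this log. The sketch below is an assumption about the schedule's shape, namely a standard linear warmup followed by linear decay to zero; it is not taken from the training library's source, and `linear_schedule_lr` is a hypothetical helper name:

```python
import math


def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (assumed shape)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: 3575 sentences, batch size 4, 10 epochs.
steps_per_epoch = math.ceil(3575 / 4)
total_steps = steps_per_epoch * 10

# Spot-check a few (epoch, iteration) pairs that appear in the log.
for epoch, iteration in [(1, 89), (1, 890), (2, 890), (10, 890)]:
    step = (epoch - 1) * steps_per_epoch + iteration
    lr = linear_schedule_lr(step, total_steps, 5e-05, 0.1)
    print(f"epoch {epoch} iter {iteration}/{steps_per_epoch}: lr {lr:.6f}")
```

Under these assumptions the warmup spans exactly the first epoch, and the printed values agree to six decimals with the logged `lr` at the same iterations (for example 0.000005 at epoch 1, iter 89).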
2023-10-17 16:32:46,827 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:32:46,827 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:32:46,827 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,828 Computation:
2023-10-17 16:32:46,828 - compute on device: cuda:0
2023-10-17 16:32:46,828 - embedding storage: none
2023-10-17 16:32:46,828 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,828 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 16:32:46,828 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,828 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:46,828 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:32:53,873 epoch 1 - iter 89/894 - loss 3.11410852 - time (sec): 7.04 - samples/sec: 1268.20 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:33:00,942 epoch 1 - iter 178/894 - loss 1.92954730 - time (sec): 14.11 - samples/sec: 1229.02 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:33:08,109 epoch 1 - iter 267/894 - loss 1.46044506 - time (sec): 21.28 - samples/sec: 1176.16 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:33:15,623 epoch 1 - iter 356/894 - loss 1.15857887 - time (sec): 28.79 - samples/sec: 1202.71 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:33:22,680 epoch 1 - iter 445/894 - loss 0.99646697 - time (sec): 35.85 - samples/sec: 1201.91 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:33:29,890 epoch 1 - iter 534/894 - loss 0.88147494 - time (sec): 43.06 - samples/sec: 1200.57 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:33:36,882 epoch 1 - iter 623/894 - loss 0.79966928 - time (sec): 50.05 - samples/sec: 1191.36 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:33:43,748 epoch 1 - iter 712/894 - loss 0.73134493 - time (sec): 56.92 - samples/sec: 1203.05 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:33:50,870 epoch 1 - iter 801/894 - loss 0.67281431 - time (sec): 64.04 - samples/sec: 1216.42 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:33:57,724 epoch 1 - iter 890/894 - loss 0.62977212 - time (sec): 70.89 - samples/sec: 1216.64 - lr: 0.000050 - momentum: 0.000000
2023-10-17 16:33:58,031 ----------------------------------------------------------------------------------------------------
2023-10-17 16:33:58,032 EPOCH 1 done: loss 0.6287 - lr: 0.000050
2023-10-17 16:34:04,362 DEV : loss 0.20323888957500458 - f1-score (micro avg) 0.6421
2023-10-17 16:34:04,418 saving best model
2023-10-17 16:34:04,952 ----------------------------------------------------------------------------------------------------
2023-10-17 16:34:11,744 epoch 2 - iter 89/894 - loss 0.16949173 - time (sec): 6.79 - samples/sec: 1248.93 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:34:19,004 epoch 2 - iter 178/894 - loss 0.17515983 - time (sec): 14.05 - samples/sec: 1281.61 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:34:25,969 epoch 2 - iter 267/894 - loss 0.17426985 - time (sec): 21.01 - samples/sec: 1246.46 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:34:33,351 epoch 2 - iter 356/894 - loss 0.17013608 - time (sec): 28.40 - samples/sec: 1246.77 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:34:40,525 epoch 2 - iter 445/894 - loss 0.16745688 - time (sec): 35.57 - samples/sec: 1239.00 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:34:47,700 epoch 2 - iter 534/894 - loss 0.16144297 - time (sec): 42.75 - samples/sec: 1217.63 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:34:54,935 epoch 2 - iter 623/894 - loss 0.15623398 - time (sec): 49.98 - samples/sec: 1222.12 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:35:02,179 epoch 2 - iter 712/894 - loss 0.15158756 - time (sec): 57.23 - samples/sec: 1228.90 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:35:09,277 epoch 2 - iter 801/894 - loss 0.15137973 - time (sec): 64.32 - samples/sec: 1219.42 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:35:16,421 epoch 2 - iter 890/894 - loss 0.15238150 - time (sec): 71.47 - samples/sec: 1207.65 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:35:16,734 ----------------------------------------------------------------------------------------------------
2023-10-17 16:35:16,735 EPOCH 2 done: loss 0.1522 - lr: 0.000044
2023-10-17 16:35:28,076 DEV : loss 0.18639299273490906 - f1-score (micro avg) 0.6714
2023-10-17 16:35:28,132 saving best model
2023-10-17 16:35:29,529 ----------------------------------------------------------------------------------------------------
2023-10-17 16:35:36,530 epoch 3 - iter 89/894 - loss 0.11203303 - time (sec): 7.00 - samples/sec: 1150.94 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:35:43,592 epoch 3 - iter 178/894 - loss 0.10878561 - time (sec): 14.06 - samples/sec: 1194.19 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:35:50,849 epoch 3 - iter 267/894 - loss 0.09676878 - time (sec): 21.32 - samples/sec: 1209.94 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:35:57,786 epoch 3 - iter 356/894 - loss 0.09079084 - time (sec): 28.25 - samples/sec: 1205.20 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:36:04,925 epoch 3 - iter 445/894 - loss 0.08840102 - time (sec): 35.39 - samples/sec: 1221.78 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:36:12,130 epoch 3 - iter 534/894 - loss 0.09039810 - time (sec): 42.60 - samples/sec: 1208.26 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:36:19,527 epoch 3 - iter 623/894 - loss 0.08822105 - time (sec): 49.99 - samples/sec: 1209.13 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:36:26,922 epoch 3 - iter 712/894 - loss 0.08881662 - time (sec): 57.39 - samples/sec: 1201.43 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:36:34,295 epoch 3 - iter 801/894 - loss 0.09266436 - time (sec): 64.76 - samples/sec: 1194.34 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:36:41,729 epoch 3 - iter 890/894 - loss 0.09358859 - time (sec): 72.20 - samples/sec: 1194.00 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:36:42,065 ----------------------------------------------------------------------------------------------------
2023-10-17 16:36:42,065 EPOCH 3 done: loss 0.0937 - lr: 0.000039
2023-10-17 16:36:53,475 DEV : loss 0.17432451248168945 - f1-score (micro avg) 0.7389
2023-10-17 16:36:53,531 saving best model
2023-10-17 16:36:54,922 ----------------------------------------------------------------------------------------------------
2023-10-17 16:37:02,198 epoch 4 - iter 89/894 - loss 0.05862225 - time (sec): 7.27 - samples/sec: 1317.64 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:37:09,386 epoch 4 - iter 178/894 - loss 0.05796894 - time (sec): 14.46 - samples/sec: 1320.39 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:37:16,288 epoch 4 - iter 267/894 - loss 0.05799161 - time (sec): 21.36 - samples/sec: 1272.90 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:37:23,185 epoch 4 - iter 356/894 - loss 0.05680783 - time (sec): 28.26 - samples/sec: 1239.46 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:37:30,107 epoch 4 - iter 445/894 - loss 0.05803381 - time (sec): 35.18 - samples/sec: 1236.96 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:37:37,232 epoch 4 - iter 534/894 - loss 0.05629713 - time (sec): 42.31 - samples/sec: 1236.71 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:37:44,144 epoch 4 - iter 623/894 - loss 0.05739990 - time (sec): 49.22 - samples/sec: 1231.42 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:37:51,253 epoch 4 - iter 712/894 - loss 0.05819088 - time (sec): 56.33 - samples/sec: 1231.58 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:37:58,226 epoch 4 - iter 801/894 - loss 0.05824462 - time (sec): 63.30 - samples/sec: 1229.49 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:38:05,069 epoch 4 - iter 890/894 - loss 0.06068208 - time (sec): 70.14 - samples/sec: 1228.00 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:38:05,382 ----------------------------------------------------------------------------------------------------
2023-10-17 16:38:05,382 EPOCH 4 done: loss 0.0612 - lr: 0.000033
2023-10-17 16:38:16,763 DEV : loss 0.17829285562038422 - f1-score (micro avg) 0.7578
2023-10-17 16:38:16,817 saving best model
2023-10-17 16:38:18,207 ----------------------------------------------------------------------------------------------------
2023-10-17 16:38:25,151 epoch 5 - iter 89/894 - loss 0.02982003 - time (sec): 6.94 - samples/sec: 1208.40 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:38:32,390 epoch 5 - iter 178/894 - loss 0.03304612 - time (sec): 14.18 - samples/sec: 1271.34 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:38:39,495 epoch 5 - iter 267/894 - loss 0.03586922 - time (sec): 21.28 - samples/sec: 1258.57 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:38:47,102 epoch 5 - iter 356/894 - loss 0.04145233 - time (sec): 28.89 - samples/sec: 1216.66 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:38:54,547 epoch 5 - iter 445/894 - loss 0.04280853 - time (sec): 36.33 - samples/sec: 1189.27 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:39:02,581 epoch 5 - iter 534/894 - loss 0.04154607 - time (sec): 44.37 - samples/sec: 1176.45 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:39:09,838 epoch 5 - iter 623/894 - loss 0.04203572 - time (sec): 51.63 - samples/sec: 1175.56 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:39:17,085 epoch 5 - iter 712/894 - loss 0.04172114 - time (sec): 58.87 - samples/sec: 1184.70 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:39:24,099 epoch 5 - iter 801/894 - loss 0.04233064 - time (sec): 65.89 - samples/sec: 1182.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:39:31,558 epoch 5 - iter 890/894 - loss 0.04024411 - time (sec): 73.35 - samples/sec: 1176.40 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:39:31,872 ----------------------------------------------------------------------------------------------------
2023-10-17 16:39:31,872 EPOCH 5 done: loss 0.0403 - lr: 0.000028
2023-10-17 16:39:43,356 DEV : loss 0.2713957130908966 - f1-score (micro avg) 0.7753
2023-10-17 16:39:43,411 saving best model
2023-10-17 16:39:44,800 ----------------------------------------------------------------------------------------------------
2023-10-17 16:39:51,905 epoch 6 - iter 89/894 - loss 0.03784356 - time (sec): 7.10 - samples/sec: 1252.78 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:39:59,301 epoch 6 - iter 178/894 - loss 0.03078874 - time (sec): 14.50 - samples/sec: 1213.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:40:06,519 epoch 6 - iter 267/894 - loss 0.02968316 - time (sec): 21.72 - samples/sec: 1191.18 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:40:13,613 epoch 6 - iter 356/894 - loss 0.03060611 - time (sec): 28.81 - samples/sec: 1188.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:40:21,175 epoch 6 - iter 445/894 - loss 0.02585728 - time (sec): 36.37 - samples/sec: 1182.02 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:40:28,273 epoch 6 - iter 534/894 - loss 0.02507947 - time (sec): 43.47 - samples/sec: 1177.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:40:35,216 epoch 6 - iter 623/894 - loss 0.02444122 - time (sec): 50.41 - samples/sec: 1173.00 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:40:42,613 epoch 6 - iter 712/894 - loss 0.02585665 - time (sec): 57.81 - samples/sec: 1180.03 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:40:49,698 epoch 6 - iter 801/894 - loss 0.02528235 - time (sec): 64.89 - samples/sec: 1180.84 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:40:57,157 epoch 6 - iter 890/894 - loss 0.02483238 - time (sec): 72.35 - samples/sec: 1191.34 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:40:57,481 ----------------------------------------------------------------------------------------------------
2023-10-17 16:40:57,481 EPOCH 6 done: loss 0.0247 - lr: 0.000022
2023-10-17 16:41:08,509 DEV : loss 0.23799937963485718 - f1-score (micro avg) 0.7797
2023-10-17 16:41:08,567 saving best model
2023-10-17 16:41:09,970 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:17,048 epoch 7 - iter 89/894 - loss 0.01634402 - time (sec): 7.07 - samples/sec: 1230.27 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:41:23,790 epoch 7 - iter 178/894 - loss 0.01132312 - time (sec): 13.82 - samples/sec: 1189.82 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:41:30,778 epoch 7 - iter 267/894 - loss 0.01278808 - time (sec): 20.80 - samples/sec: 1202.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:41:37,945 epoch 7 - iter 356/894 - loss 0.01168855 - time (sec): 27.97 - samples/sec: 1219.29 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:41:44,979 epoch 7 - iter 445/894 - loss 0.01495033 - time (sec): 35.00 - samples/sec: 1220.15 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:41:52,075 epoch 7 - iter 534/894 - loss 0.01544760 - time (sec): 42.10 - samples/sec: 1226.15 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:41:59,111 epoch 7 - iter 623/894 - loss 0.01402268 - time (sec): 49.14 - samples/sec: 1224.39 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:42:06,783 epoch 7 - iter 712/894 - loss 0.01470202 - time (sec): 56.81 - samples/sec: 1220.18 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:42:13,897 epoch 7 - iter 801/894 - loss 0.01567910 - time (sec): 63.92 - samples/sec: 1222.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:42:20,822 epoch 7 - iter 890/894 - loss 0.01542654 - time (sec): 70.85 - samples/sec: 1215.22 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:42:21,144 ----------------------------------------------------------------------------------------------------
2023-10-17 16:42:21,144 EPOCH 7 done: loss 0.0155 - lr: 0.000017
2023-10-17 16:42:32,075 DEV : loss 0.2671242356300354 - f1-score (micro avg) 0.7727
2023-10-17 16:42:32,135 ----------------------------------------------------------------------------------------------------
2023-10-17 16:42:38,984 epoch 8 - iter 89/894 - loss 0.00642232 - time (sec): 6.85 - samples/sec: 1286.01 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:42:45,825 epoch 8 - iter 178/894 - loss 0.01547951 - time (sec): 13.69 - samples/sec: 1244.24 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:42:52,646 epoch 8 - iter 267/894 - loss 0.01183362 - time (sec): 20.51 - samples/sec: 1237.58 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:42:59,577 epoch 8 - iter 356/894 - loss 0.01316590 - time (sec): 27.44 - samples/sec: 1224.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:43:06,861 epoch 8 - iter 445/894 - loss 0.01148861 - time (sec): 34.72 - samples/sec: 1255.91 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:43:13,946 epoch 8 - iter 534/894 - loss 0.01218895 - time (sec): 41.81 - samples/sec: 1255.45 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:43:20,701 epoch 8 - iter 623/894 - loss 0.01256129 - time (sec): 48.56 - samples/sec: 1269.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:43:27,458 epoch 8 - iter 712/894 - loss 0.01163940 - time (sec): 55.32 - samples/sec: 1259.03 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:43:34,177 epoch 8 - iter 801/894 - loss 0.01183209 - time (sec): 62.04 - samples/sec: 1257.71 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:43:40,862 epoch 8 - iter 890/894 - loss 0.01144474 - time (sec): 68.72 - samples/sec: 1255.86 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:43:41,154 ----------------------------------------------------------------------------------------------------
2023-10-17 16:43:41,155 EPOCH 8 done: loss 0.0115 - lr: 0.000011
2023-10-17 16:43:52,982 DEV : loss 0.25983676314353943 - f1-score (micro avg) 0.7832
2023-10-17 16:43:53,059 saving best model
2023-10-17 16:43:54,471 ----------------------------------------------------------------------------------------------------
2023-10-17 16:44:01,588 epoch 9 - iter 89/894 - loss 0.00948902 - time (sec): 7.11 - samples/sec: 1194.84 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:44:08,811 epoch 9 - iter 178/894 - loss 0.00788875 - time (sec): 14.34 - samples/sec: 1195.01 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:44:16,000 epoch 9 - iter 267/894 - loss 0.00936076 - time (sec): 21.53 - samples/sec: 1154.95 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:44:23,123 epoch 9 - iter 356/894 - loss 0.00847747 - time (sec): 28.65 - samples/sec: 1180.69 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:44:30,220 epoch 9 - iter 445/894 - loss 0.00842446 - time (sec): 35.75 - samples/sec: 1194.06 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:44:37,433 epoch 9 - iter 534/894 - loss 0.00877574 - time (sec): 42.96 - samples/sec: 1201.69 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:44:44,464 epoch 9 - iter 623/894 - loss 0.00788580 - time (sec): 49.99 - samples/sec: 1201.20 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:44:51,571 epoch 9 - iter 712/894 - loss 0.00767381 - time (sec): 57.10 - samples/sec: 1204.23 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:44:58,906 epoch 9 - iter 801/894 - loss 0.00698938 - time (sec): 64.43 - samples/sec: 1205.26 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:45:06,127 epoch 9 - iter 890/894 - loss 0.00663373 - time (sec): 71.65 - samples/sec: 1203.66 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:45:06,433 ----------------------------------------------------------------------------------------------------
2023-10-17 16:45:06,433 EPOCH 9 done: loss 0.0066 - lr: 0.000006
2023-10-17 16:45:18,138 DEV : loss 0.2585032284259796 - f1-score (micro avg) 0.7948
2023-10-17 16:45:18,200 saving best model
2023-10-17 16:45:19,640 ----------------------------------------------------------------------------------------------------
2023-10-17 16:45:26,775 epoch 10 - iter 89/894 - loss 0.00559870 - time (sec): 7.13 - samples/sec: 1289.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:45:33,768 epoch 10 - iter 178/894 - loss 0.00539965 - time (sec): 14.12 - samples/sec: 1226.33 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:45:40,756 epoch 10 - iter 267/894 - loss 0.00432438 - time (sec): 21.11 - samples/sec: 1202.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:45:47,738 epoch 10 - iter 356/894 - loss 0.00382682 - time (sec): 28.09 - samples/sec: 1211.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:45:54,757 epoch 10 - iter 445/894 - loss 0.00379707 - time (sec): 35.11 - samples/sec: 1216.64 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:46:02,045 epoch 10 - iter 534/894 - loss 0.00433668 - time (sec): 42.40 - samples/sec: 1232.01 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:46:09,021 epoch 10 - iter 623/894 - loss 0.00409876 - time (sec): 49.38 - samples/sec: 1213.61 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:46:16,175 epoch 10 - iter 712/894 - loss 0.00359853 - time (sec): 56.53 - samples/sec: 1215.64 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:46:23,161 epoch 10 - iter 801/894 - loss 0.00367913 - time (sec): 63.52 - samples/sec: 1210.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:46:30,292 epoch 10 - iter 890/894 - loss 0.00340485 - time (sec): 70.65 - samples/sec: 1218.69 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:46:30,607 ----------------------------------------------------------------------------------------------------
2023-10-17 16:46:30,607 EPOCH 10 done: loss 0.0034 - lr: 0.000000
2023-10-17 16:46:42,254 DEV : loss 0.27636414766311646 - f1-score (micro avg) 0.7941
2023-10-17 16:46:42,844 ----------------------------------------------------------------------------------------------------
2023-10-17 16:46:42,846 Loading model from best epoch ...
2023-10-17 16:46:45,143 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
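The 21 tags listed above appear to follow a BIOES-style layout: one `O` tag plus the four position prefixes `S`, `B`, `E`, `I` for each of the five entity types, which also matches `out_features=21` on the tagger's linear layer. A small sketch that rebuilds the list under that reading (the constant names are illustrative only):

```python
# Entity types and position prefixes, in the order they appear in the log line.
ENTITY_TYPES = ["loc", "pers", "org", "prod", "time"]
POSITIONS = ["S", "B", "E", "I"]  # single, begin, end, inside

# One "outside" tag plus every prefix/type combination.
tags = ["O"] + [f"{pos}-{ent}" for ent in ENTITY_TYPES for pos in POSITIONS]

print(len(tags))
print(", ".join(tags))
```

The count is 1 + 4 × 5 = 21, so adding or removing an entity type would change the output dimension of the classification layer by four.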
2023-10-17 16:46:51,229
Results:
- F-score (micro) 0.7627
- F-score (macro) 0.6782
- Accuracy 0.6355

By class:
              precision    recall  f1-score   support

         loc     0.8344    0.8540    0.8441       596
        pers     0.7230    0.7838    0.7522       333
         org     0.5345    0.4697    0.5000       132
        prod     0.6909    0.5758    0.6281        66
        time     0.6600    0.6735    0.6667        49

   micro avg     0.7576    0.7679    0.7627      1176
   macro avg     0.6886    0.6713    0.6782      1176
weighted avg     0.7539    0.7679    0.7599      1176

2023-10-17 16:46:51,230 ----------------------------------------------------------------------------------------------------
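The three average rows in the report above can be rederived from the per-class rows. The sketch below assumes the usual definitions: macro F1 is the unweighted mean of class F1 scores, weighted F1 is the support-weighted mean, and micro scores pool true positives and predictions across classes. The true-positive and prediction counts are not in the log; they are reconstructed here by rounding from the four-decimal precision and recall, so treat them as estimates.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
REPORT = {
    "loc":  (0.8344, 0.8540, 0.8441, 596),
    "pers": (0.7230, 0.7838, 0.7522, 333),
    "org":  (0.5345, 0.4697, 0.5000, 132),
    "prod": (0.6909, 0.5758, 0.6281, 66),
    "time": (0.6600, 0.6735, 0.6667, 49),
}

total_tp = total_predicted = total_gold = 0
for precision, recall, _f1, support in REPORT.values():
    # Estimated counts: recall = tp / gold, precision = tp / predicted.
    tp = round(recall * support)
    predicted = round(tp / precision)
    total_tp += tp
    total_predicted += predicted
    total_gold += support

# Micro scores pool the counts over all classes.
micro_p = total_tp / total_predicted
micro_r = total_tp / total_gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro: plain mean of class F1; weighted: mean weighted by support.
macro_f1 = sum(f1 for _, _, f1, _ in REPORT.values()) / len(REPORT)
weighted_f1 = sum(f1 * s for _, _, f1, s in REPORT.values()) / total_gold

print(f"micro    P {micro_p:.4f}  R {micro_r:.4f}  F1 {micro_f1:.4f}")
print(f"macro    F1 {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
```

The gap between micro and macro F1 reflects class imbalance: `loc` and `pers` dominate the 1176 gold spans, so the weaker `org` and `prod` classes pull the macro average down more than the micro one.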