Theoreticallyhugo committed on
Commit a0d3396
1 Parent(s): d2113ac

trainer: training complete at 2024-03-03 19:45:18.187762.

Files changed (3)
  1. README.md +88 -0
  2. meta_data/README_s42_e5.md +88 -0
  3. model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,88 @@
+ ---
+ license: bigscience-bloom-rail-1.0
+ base_model: bigscience/bloom-560m
+ tags:
+ - generated_from_trainer
+ datasets:
+ - essays_su_g
+ metrics:
+ - accuracy
+ model-index:
+ - name: bloom-full_labels
+   results:
+   - task:
+       name: Token Classification
+       type: token-classification
+     dataset:
+       name: essays_su_g
+       type: essays_su_g
+       config: full_labels
+       split: train[0%:20%]
+       args: full_labels
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.7978079994653657
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # bloom-full_labels
+
+ This model is a fine-tuned version of [bigscience/bloom-560m](https://huggingface.co/bigscience/bloom-560m) on the essays_su_g dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.7047
+ - B-claim: {'precision': 0.4620938628158845, 'recall': 0.4507042253521127, 'f1-score': 0.45632798573975053, 'support': 284.0}
+ - B-majorclaim: {'precision': 0.7, 'recall': 0.5957446808510638, 'f1-score': 0.6436781609195402, 'support': 141.0}
+ - B-premise: {'precision': 0.6952247191011236, 'recall': 0.6991525423728814, 'f1-score': 0.6971830985915493, 'support': 708.0}
+ - I-claim: {'precision': 0.5342320909331219, 'recall': 0.48441994247363374, 'f1-score': 0.5081081081081082, 'support': 4172.0}
+ - I-majorclaim: {'precision': 0.7541263517359135, 'recall': 0.6379393355801637, 'f1-score': 0.6911841418883673, 'support': 2077.0}
+ - I-premise: {'precision': 0.8258639910813824, 'recall': 0.8874690519926524, 'f1-score': 0.8555589775177087, 'support': 12521.0}
+ - O: {'precision': 0.886796294411076, 'recall': 0.8690143655227454, 'f1-score': 0.8778152869451302, 'support': 10024.0}
+ - Accuracy: 0.7978
+ - Macro avg: {'precision': 0.6940481871540717, 'recall': 0.660634877735036, 'f1-score': 0.6756936799585934, 'support': 29927.0}
+ - Weighted avg: {'precision': 0.7935035105957297, 'recall': 0.7978079994653657, 'f1-score': 0.7946353555655076, 'support': 29927.0}
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 5
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | B-claim | B-majorclaim | B-premise | I-claim | I-majorclaim | I-premise | O | Accuracy | Macro avg | Weighted avg |
+ |:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:--------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
+ | No log | 1.0 | 81 | 0.7937 | {'precision': 0.3116883116883117, 'recall': 0.2535211267605634, 'f1-score': 0.2796116504854369, 'support': 284.0} | {'precision': 0.17391304347826086, 'recall': 0.028368794326241134, 'f1-score': 0.04878048780487805, 'support': 141.0} | {'precision': 0.5714285714285714, 'recall': 0.4689265536723164, 'f1-score': 0.5151280062063615, 'support': 708.0} | {'precision': 0.5458064516129032, 'recall': 0.1013902205177373, 'f1-score': 0.17101273499090358, 'support': 4172.0} | {'precision': 0.4496436318562132, 'recall': 0.698603755416466, 'f1-score': 0.547134238310709, 'support': 2077.0} | {'precision': 0.7235728757001549, 'recall': 0.9698107179937705, 'f1-score': 0.82878886120875, 'support': 12521.0} | {'precision': 0.9145402022147328, 'recall': 0.7579808459696727, 'f1-score': 0.8289330133100589, 'support': 10024.0} | 0.7359 | {'precision': 0.5272275839970211, 'recall': 0.4683717163795382, 'f1-score': 0.4599127131881569, 'support': 29927.0} | {'precision': 0.7336463378005765, 'recall': 0.7358906672904066, 'f1-score': 0.7012848326220683, 'support': 29927.0} |
+ | No log | 2.0 | 162 | 0.8594 | {'precision': 0.3852813852813853, 'recall': 0.31338028169014087, 'f1-score': 0.34563106796116505, 'support': 284.0} | {'precision': 0.5, 'recall': 0.05673758865248227, 'f1-score': 0.10191082802547771, 'support': 141.0} | {'precision': 0.555984555984556, 'recall': 0.6101694915254238, 'f1-score': 0.5818181818181819, 'support': 708.0} | {'precision': 0.5365853658536586, 'recall': 0.015819750719079578, 'f1-score': 0.030733410942956924, 'support': 4172.0} | {'precision': 0.6063059224541969, 'recall': 0.6851227732306211, 'f1-score': 0.6433092224231466, 'support': 2077.0} | {'precision': 0.7233196891499081, 'recall': 0.9738040092644358, 'f1-score': 0.8300769283137042, 'support': 12521.0} | {'precision': 0.8663324979114453, 'recall': 0.8276137270550679, 'f1-score': 0.8465306122448979, 'support': 10024.0} | 0.7521 | {'precision': 0.5962584880907357, 'recall': 0.4975210888767502, 'f1-score': 0.48285860738993286, 'support': 29927.0} | {'precision': 0.7288499118938129, 'recall': 0.7520633541617937, 'f1-score': 0.6972915776644995, 'support': 29927.0} |
+ | No log | 3.0 | 243 | 0.6374 | {'precision': 0.4406779661016949, 'recall': 0.2746478873239437, 'f1-score': 0.33839479392624733, 'support': 284.0} | {'precision': 0.6890756302521008, 'recall': 0.5815602836879432, 'f1-score': 0.6307692307692307, 'support': 141.0} | {'precision': 0.6152125279642058, 'recall': 0.7768361581920904, 'f1-score': 0.6866416978776528, 'support': 708.0} | {'precision': 0.43018637335777576, 'recall': 0.6749760306807286, 'f1-score': 0.5254711699944019, 'support': 4172.0} | {'precision': 0.7759119861030689, 'recall': 0.6451612903225806, 'f1-score': 0.7045215562565721, 'support': 2077.0} | {'precision': 0.8966225233548917, 'recall': 0.6975481191598115, 'f1-score': 0.7846554667145808, 'support': 12521.0} | {'precision': 0.8359600857968852, 'recall': 0.8942537909018355, 'f1-score': 0.8641249337253579, 'support': 10024.0} | 0.7540 | {'precision': 0.6690924418472318, 'recall': 0.6492833657527048, 'f1-score': 0.6477969784662919, 'support': 29927.0} | {'precision': 0.7909400854003533, 'recall': 0.7539679887726802, 'f1-score': 0.7623016451053796, 'support': 29927.0} |
+ | No log | 4.0 | 324 | 0.6704 | {'precision': 0.49489795918367346, 'recall': 0.3415492957746479, 'f1-score': 0.4041666666666667, 'support': 284.0} | {'precision': 0.7155172413793104, 'recall': 0.5886524822695035, 'f1-score': 0.6459143968871596, 'support': 141.0} | {'precision': 0.6989869753979739, 'recall': 0.6822033898305084, 'f1-score': 0.6904932094353109, 'support': 708.0} | {'precision': 0.6432561851556265, 'recall': 0.38638542665388304, 'f1-score': 0.4827792752321055, 'support': 4172.0} | {'precision': 0.6661024121878968, 'recall': 0.757823784304285, 'f1-score': 0.7090090090090089, 'support': 2077.0} | {'precision': 0.8252104563579974, 'recall': 0.8925005989936906, 'f1-score': 0.8575375052756781, 'support': 12521.0} | {'precision': 0.8499952439836393, 'recall': 0.8914604948124502, 'f1-score': 0.8702342114232848, 'support': 10024.0} | 0.8006 | {'precision': 0.6991380676637311, 'recall': 0.6486536389484242, 'f1-score': 0.6657334677041735, 'support': 29927.0} | {'precision': 0.7904665918521213, 'recall': 0.8006148294182511, 'f1-score': 0.7899872403655044, 'support': 29927.0} |
+ | No log | 5.0 | 405 | 0.7047 | {'precision': 0.4620938628158845, 'recall': 0.4507042253521127, 'f1-score': 0.45632798573975053, 'support': 284.0} | {'precision': 0.7, 'recall': 0.5957446808510638, 'f1-score': 0.6436781609195402, 'support': 141.0} | {'precision': 0.6952247191011236, 'recall': 0.6991525423728814, 'f1-score': 0.6971830985915493, 'support': 708.0} | {'precision': 0.5342320909331219, 'recall': 0.48441994247363374, 'f1-score': 0.5081081081081082, 'support': 4172.0} | {'precision': 0.7541263517359135, 'recall': 0.6379393355801637, 'f1-score': 0.6911841418883673, 'support': 2077.0} | {'precision': 0.8258639910813824, 'recall': 0.8874690519926524, 'f1-score': 0.8555589775177087, 'support': 12521.0} | {'precision': 0.886796294411076, 'recall': 0.8690143655227454, 'f1-score': 0.8778152869451302, 'support': 10024.0} | 0.7978 | {'precision': 0.6940481871540717, 'recall': 0.660634877735036, 'f1-score': 0.6756936799585934, 'support': 29927.0} | {'precision': 0.7935035105957297, 'recall': 0.7978079994653657, 'f1-score': 0.7946353555655076, 'support': 29927.0} |
+
+
+ ### Framework versions
+
+ - Transformers 4.37.2
+ - Pytorch 2.2.0+cu121
+ - Datasets 2.17.0
+ - Tokenizers 0.15.2
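
Review note on the aggregate rows in the card above: the "Macro avg" and "Weighted avg" entries can be reproduced from the seven per-class rows. The sketch below is only an illustration, not part of the commit. It assumes the dictionaries follow the usual classification-report convention (macro = unweighted mean over classes, weighted = mean weighted by `support`), which the card does not state explicitly. The `report` name and the helper functions are hypothetical, and precision is omitted for brevity.

```python
# Final-epoch per-class metrics, copied from the evaluation list in the card.
report = {
    "B-claim":      {"recall": 0.4507042253521127,  "f1-score": 0.45632798573975053, "support": 284.0},
    "B-majorclaim": {"recall": 0.5957446808510638,  "f1-score": 0.6436781609195402,  "support": 141.0},
    "B-premise":    {"recall": 0.6991525423728814,  "f1-score": 0.6971830985915493,  "support": 708.0},
    "I-claim":      {"recall": 0.48441994247363374, "f1-score": 0.5081081081081082,  "support": 4172.0},
    "I-majorclaim": {"recall": 0.6379393355801637,  "f1-score": 0.6911841418883673,  "support": 2077.0},
    "I-premise":    {"recall": 0.8874690519926524,  "f1-score": 0.8555589775177087,  "support": 12521.0},
    "O":            {"recall": 0.8690143655227454,  "f1-score": 0.8778152869451302,  "support": 10024.0},
}


def macro_avg(rows, key):
    """Unweighted mean of one metric over all classes."""
    return sum(m[key] for m in rows.values()) / len(rows)


def weighted_avg(rows, key):
    """Mean of one metric over all classes, weighted by token support."""
    total = sum(m["support"] for m in rows.values())
    return sum(m[key] * m["support"] for m in rows.values()) / total


total_support = sum(m["support"] for m in report.values())
macro_f1 = macro_avg(report, "f1-score")
weighted_f1 = weighted_avg(report, "f1-score")
# Support-weighted recall is (correct tokens / all tokens), i.e. token accuracy.
weighted_recall = weighted_avg(report, "recall")

print(f"total support:   {total_support:.0f}")
print(f"macro F1:        {macro_f1:.4f}")
print(f"weighted F1:     {weighted_f1:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

Under that assumption the recomputed values agree with the card. The gap between macro F1 and weighted F1 comes from the low-support `B-claim` and `I-claim` classes, which count equally in the macro average but contribute little to the weighted one.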
meta_data/README_s42_e5.md ADDED
@@ -0,0 +1,88 @@
+ ---
+ license: bigscience-bloom-rail-1.0
+ base_model: bigscience/bloom-560m
+ tags:
+ - generated_from_trainer
+ datasets:
+ - essays_su_g
+ metrics:
+ - accuracy
+ model-index:
+ - name: bloom-full_labels
+   results:
+   - task:
+       name: Token Classification
+       type: token-classification
+     dataset:
+       name: essays_su_g
+       type: essays_su_g
+       config: full_labels
+       split: train[0%:20%]
+       args: full_labels
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.7978079994653657
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # bloom-full_labels
+
+ This model is a fine-tuned version of [bigscience/bloom-560m](https://huggingface.co/bigscience/bloom-560m) on the essays_su_g dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.7047
+ - B-claim: {'precision': 0.4620938628158845, 'recall': 0.4507042253521127, 'f1-score': 0.45632798573975053, 'support': 284.0}
+ - B-majorclaim: {'precision': 0.7, 'recall': 0.5957446808510638, 'f1-score': 0.6436781609195402, 'support': 141.0}
+ - B-premise: {'precision': 0.6952247191011236, 'recall': 0.6991525423728814, 'f1-score': 0.6971830985915493, 'support': 708.0}
+ - I-claim: {'precision': 0.5342320909331219, 'recall': 0.48441994247363374, 'f1-score': 0.5081081081081082, 'support': 4172.0}
+ - I-majorclaim: {'precision': 0.7541263517359135, 'recall': 0.6379393355801637, 'f1-score': 0.6911841418883673, 'support': 2077.0}
+ - I-premise: {'precision': 0.8258639910813824, 'recall': 0.8874690519926524, 'f1-score': 0.8555589775177087, 'support': 12521.0}
+ - O: {'precision': 0.886796294411076, 'recall': 0.8690143655227454, 'f1-score': 0.8778152869451302, 'support': 10024.0}
+ - Accuracy: 0.7978
+ - Macro avg: {'precision': 0.6940481871540717, 'recall': 0.660634877735036, 'f1-score': 0.6756936799585934, 'support': 29927.0}
+ - Weighted avg: {'precision': 0.7935035105957297, 'recall': 0.7978079994653657, 'f1-score': 0.7946353555655076, 'support': 29927.0}
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 5
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | B-claim | B-majorclaim | B-premise | I-claim | I-majorclaim | I-premise | O | Accuracy | Macro avg | Weighted avg |
+ |:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:--------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
+ | No log | 1.0 | 81 | 0.7937 | {'precision': 0.3116883116883117, 'recall': 0.2535211267605634, 'f1-score': 0.2796116504854369, 'support': 284.0} | {'precision': 0.17391304347826086, 'recall': 0.028368794326241134, 'f1-score': 0.04878048780487805, 'support': 141.0} | {'precision': 0.5714285714285714, 'recall': 0.4689265536723164, 'f1-score': 0.5151280062063615, 'support': 708.0} | {'precision': 0.5458064516129032, 'recall': 0.1013902205177373, 'f1-score': 0.17101273499090358, 'support': 4172.0} | {'precision': 0.4496436318562132, 'recall': 0.698603755416466, 'f1-score': 0.547134238310709, 'support': 2077.0} | {'precision': 0.7235728757001549, 'recall': 0.9698107179937705, 'f1-score': 0.82878886120875, 'support': 12521.0} | {'precision': 0.9145402022147328, 'recall': 0.7579808459696727, 'f1-score': 0.8289330133100589, 'support': 10024.0} | 0.7359 | {'precision': 0.5272275839970211, 'recall': 0.4683717163795382, 'f1-score': 0.4599127131881569, 'support': 29927.0} | {'precision': 0.7336463378005765, 'recall': 0.7358906672904066, 'f1-score': 0.7012848326220683, 'support': 29927.0} |
+ | No log | 2.0 | 162 | 0.8594 | {'precision': 0.3852813852813853, 'recall': 0.31338028169014087, 'f1-score': 0.34563106796116505, 'support': 284.0} | {'precision': 0.5, 'recall': 0.05673758865248227, 'f1-score': 0.10191082802547771, 'support': 141.0} | {'precision': 0.555984555984556, 'recall': 0.6101694915254238, 'f1-score': 0.5818181818181819, 'support': 708.0} | {'precision': 0.5365853658536586, 'recall': 0.015819750719079578, 'f1-score': 0.030733410942956924, 'support': 4172.0} | {'precision': 0.6063059224541969, 'recall': 0.6851227732306211, 'f1-score': 0.6433092224231466, 'support': 2077.0} | {'precision': 0.7233196891499081, 'recall': 0.9738040092644358, 'f1-score': 0.8300769283137042, 'support': 12521.0} | {'precision': 0.8663324979114453, 'recall': 0.8276137270550679, 'f1-score': 0.8465306122448979, 'support': 10024.0} | 0.7521 | {'precision': 0.5962584880907357, 'recall': 0.4975210888767502, 'f1-score': 0.48285860738993286, 'support': 29927.0} | {'precision': 0.7288499118938129, 'recall': 0.7520633541617937, 'f1-score': 0.6972915776644995, 'support': 29927.0} |
+ | No log | 3.0 | 243 | 0.6374 | {'precision': 0.4406779661016949, 'recall': 0.2746478873239437, 'f1-score': 0.33839479392624733, 'support': 284.0} | {'precision': 0.6890756302521008, 'recall': 0.5815602836879432, 'f1-score': 0.6307692307692307, 'support': 141.0} | {'precision': 0.6152125279642058, 'recall': 0.7768361581920904, 'f1-score': 0.6866416978776528, 'support': 708.0} | {'precision': 0.43018637335777576, 'recall': 0.6749760306807286, 'f1-score': 0.5254711699944019, 'support': 4172.0} | {'precision': 0.7759119861030689, 'recall': 0.6451612903225806, 'f1-score': 0.7045215562565721, 'support': 2077.0} | {'precision': 0.8966225233548917, 'recall': 0.6975481191598115, 'f1-score': 0.7846554667145808, 'support': 12521.0} | {'precision': 0.8359600857968852, 'recall': 0.8942537909018355, 'f1-score': 0.8641249337253579, 'support': 10024.0} | 0.7540 | {'precision': 0.6690924418472318, 'recall': 0.6492833657527048, 'f1-score': 0.6477969784662919, 'support': 29927.0} | {'precision': 0.7909400854003533, 'recall': 0.7539679887726802, 'f1-score': 0.7623016451053796, 'support': 29927.0} |
+ | No log | 4.0 | 324 | 0.6704 | {'precision': 0.49489795918367346, 'recall': 0.3415492957746479, 'f1-score': 0.4041666666666667, 'support': 284.0} | {'precision': 0.7155172413793104, 'recall': 0.5886524822695035, 'f1-score': 0.6459143968871596, 'support': 141.0} | {'precision': 0.6989869753979739, 'recall': 0.6822033898305084, 'f1-score': 0.6904932094353109, 'support': 708.0} | {'precision': 0.6432561851556265, 'recall': 0.38638542665388304, 'f1-score': 0.4827792752321055, 'support': 4172.0} | {'precision': 0.6661024121878968, 'recall': 0.757823784304285, 'f1-score': 0.7090090090090089, 'support': 2077.0} | {'precision': 0.8252104563579974, 'recall': 0.8925005989936906, 'f1-score': 0.8575375052756781, 'support': 12521.0} | {'precision': 0.8499952439836393, 'recall': 0.8914604948124502, 'f1-score': 0.8702342114232848, 'support': 10024.0} | 0.8006 | {'precision': 0.6991380676637311, 'recall': 0.6486536389484242, 'f1-score': 0.6657334677041735, 'support': 29927.0} | {'precision': 0.7904665918521213, 'recall': 0.8006148294182511, 'f1-score': 0.7899872403655044, 'support': 29927.0} |
+ | No log | 5.0 | 405 | 0.7047 | {'precision': 0.4620938628158845, 'recall': 0.4507042253521127, 'f1-score': 0.45632798573975053, 'support': 284.0} | {'precision': 0.7, 'recall': 0.5957446808510638, 'f1-score': 0.6436781609195402, 'support': 141.0} | {'precision': 0.6952247191011236, 'recall': 0.6991525423728814, 'f1-score': 0.6971830985915493, 'support': 708.0} | {'precision': 0.5342320909331219, 'recall': 0.48441994247363374, 'f1-score': 0.5081081081081082, 'support': 4172.0} | {'precision': 0.7541263517359135, 'recall': 0.6379393355801637, 'f1-score': 0.6911841418883673, 'support': 2077.0} | {'precision': 0.8258639910813824, 'recall': 0.8874690519926524, 'f1-score': 0.8555589775177087, 'support': 12521.0} | {'precision': 0.886796294411076, 'recall': 0.8690143655227454, 'f1-score': 0.8778152869451302, 'support': 10024.0} | 0.7978 | {'precision': 0.6940481871540717, 'recall': 0.660634877735036, 'f1-score': 0.6756936799585934, 'support': 29927.0} | {'precision': 0.7935035105957297, 'recall': 0.7978079994653657, 'f1-score': 0.7946353555655076, 'support': 29927.0} |
+
+
+ ### Framework versions
+
+ - Transformers 4.37.2
+ - Pytorch 2.2.0+cu121
+ - Datasets 2.17.0
+ - Tokenizers 0.15.2
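
Review note on the training results table above: the per-epoch rows do not agree on a single best checkpoint. The sketch below copies validation loss, accuracy, and macro-average F1 for each epoch from the table and picks the best epoch under each criterion. The tuple layout and variable names are hypothetical, chosen only for this illustration. The card does not say which criterion, if any, was used to select the committed weights.

```python
# (epoch, validation_loss, accuracy, macro_avg_f1), copied from the table rows.
epochs = [
    (1, 0.7937, 0.7359, 0.4599127131881569),
    (2, 0.8594, 0.7521, 0.48285860738993286),
    (3, 0.6374, 0.7540, 0.6477969784662919),
    (4, 0.6704, 0.8006, 0.6657334677041735),
    (5, 0.7047, 0.7978, 0.6756936799585934),
]

# Lower is better for loss; higher is better for accuracy and macro F1.
best_by_loss = min(epochs, key=lambda row: row[1])[0]
best_by_accuracy = max(epochs, key=lambda row: row[2])[0]
best_by_macro_f1 = max(epochs, key=lambda row: row[3])[0]

print(f"best epoch by validation loss: {best_by_loss}")
print(f"best epoch by accuracy:        {best_by_accuracy}")
print(f"best epoch by macro F1:        {best_by_macro_f1}")
```

Validation loss is lowest at epoch 3, accuracy peaks at epoch 4, and macro F1 is highest at epoch 5, so the choice of selection metric matters for this run. The headline numbers in the card (loss 0.7047, accuracy 0.7978) are the epoch 5 row.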
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0465492519f07d335890fa011362bb5d28152afd1e73e74c87bd553830cb9e4a
+ oid sha256:a55840dea9ed8c0e589912d261f1e08351aaf2f9cbc1435163865604017d1476
  size 2236921156
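
Review note on the `model.safetensors` change: the diff only touches a Git LFS pointer file, so the byte size is unchanged and only the SHA-256 object ID differs. A minimal sketch of how a downloaded copy could be checked against the new pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this note, and the local file path is left to the caller.

```python
import hashlib
from pathlib import Path

# New pointer contents, copied from the "+" side of the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:a55840dea9ed8c0e589912d261f1e08351aaf2f9cbc1435163865604017d1476
size 2236921156
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so a multi-gigabyte file is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download would then confirm that the file is the one this commit points to.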