Update README.md
README.md
CHANGED
@@ -6,6 +6,18 @@ tags:
|
|
6 |
model-index:
|
7 |
- name: CONDITIONAL-multilabel-climatebert
|
8 |
results: []
|
9 |
---
|
10 |
|
11 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
@@ -13,7 +25,7 @@ should probably proofread and complete it, then remove this comment. -->
|
|
13 |
|
14 |
# CONDITIONAL-multilabel-climatebert
|
15 |
|
16 |
-
This model is a fine-tuned version of [climatebert/distilroberta-base-climate-f](https://huggingface.co/climatebert/distilroberta-base-climate-f) on the
|
17 |
It achieves the following results on the evaluation set:
|
18 |
- Loss: 0.5460
|
19 |
- Precision-micro: 0.5020
|
@@ -28,15 +40,33 @@ It achieves the following results on the evaluation set:
|
|
28 |
|
29 |
## Model description
|
30 |
|
31 |
-
|
32 |
|
33 |
## Intended uses & limitations
|
34 |
|
35 |
-
|
36 |
|
37 |
## Training and evaluation data
|
38 |
|
39 |
-
|
40 |
|
41 |
## Training procedure
|
42 |
|
@@ -63,10 +93,26 @@ The following hyperparameters were used during training:
|
|
63 |
| 0.069 | 5.0 | 1845 | 0.5016 | 0.5126 | 0.1920 | 0.5193 | 0.7439 | 0.1912 | 0.7439 | 0.6070 | 0.1899 | 0.6090 |
|
64 |
| 0.0353 | 6.0 | 2214 | 0.5460 | 0.5020 | 0.1954 | 0.5047 | 0.7530 | 0.1937 | 0.7530 | 0.6024 | 0.1927 | 0.6033 |
|
65 |
|
66 |
|
67 |
### Framework versions
|
68 |
|
69 |
- Transformers 4.38.1
|
70 |
- Pytorch 2.1.0+cu121
|
71 |
- Datasets 2.18.0
|
72 |
-
- Tokenizers 0.15.2
|
|
|
6 |
model-index:
|
7 |
- name: CONDITIONAL-multilabel-climatebert
|
8 |
results: []
|
9 |
+
datasets:
|
10 |
+
- GIZ/policy_classification
|
11 |
+
|
12 |
+
co2_eq_emissions:
|
13 |
+
emissions: 17.3317785017907
|
14 |
+
source: codecarbon
|
15 |
+
training_type: fine-tuning
|
16 |
+
on_cloud: true
|
17 |
+
cpu_model: Intel(R) Xeon(R) CPU @ 2.00GHz
|
18 |
+
ram_total_size: 12.6747894287109
|
19 |
+
hours_used: 0.384
|
20 |
+
hardware_used: 1 x Tesla T4
|
21 |
---
|
22 |
|
23 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
|
|
25 |
|
26 |
# CONDITIONAL-multilabel-climatebert
|
27 |
|
28 |
+
This model is a fine-tuned version of [climatebert/distilroberta-base-climate-f](https://huggingface.co/climatebert/distilroberta-base-climate-f) on the [Policy-Classification](https://huggingface.co/datasets/GIZ/policy_classification) dataset.
|
29 |
It achieves the following results on the evaluation set:
|
30 |
- Loss: 0.5460
|
31 |
- Precision-micro: 0.5020
|
|
|
40 |
|
41 |
## Model description
|
42 |
|
43 |
+
The model predicts multiple labels simultaneously for a given input paragraph. Specifically, it predicts two labels,
|
44 |
+
ConditionalLabel and UnconditionalLabel, which describe how a commitment in the text is framed:
|
45 |
+
- **Conditional**: In the context of climate policy documents, whether a given Target/Action/Plan/Policy commitment is made conditionally.
|
46 |
+
- **Unconditional**: In the context of climate policy documents, whether a given Target/Action/Plan/Policy commitment is made unconditionally.
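A minimal sketch of how multi-label outputs like these are commonly decoded, assuming the model emits one logit per label and each logit is passed through an independent sigmoid. The label order, the 0.5 threshold, and the example logits are assumptions for illustration and are not taken from this model card.

```python
import math

# Assumed label order; the real order comes from the model config (id2label).
LABELS = ["ConditionalLabel", "UnconditionalLabel"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose independent probability meets the threshold.

    Unlike softmax classification, zero, one, or both labels can be returned.
    """
    probs = [sigmoid(z) for z in logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]


# Hypothetical logits for a single paragraph.
print(decode_multilabel([1.2, -0.7]))
```

Because each label is scored independently, a paragraph can be tagged as both conditional and unconditional, or as neither.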
|
47 |
+
|
48 |
+
|
49 |
|
50 |
## Intended uses & limitations
|
51 |
|
52 |
+
The dataset sometimes omits the heading or sub-heading that indicates whether a paragraph belongs to the Conditional or Unconditional category,
|
53 |
+
even though the paragraph was copied from under that sub-heading in the source document. This makes assessing conditionality very difficult: annotators given only the paragraph, without
|
54 |
+
the full surrounding context, had difficulty judging the conditionality of the commitments it makes.
|
55 |
|
56 |
## Training and evaluation data
|
57 |
|
58 |
+
- Training Dataset: 5901
|
59 |
+
| Class | Positive Count of Class|
|
60 |
+
|:-------------|:--------|
|
61 |
+
| ConditionalLabel | 1986 |
|
62 |
+
| UnconditionalLabel | 1312 |
|
63 |
+
|
64 |
+
|
65 |
+
- Validation Dataset: 1190
|
66 |
+
| Class | Positive Count of Class|
|
67 |
+
|:-------------|:--------|
|
68 |
+
| ConditionalLabel | 192 |
|
69 |
+
| UnconditionalLabel | 136 |
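The counts above can be turned into per-split positive rates to compare class balance between training and validation. This is a small sketch using only the numbers listed in the tables; the dictionary layout and function name are illustrative choices.

```python
# Split sizes and positive counts copied from the tables above.
splits = {
    "train": {"size": 5901, "ConditionalLabel": 1986, "UnconditionalLabel": 1312},
    "validation": {"size": 1190, "ConditionalLabel": 192, "UnconditionalLabel": 136},
}


def positive_rate(split: str, label: str) -> float:
    """Fraction of examples in a split that carry the given label."""
    entry = splits[split]
    return entry[label] / entry["size"]


for split in splits:
    for label in ("ConditionalLabel", "UnconditionalLabel"):
        print(f"{split:<10} {label:<20} {positive_rate(split, label):.1%}")
```

On these counts, positives are roughly half as frequent in the validation split as in the training split, which is worth keeping in mind when reading the precision figures.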
|
70 |
|
71 |
## Training procedure
|
72 |
|
|
|
93 |
| 0.069 | 5.0 | 1845 | 0.5016 | 0.5126 | 0.1920 | 0.5193 | 0.7439 | 0.1912 | 0.7439 | 0.6070 | 0.1899 | 0.6090 |
|
94 |
| 0.0353 | 6.0 | 2214 | 0.5460 | 0.5020 | 0.1954 | 0.5047 | 0.7530 | 0.1937 | 0.7530 | 0.6024 | 0.1927 | 0.6033 |
|
95 |
|
96 |
+
|label | precision |recall |f1-score| support|
|
97 |
+
|:-------------:|:---------:|:-----:|:------:|:------:|
|
98 |
+
|ConditionalLabel |0.477 |0.765 |0.588 | 192.0 |
|
99 |
+
|UnconditionalLabel |0.543 |0.735 | 0.625 | 136.0 |
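As a sanity check on the per-label table, the F1 column can be recomputed from precision and recall as their harmonic mean. This sketch uses only the rounded values shown above, so small rounding differences are possible.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the per-label table above.
per_label = {
    "ConditionalLabel": (0.477, 0.765),
    "UnconditionalLabel": (0.543, 0.735),
}

for label, (precision, recall) in per_label.items():
    print(f"{label}: F1 = {f1_score(precision, recall):.3f}")
```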
|
100 |
+
|
|
101 |
+
|
102 |
+
### Environmental Impact
|
103 |
+
Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
|
104 |
+
- **Carbon Emitted**: 0.01733 kg of CO2
|
105 |
+
- **Hours Used**: 0.384 hours
|
106 |
+
|
107 |
+
### Training Hardware
|
108 |
+
- **On Cloud**: yes
|
109 |
+
- **GPU Model**: 1 x Tesla T4
|
110 |
+
- **CPU Model**: Intel(R) Xeon(R) CPU @ 2.00GHz
|
111 |
+
- **RAM Size**: 12.67 GB
|
112 |
|
113 |
### Framework versions
|
114 |
|
115 |
- Transformers 4.38.1
|
116 |
- Pytorch 2.1.0+cu121
|
117 |
- Datasets 2.18.0
|
118 |
+
- Tokenizers 0.15.2
|