VictorYeste committed a95cc54 (parent: ea39c1a): Update README.md
model-index:
- name: deberta-based-human-value-detection
  results: []
---

# Description

The Human Value Detection task at CLEF 2024 consists of two sub-tasks: the first is to detect the presence or absence of each of the 19 predefined human values, while the second is to detect whether each value is attained or constrained.

Our system introduces a cascade model approach for the detection and stance classification of the predefined set of human values. It consists of two subsystems: one for detecting the presence of each human value and another for establishing the stance towards each detected human value (whether the sentence attains or constrains it). Each subsystem is designed and fine-tuned separately using a DeBERTa model as its base.

Given that subsystem 1 focuses on detecting the presence of human values in the text, and subsystem 2 focuses on the stances towards each detected human value, this cascade model approach improves the granularity of text classification.

This model implements Subsystem 1 and accomplishes the first sub-task.
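For clarity, the cascade described above can be sketched as plain control flow. `detect_values` and `classify_stance` below are hypothetical stand-ins for the two fine-tuned subsystems, not functions shipped with this repository:

```python
from typing import Callable

def cascade(text: str,
            detect_values: Callable[[str], list],
            classify_stance: Callable[[str, str], str]) -> dict:
    """Run subsystem 1, then run subsystem 2 on each detected value."""
    detected = detect_values(text)                  # subsystem 1: which values are present
    return {value: classify_stance(text, value)     # subsystem 2: attained or constrained
            for value in detected}

# Toy stand-ins that only illustrate the flow (not the real models):
demo = cascade(
    "We should protect our traditions.",
    detect_values=lambda text: ["Tradition"],
    classify_stance=lambda text, value: "attained",
)
# demo == {'Tradition': 'attained'}
```

Subsystem 2 is only run on the values that subsystem 1 detects, which is what makes the approach a cascade rather than a single joint classifier.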

# How to use

You can use this model with a text-classification pipeline, as in the following example:

```python
from transformers import pipeline

model = "VictorYeste/deberta-based-human-value-detection"
tokenizer = "VictorYeste/deberta-based-human-value-detection"

# top_k=None makes the pipeline return the scores for all 19 labels
values_detection = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)

values_detection("We would like to share this model with the research community.")
```

This returns the following:

```
[[{'label': 'Self-direction: thought', 'score': 0.02448045276105404},
  {'label': 'Stimulation', 'score': 0.01451807003468275},
  {'label': 'Universalism: concern', 'score': 0.006046739872545004},
  {'label': 'Self-direction: action', 'score': 0.004837467335164547},
  {'label': 'Benevolence: dependability', 'score': 0.001295178197324276},
  {'label': 'Benevolence: caring', 'score': 0.0009907316416501999},
  {'label': 'Conformity: interpersonal', 'score': 0.0004476217145565897},
  {'label': 'Security: societal', 'score': 0.00039295252645388246},
  {'label': 'Universalism: tolerance', 'score': 0.0003538706514518708},
  {'label': 'Power: dominance', 'score': 0.00016191638133022934},
  {'label': 'Power: resources', 'score': 0.0001522471575299278},
  {'label': 'Universalism: nature', 'score': 0.00014803129306528717},
  {'label': 'Humility', 'score': 0.0001100009903893806},
  {'label': 'Face', 'score': 9.083452459890395e-05},
  {'label': 'Conformity: rules', 'score': 8.524076838511974e-05},
  {'label': 'Achievement', 'score': 6.411433423636481e-05},
  {'label': 'Security: personal', 'score': 5.183048051549122e-05},
  {'label': 'Hedonism', 'score': 3.167059549014084e-05},
  {'label': 'Tradition', 'score': 2.4977327484521084e-05}]]
```
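If you only need the detected values rather than the full score list, you can threshold the pipeline output directly. A minimal sketch using the output format shown above; the scores below are fabricated for illustration:

```python
# `results` stands in for the pipeline output shown above (a list with one
# list of {'label', 'score'} dicts per input text), with made-up scores.
results = [[{'label': 'Self-direction: thought', 'score': 0.62},
            {'label': 'Stimulation', 'score': 0.01},
            {'label': 'Tradition', 'score': 0.55}]]

# Keep only the labels whose score clears the 0.5 decision threshold.
detected = [entry['label'] for entry in results[0] if entry['score'] >= 0.5]
# detected == ['Self-direction: thought', 'Tradition']
```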
63 |
+
|
64 |
+
The model has been trained as a multi-label problem, so it can also be used to predict multiple labels as follows:
|
65 |
+
```python
|
66 |
+
import torch
|
67 |
+
import numpy as np
|
68 |
+
import transformers
|
69 |
+
|
70 |
+
def multilabel_pipeline(text, model, tokenizer, id2label):
|
71 |
+
# Code adapted from: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/BERT/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb
|
72 |
+
""" Predicts the value probabilities (attained and constrained) for each sentence """
|
73 |
+
encoding = tokenizer(text, return_tensors="pt")
|
74 |
+
encoding = {k: v for k,v in encoding.items()}
|
75 |
+
outputs = model(**encoding)
|
76 |
+
logits = outputs.logits
|
77 |
+
sigmoid = torch.nn.Sigmoid()
|
78 |
+
probs = sigmoid(logits.squeeze().cpu())
|
79 |
+
predictions = np.zeros(probs.shape)
|
80 |
+
predictions[np.where(probs >= 0.5)] = 1
|
81 |
+
predicted_labels = [id2label[idx] for idx, label in enumerate(predictions) if label == 1.0]
|
82 |
+
return predicted_labels
|
83 |
+
|
84 |
+
values = ["Self-direction: thought", "Self-direction: action", "Stimulation", "Hedonism", "Achievement", "Power: dominance", "Power: resources", "Face", "Security: personal", "Security: societal", "Tradition", "Conformity: rules", "Conformity: interpersonal", "Humility", "Benevolence: caring", "Benevolence: dependability", "Universalism: concern", "Universalism: nature", "Universalism: tolerance" ]
|
85 |
+
id2label = {idx:label for idx, label in enumerate(values)}
|
86 |
+
model_name = "VictorYeste/deberta-based-human-value-detection"
|
87 |
+
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
|
88 |
+
model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name)
|