VictorYeste committed on
Commit
a95cc54
1 Parent(s): ea39c1a

Update README.md

- name: deberta-based-human-value-detection
  results: []
---

# Description

The Human Value Detection at CLEF 2024 task consists of two sub-tasks: the first is to detect the presence or absence of each of 19 predefined human values in a sentence, while the second is to detect whether each value is attained or constrained.

Our system introduces a cascade model approach for the detection and stance classification of the predefined set of human values. It consists of two subsystems: one for detecting the presence of each human value and another for establishing the stance (whether the sentence attains or constrains it) of each human value. Each subsystem is designed and fine-tuned separately using a DeBERTa model as its base.

Given that subsystem 1 focuses on detecting the presence of human values in the text, and subsystem 2 focuses on the stances towards each detected human value, this cascade model approach improves the granularity of text classification.

This model implements Subsystem 1 and accomplishes the first sub-task.
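
The cascade described above can be sketched as two chained classifiers. This is a minimal illustration only: `detect_values` and `classify_stance` below are dummy stand-ins for the two fine-tuned DeBERTa subsystems, and the Subsystem 2 checkpoint is not named on this card.

```python
# Minimal sketch of the cascade; detect_values and classify_stance are
# dummy stand-ins for the two fine-tuned DeBERTa subsystems.
def detect_values(sentence):
    # Subsystem 1: which human values are present in the sentence (dummy rule)
    return {"Stimulation"} if "exciting" in sentence else set()

def classify_stance(sentence, value):
    # Subsystem 2: is the detected value attained or constrained? (dummy rule)
    return "attained"

def cascade(sentence):
    # Subsystem 2 is only run on the values Subsystem 1 detected
    return {value: classify_stance(sentence, value) for value in detect_values(sentence)}
```

The point of the structure is that stance classification never runs on values that were not detected, which is what gives the cascade its finer granularity.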

# How to use

You can use this model with a text classification pipeline, as in the following example:

```python
from transformers import pipeline

model = "VictorYeste/deberta-based-human-value-detection"
tokenizer = "VictorYeste/deberta-based-human-value-detection"

values_detection = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)

values_detection("We would like to share this model with the research community.")
```

This returns the following:

```
[[{'label': 'Self-direction: thought', 'score': 0.02448045276105404},
  {'label': 'Stimulation', 'score': 0.01451807003468275},
  {'label': 'Universalism: concern', 'score': 0.006046739872545004},
  {'label': 'Self-direction: action', 'score': 0.004837467335164547},
  {'label': 'Benevolence: dependability', 'score': 0.001295178197324276},
  {'label': 'Benevolence: caring', 'score': 0.0009907316416501999},
  {'label': 'Conformity: interpersonal', 'score': 0.0004476217145565897},
  {'label': 'Security: societal', 'score': 0.00039295252645388246},
  {'label': 'Universalism: tolerance', 'score': 0.0003538706514518708},
  {'label': 'Power: dominance', 'score': 0.00016191638133022934},
  {'label': 'Power: resources', 'score': 0.0001522471575299278},
  {'label': 'Universalism: nature', 'score': 0.00014803129306528717},
  {'label': 'Humility', 'score': 0.0001100009903893806},
  {'label': 'Face', 'score': 9.083452459890395e-05},
  {'label': 'Conformity: rules', 'score': 8.524076838511974e-05},
  {'label': 'Achievement', 'score': 6.411433423636481e-05},
  {'label': 'Security: personal', 'score': 5.183048051549122e-05},
  {'label': 'Hedonism', 'score': 3.167059549014084e-05},
  {'label': 'Tradition', 'score': 2.4977327484521084e-05}]]
```
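
Because the pipeline returns a score for every one of the 19 labels, the output can be reduced to a set of detected values with a simple threshold (0.5, the same cutoff the multi-label code uses). The helper below is illustrative, not part of the model's API, and the sample scores are truncated from the example output above.

```python
# Illustrative helper (not part of the model card's API): keep only the
# labels whose score reaches the threshold. Sample scores truncated from
# the example output above.
scores = [
    {"label": "Self-direction: thought", "score": 0.0245},
    {"label": "Stimulation", "score": 0.0145},
    {"label": "Universalism: concern", "score": 0.0060},
]

def detected_values(label_scores, threshold=0.5):
    """Filter per-label scores down to the detected human values."""
    return [d["label"] for d in label_scores if d["score"] >= threshold]

detected_values(scores)  # no score reaches 0.5, so no value is detected: []
```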

The model has been trained as a multi-label problem, so it can also be used to predict multiple labels as follows:

```python
import torch
import numpy as np
import transformers

def multilabel_pipeline(text, model, tokenizer, id2label):
    """Predicts the human values detected in a sentence (multi-label)."""
    # Code adapted from: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/BERT/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb
    encoding = tokenizer(text, return_tensors="pt")
    outputs = model(**encoding)
    logits = outputs.logits
    # In a multi-label setup, each label gets an independent sigmoid probability
    sigmoid = torch.nn.Sigmoid()
    probs = sigmoid(logits.squeeze().cpu())
    # A value counts as detected when its probability reaches 0.5
    predictions = np.zeros(probs.shape)
    predictions[np.where(probs >= 0.5)] = 1
    predicted_labels = [id2label[idx] for idx, label in enumerate(predictions) if label == 1.0]
    return predicted_labels

values = ["Self-direction: thought", "Self-direction: action", "Stimulation", "Hedonism", "Achievement", "Power: dominance", "Power: resources", "Face", "Security: personal", "Security: societal", "Tradition", "Conformity: rules", "Conformity: interpersonal", "Humility", "Benevolence: caring", "Benevolence: dependability", "Universalism: concern", "Universalism: nature", "Universalism: tolerance"]
id2label = {idx: label for idx, label in enumerate(values)}
model_name = "VictorYeste/deberta-based-human-value-detection"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name)