Commit 331ef08 by JohnnyBoy00 (1 parent: c141cef)

Update README.md

README.md CHANGED
@@ -29,7 +29,7 @@ The outputs are formatted as follows:
 [verification_feedback] Feedback: [feedback]
 ```
 
-
+Hence, the `[verification_feedback]` label will be one of `Correct`, `Partially correct` or `Incorrect`, while `[feedback]` will be the textual feedback generated by the model according to the given answer.
 
 ## Intended uses & limitations
 
@@ -78,7 +78,7 @@ The following hyperparameters were utilized during training:
 
 ## Evaluation results
 
-The
+The generated feedback was evaluated by means of the [SacreBLEU](https://huggingface.co/spaces/evaluate-metric/sacrebleu), [ROUGE](https://huggingface.co/spaces/evaluate-metric/rouge), [METEOR](https://huggingface.co/spaces/evaluate-metric/meteor) and [BERTScore](https://huggingface.co/spaces/evaluate-metric/bertscore) metrics from HuggingFace, while the [accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html) and [F1](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html) scores from scikit-learn were used to evaluate the labels.
 
 The following results were achieved.
 
@@ -92,7 +92,7 @@ The script used to compute these metrics and perform evaluation can be found in
 
 ## Usage
 
-The example below shows how the model can be applied to generate
+The example below shows how the model can be applied to generate feedback for a given answer.
 
 ```python
 from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
@@ -111,7 +111,7 @@ generated_tokens = model.generate(
 output = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
 ```
 
-The output
+The output produced by the model then looks as follows:
 
 ```
 Partially correct Feedback: Sollte das Personal dies gestatten, kannst du den Check auch gerne noch abschließen. Bitte halte nur in fest, wann genau du auf deine Tätigkeit angesprochen wurdest.
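The output format described in this change, `[verification_feedback] Feedback: [feedback]`, can be split back into its two parts after decoding. The snippet below is a minimal sketch of that step, not part of the commit: the helper name `parse_output` and the `LABELS` tuple are hypothetical, and it assumes the label is separated from the text by the literal string ` Feedback: ` exactly as in the sample output.

```python
# Labels as listed in the README text; spelling and casing are assumed to match
# the model output exactly.
LABELS = ("Correct", "Partially correct", "Incorrect")


def parse_output(output: str) -> tuple[str, str]:
    """Split a decoded string '<label> Feedback: <text>' into (label, text).

    Hypothetical helper for illustration; raises ValueError when the string
    does not follow the documented format.
    """
    # partition splits on the first occurrence of the separator only,
    # so a later 'Feedback:' inside the feedback text is left untouched.
    label, separator, feedback = output.partition(" Feedback: ")
    if not separator:
        raise ValueError("separator ' Feedback: ' not found in model output")
    label = label.strip()
    if label not in LABELS:
        raise ValueError(f"unexpected verification label: {label!r}")
    return label, feedback.strip()
```

Called on the decoded `output` string from the usage example, this would return the label and the feedback text as a pair, which is convenient when the label is scored with accuracy/F1 and the text with the generation metrics.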