Short-Answer-Feedback/mbart-finetuned-saf-micro-job

@@ -29,7 +29,7 @@ The outputs are formatted as follows:
 [verification_feedback] Feedback: [feedback]
 ```
-In this case, `[verification_feedback]` will be one of `Correct`, `Partially correct` or `Incorrect`, while `[feedback]` will be the textual feedback generated by the model according to the given answer.
 ## Intended uses & limitations
@@ -78,7 +78,7 @@ The following hyperparameters were utilized during training:
 ## Evaluation results
-The model was evaluated through means of the [SacreBLEU](https://huggingface.co/spaces/evaluate-metric/sacrebleu), [ROUGE](https://huggingface.co/spaces/evaluate-metric/rouge), [METEOR](https://huggingface.co/spaces/evaluate-metric/meteor), [BERTScore](https://huggingface.co/spaces/evaluate-metric/bertscore) metrics from HuggingFace, as well as the [accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html) and [F1](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html) scores from scikit-learn.
 The following results were achieved.
@@ -92,7 +92,7 @@ The script used to compute these metrics and perform evaluation can be found in
 ## Usage
-The example below shows how the model can be applied to generate textual feedback to a given answer.
 ```python
 from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
@@ -111,7 +111,7 @@ generated_tokens = model.generate(
 output = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
 ```
-The output generated by the model then looks as follows:
 ```
 Partially correct Feedback: Sollte das Personal dies gestatten, kannst du den Check auch gerne noch abschließen. Bitte halte nur in fest, wann genau du auf deine Tätigkeit angesprochen wurdest.

 [verification_feedback] Feedback: [feedback]
 ```
+Hence, the `[verification_feedback]` label will be one of `Correct`, `Partially correct` or `Incorrect`, while `[feedback]` will be the textual feedback generated by the model according to the given answer.
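Since the label and the feedback arrive as one decoded string, downstream code usually needs to separate them. The following is a minimal sketch of one way to do that, assuming the output follows the `[verification_feedback] Feedback: [feedback]` layout described above. The names `parse_output` and `_OUTPUT_PATTERN` are hypothetical helpers for illustration and are not part of the model repository.

```python
import re

# Assumption: the decoded output starts with one of the three labels,
# followed by the literal marker "Feedback:" and then the free-text feedback.
_OUTPUT_PATTERN = re.compile(
    r"^\s*(Partially correct|Correct|Incorrect)\s+Feedback:\s*(.*)$",
    re.DOTALL,
)


def parse_output(output):
    """Split a decoded model output into (label, feedback).

    Returns (None, stripped_text) when the output does not match the
    expected layout, so callers can detect malformed generations.
    """
    match = _OUTPUT_PATTERN.match(output)
    if match is None:
        return None, output.strip()
    return match.group(1), match.group(2).strip()
```

For example, `parse_output("Correct Feedback: Gut gemacht.")` yields the label `Correct` and the feedback `Gut gemacht.`, while a string without a recognizable label yields `None` as the label.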
 ## Intended uses & limitations
 ## Evaluation results
+The generated feedback was evaluated by means of the [SacreBLEU](https://huggingface.co/spaces/evaluate-metric/sacrebleu), [ROUGE](https://huggingface.co/spaces/evaluate-metric/rouge), [METEOR](https://huggingface.co/spaces/evaluate-metric/meteor), and [BERTScore](https://huggingface.co/spaces/evaluate-metric/bertscore) metrics from HuggingFace, while the [accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html) and [F1](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html) scores from scikit-learn were used to evaluate the labels.
 The following results were achieved.
 ## Usage
+The example below shows how the model can be applied to generate feedback for a given answer.
 ```python
 from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
 output = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
 ```
+The output produced by the model then looks as follows:
 ```
 Partially correct Feedback: Sollte das Personal dies gestatten, kannst du den Check auch gerne noch abschließen. Bitte halte nur in fest, wann genau du auf deine Tätigkeit angesprochen wurdest.