robinsmits committed on
Commit c25c91a
1 Parent(s): cca8d18

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +19 -6
README.md CHANGED

````diff
@@ -1,4 +1,6 @@
 ---
+language:
+- nl
 license: cc-by-nc-4.0
 library_name: peft
 tags:
@@ -8,15 +10,13 @@ tags:
 - generated_from_trainer
 - qwen2
 base_model: Qwen/Qwen1.5-7B-Chat
-model-index:
-- name: Qwen1.5-7B-Dutch-Chat-Dpo
-  results: []
-language:
-- nl
 datasets:
 - BramVanroy/ultra_feedback_dutch_cleaned
 pipeline_tag: text-generation
 inference: false
+model-index:
+- name: Qwen1.5-7B-Dutch-Chat-Dpo
+  results: []
 ---
 
 # Qwen1.5-7B-Dutch-Chat-Dpo
@@ -144,4 +144,17 @@ Thanks to the creators of Qwen1.5 for there great work!
   journal={arXiv preprint arXiv:2309.16609},
   year={2023}
 }
-```
+```
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_robinsmits__Qwen1.5-7B-Dutch-Chat-Dpo)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |53.94|
+|AI2 Reasoning Challenge (25-Shot)|50.77|
+|HellaSwag (10-Shot)              |74.24|
+|MMLU (5-Shot)                    |60.70|
+|TruthfulQA (0-shot)              |42.37|
+|Winogrande (5-shot)              |68.11|
+|GSM8k (5-shot)                   |27.45|
+
````
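A quick sanity check of the table this PR adds, under the assumption that the leaderboard's `Avg.` row is the plain unweighted mean of the six benchmark scores (this matches how the Open LLM Leaderboard typically reports it, but is stated here as an assumption, not taken from the PR itself):

```python
# Benchmark scores copied from the table added in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 50.77,
    "HellaSwag (10-Shot)": 74.24,
    "MMLU (5-Shot)": 60.70,
    "TruthfulQA (0-shot)": 42.37,
    "Winogrande (5-shot)": 68.11,
    "GSM8k (5-shot)": 27.45,
}

# Assumption: Avg. is the unweighted mean, rounded to two decimals.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 53.94 — agrees with the Avg. row in the table
```

The computed mean matching the reported `Avg.` value (53.94) confirms the table was transcribed consistently.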