Files changed (1)
  1. README.md +12 -15
README.md CHANGED
@@ -135,9 +135,19 @@ This model is suitable for a wide range of applications, including but not limit
135
  Coming soon!
136
 
137
 
138
- # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
139
 
140
- Coming soon!
141
 
142
  # Prompt Template
143
 
@@ -182,16 +192,3 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.1-rys-78b")
182
  # Ethical Considerations
183
 
184
  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
185
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
186
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.1-rys-78b)
187
-
188
- | Metric |Value|
189
- |-------------------|----:|
190
- |Avg. |44.14|
191
- |IFEval (0-Shot) |81.36|
192
- |BBH (3-Shot) |59.47|
193
- |MATH Lvl 5 (4-Shot)|36.40|
194
- |GPQA (0-shot) |19.24|
195
- |MuSR (0-shot) |19.00|
196
- |MMLU-PRO (5-shot) |49.38|
197
-
 
135
  Coming soon!
136
 
137
 
138
+ # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
139
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.1-rys-78b)
140
+
141
+ | Metric |Value|
142
+ |-------------------|----:|
143
+ |Avg. |44.14|
144
+ |IFEval (0-Shot) |81.36|
145
+ |BBH (3-Shot) |59.47|
146
+ |MATH Lvl 5 (4-Shot)|36.40|
147
+ |GPQA (0-shot) |19.24|
148
+ |MuSR (0-shot) |19.00|
149
+ |MMLU-PRO (5-shot) |49.38|
150
 
 
151
 
152
  # Prompt Template
153
 
 
192
  # Ethical Considerations
193
 
194
  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.