Update README.md
README.md
CHANGED
@@ -126,8 +126,6 @@ Use Gemma 2 format.
 
 ## Benchmarks and Leaderboard Rankings
 
-OpenLLM: Pending in Queue
-
 Creative Writing V2 - Score: 82.64 (Rank 1)
 
 https://eqbench.com/creative_writing.html
@@ -138,6 +136,20 @@ That's right, much to everyone's surprise (mine included) this model has topped
 
 ![Leaderboard](https://i.imgur.com/gJd9Pab.png)
 
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lemon07r__Gemma-2-Ataraxy-9B)
+
+| Metric |Value|
+|-------------------|----:|
+|Avg. |22.43|
+|IFEval (0-Shot) |30.09|
+|BBH (3-Shot) |42.03|
+|MATH Lvl 5 (4-Shot)| 0.83|
+|GPQA (0-shot) |11.30|
+|MuSR (0-shot) |14.47|
+|MMLU-PRO (5-shot) |35.85|
+
+
 ## Preface and Rambling
 
 My favorite Gemma 2 9B models are the SPPO iter3 and SimPO finetunes, but I felt the slerp merge between the two (nephilim v3) wasn't as good for some reason. The Gutenberg Gemma 2 finetune by nbeerbower is another of my favorites. It's trained on one of my favorite datasets, and it actually improves the SPPO model's Open LLM Leaderboard 2 average score a bit, on top of improving its writing capabilities and making the LLM sound less AI-like. However, I still liked the original SPPO finetune just a bit more.
@@ -192,16 +204,4 @@ slices:
 model: nbeerbower/gemma2-gutenberg-9B
 ```
 
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lemon07r__Gemma-2-Ataraxy-9B)
-
-| Metric |Value|
-|-------------------|----:|
-|Avg. |22.43|
-|IFEval (0-Shot) |30.09|
-|BBH (3-Shot) |42.03|
-|MATH Lvl 5 (4-Shot)| 0.83|
-|GPQA (0-shot) |11.30|
-|MuSR (0-shot) |14.47|
-|MMLU-PRO (5-shot) |35.85|
 
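For readers unfamiliar with mergekit (the tool behind the `slices:` config whose tail appears in the diff above): merges are declared in a YAML file. The actual Ataraxy config is truncated in this view, so below is a minimal hypothetical sketch of what a slerp merge of two Gemma 2 9B finetunes, of the kind discussed in the preface, can look like. The model pairing, layer range, and `t` value are illustrative assumptions, not the recipe above.

```yaml
# Hypothetical mergekit slerp config sketch -- NOT the actual Ataraxy recipe,
# which is truncated in the diff above. Pairing, layer range, and t are assumptions.
slices:
  - sources:
      - model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3  # assumed first source (SPPO iter3)
        layer_range: [0, 42]                      # Gemma 2 9B has 42 layers
      - model: nbeerbower/gemma2-gutenberg-9B     # appears in the config above
        layer_range: [0, 42]
merge_method: slerp
base_model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
parameters:
  t: 0.5        # t=0 keeps the base model, t=1 the other; 0.5 is an even blend
dtype: bfloat16
```

Saved as `config.yaml`, a config like this would be run with `mergekit-yaml config.yaml ./merged-model`, which writes the merged weights to the output directory.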