lemon07r committed
Commit 30d3b12
1 Parent(s): d6beccc

Update README.md

Files changed (1):
  1. README.md +10 -10
README.md CHANGED
@@ -1,3 +1,4 @@
+
 ---
 base_model:
 - princeton-nlp/gemma-2-9b-it-SimPO
@@ -14,16 +15,19 @@ This is a merge of pre-trained language models created using [mergekit](https://

 ## GGUF Quants

-Huge thanks to @mradermacher and @bartowski for making these GGUF quants available to us.
+Huge thanks to [@mradermacher](https://huggingface.co/mradermacher) and [@bartowski](https://huggingface.co/bartowski) for making these GGUF quants available to us.

-Bartowski quants (imatrix): bartowski/Gemma-2-Ataraxy-9B-GGUF
+Bartowski quants (imatrix): [bartowski/Gemma-2-Ataraxy-9B-GGUF](https://huggingface.co/bartowski/Gemma-2-Ataraxy-9B-GGUF)

-Mradermacher quants (static): mradermacher/Gemma-2-Ataraxy-9B-GGUF
+Mradermacher quants (static): [mradermacher/Gemma-2-Ataraxy-9B-GGUF](https://huggingface.co/mradermacher/Gemma-2-Ataraxy-9B-GGUF)

-Mradermacher quants (imatrix): mradermacher/Gemma-2-Ataraxy-9B-i1-GGUF
+Mradermacher quants (imatrix): [mradermacher/Gemma-2-Ataraxy-9B-i1-GGUF](https://huggingface.co/mradermacher/Gemma-2-Ataraxy-9B-i1-GGUF)

 I think bartowski and mradermacher use different calibration data for their imatrix quants; or maybe you prefer static quants. Pick your poison :).
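If you'd rather grab one of these programmatically, a minimal sketch with huggingface_hub is below; the quant filename is a placeholder I made up, so check the repo's file list for the actual names (Q4_K_M, Q6_K, and so on).

```python
# Minimal sketch: pull a single GGUF quant from one of the repos above.
# The filename is hypothetical; list the repo's files first for real names.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/Gemma-2-Ataraxy-9B-GGUF",
    filename="Gemma-2-Ataraxy-9B-Q4_K_M.gguf",  # placeholder quant name
)
print(path)  # local path to the cached GGUF file
```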

+## Format
+
+Use Gemma 2 format.
+
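The card only says to use Gemma 2 format; as a quick sketch (assuming the merge inherits the stock Gemma 2 chat template from its parents), you can print the exact prompt layout with transformers:

```python
# Sketch: show the Gemma 2 prompt format, assuming this merge keeps the
# standard Gemma 2 chat template from its parent models.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("lemon07r/Gemma-2-Ataraxy-9B")
messages = [{"role": "user", "content": "Hello!"}]
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
# Expected shape (Gemma 2 has no separate system role):
# <bos><start_of_turn>user
# Hello!<end_of_turn>
# <start_of_turn>model
```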
  ## Preface and Rambling
@@ -33,7 +37,7 @@ Someone suggested that merging the base model on top of the gutenberg may help w

 I wasn't entirely sure, since if Nephilim v3 is anything to go by, this one would probably also end up worse than its parent models. Normally when I try merges like these, they don't go anywhere. I'm pretty picky, and usually very skeptical, so most of the time I find the merge is no better than the original models, or only marginally better. I tried this merge anyway to see how it would go, and much to my surprise, this time I feel like I got very good results. Figured I'd share, and hopefully this won't just be me introducing more useless slop into a world that already has way too many unnecessary merges.

-If you're looking for a mistral nemo 12B model instead, I HIGHLY recommend Mistral Nemo Gutenberg v2 by nbeerbower. It's head and shoulders above the many other mistral nemo finetunes I've tried (romulus simpo and magnum mini 1.1 being close second favorites).
+If you're looking for a Mistral Nemo 12B model instead, I HIGHLY recommend Mistral Nemo Gutenberg v2 by nbeerbower. It's head and shoulders above the many other Mistral Nemo finetunes I've tried (the first version of Mistral Nemo Gutenberg, Romulus SimPO, and Magnum Mini 1.1 being close second favorites).

 ## Why is it 10b??
@@ -41,10 +45,6 @@ See https://github.com/arcee-ai/mergekit/issues/390

 The model is not actually 10B; mergekit randomly adds an lm_head for some reason when doing a SLERP merge with Gemma 2 models. I believe Nephilim v3 had a similar issue before they used some sort of workaround that I'm not aware of. This doesn't seem to affect the GGUF quants, as they're the correct size, so I'll leave it as is until mergekit gets a commit that addresses the issue.
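If you want to verify the stray lm_head on a local copy, here's a rough sketch of my own (not from the card): sharded safetensors checkpoints ship an index of tensor names, and since Gemma 2 ties lm_head to the input embeddings, any standalone lm_head tensor is the redundant ~0.9B parameters that inflate the count to "10B".

```python
# Rough sketch (not part of the card): look for a standalone lm_head tensor
# in a locally saved merge. Gemma 2 ties lm_head to the input embeddings,
# so finding one here is the redundancy described above.
import json

# Placeholder path; point this at your merged model directory.
with open("merged-model/model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]

extra = [name for name in weight_map if name.startswith("lm_head")]
print("standalone lm_head tensors:", extra or "none")
```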

-## Format
-
-Use Gemma 2 format.
-
 ## Merge Details
 ### Merge Method

@@ -77,4 +77,4 @@ slices:
         model: princeton-nlp/gemma-2-9b-it-SimPO
       - layer_range: [0, 42]
         model: nbeerbower/gemma2-gutenberg-9B
-```
+```
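For anyone reproducing the merge from the (truncated) config above, this is roughly how you'd drive it with mergekit's Python API, mirroring the usage example in mergekit's README; the `mergekit-yaml` CLI on the same config works too. Paths here are placeholders, and option names can drift between mergekit versions.

```python
# Sketch of running a mergekit config like the one above, based on the
# usage example in mergekit's README. Paths are placeholders; option names
# may differ between mergekit versions.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("gemma-2-ataraxy.yml", encoding="utf-8") as fp:  # your merge config
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Gemma-2-Ataraxy-9B",  # output directory
    options=MergeOptions(cuda=torch.cuda.is_available(), copy_tokenizer=True),
)
```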