Triangle104/TQ2.5-14B-Aletheia-v1-Q4_K_M-GGUF

Triangle104 committed 22 days ago

Commit e7bb197 · verified · 1 Parent(s): 836bcc9

Update README.md

Files changed (1)

README.md +66 -0

@@ -15,6 +15,72 @@ language:
 This model was converted to GGUF format from [`allura-org/TQ2.5-14B-Aletheia-v1`](https://huggingface.co/allura-org/TQ2.5-14B-Aletheia-v1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/allura-org/TQ2.5-14B-Aletheia-v1) for more details on the model.
+---
+Model details:
+-
+RP/Story hybrid model, a merge of Sugarquill and Neon. As with the Gemma version, I wanted to preserve Sugarquill's creative spark while making the model more steerable for RP. It proved more difficult this time, but I quite like the result regardless, even if the model is still somewhat temperamental.
+Should work for both RP and storywriting, either on raw completion or with back-and-forth cowriting in chat mode. Seems to be quite sensitive to low-depth instructions and samplers.
+Thanks to Toasty and Fizz for testing and giving feedback.
+Model was created by Auri.
+Notes about merging
+-
+It took me 20-something attempts to make this model. TIES didn't work at all, producing broken or nearly broken results every time. SLERP worked much better, and after just 3 attempts I got something I like. Sugarquill was really prone to overtaking the merge, so I had to reduce its share a lot, and the model still has a lot of influence from it.
+Format
+-
+Model responds to ChatML instruct formatting, exactly like its base model.
+<|im_start|>system
+{system message}<|im_end|>
+<|im_start|>user
+{user message}<|im_end|>
+<|im_start|>assistant
+{response}<|im_end|>
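As a rough illustration of the ChatML layout above, the following Python sketch assembles a prompt string from a list of messages. The helper name `build_chatml_prompt`, its `add_generation_prompt` flag, and the sample messages are hypothetical and not part of the model card. Only the `<|im_start|>` / `<|im_end|>` markers and the role names come from the template.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render {"role", "content"} messages as a ChatML prompt string.

    Hypothetical helper: it only mirrors the template shown in the README.
    """
    parts = []
    for message in messages:
        # Each turn is: <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the response.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a co-writer for a fantasy story."},
    {"role": "user", "content": "Continue the scene at the harbor."},
])
print(prompt)
```

Many frontends apply this template automatically. Building it by hand is mainly useful for raw-completion backends, where the prompt string is sent as-is.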
+Recommended Samplers
+-
+This one is a bit of a special snowflake, with special tastes. These seem to work pretty well:
+Temperature - 0.8
+Top-A - 0.3
+TFS - 0.75
+DRY - Multiplier 0.8 - Base 1.75 - Allowed length 3 - Range 1024
+As a starting point, you can try this ST Master Import
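To make two of the settings above concrete, here is a minimal Python sketch of temperature scaling followed by Top-A filtering on a toy token distribution. It assumes the commonly described Top-A rule (drop tokens whose probability is below `top_a * p_max ** 2`) and applies temperature first. Real backends may order samplers differently, and TFS and DRY are omitted. The function names and toy logits are hypothetical.

```python
import math
import random
from typing import Dict, Optional


def softmax_with_temperature(logits: Dict[str, float],
                             temperature: float) -> Dict[str, float]:
    """Divide logits by the temperature, then normalise with a stable softmax."""
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_a_filter(probs: Dict[str, float], top_a: float) -> Dict[str, float]:
    """Keep tokens with p >= top_a * p_max^2 (assumed Top-A rule), renormalise."""
    threshold = top_a * max(probs.values()) ** 2
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


def sample(logits: Dict[str, float], temperature: float = 0.8,
           top_a: float = 0.3, rng: Optional[random.Random] = None) -> str:
    """Draw one token after temperature scaling and Top-A filtering."""
    rng = rng or random.Random()
    probs = top_a_filter(softmax_with_temperature(logits, temperature), top_a)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Toy logits, invented for illustration only.
toy_logits = {"the": 3.0, "a": 2.0, "harbor": 0.5, "zebra": -4.0}
filtered = top_a_filter(softmax_with_temperature(toy_logits, 0.8), 0.3)
print(sorted(filtered))
print(sample(toy_logits, rng=random.Random(0)))
```

Because the threshold scales with the square of the top probability, Top-A prunes aggressively when the model is confident and leaves more candidates when the distribution is flat.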
+Merge Method
+-
+This model was merged using the SLERP merge method.
+Models Merged
+-
+The following models were included in the merge:
+- allura-org/TQ2.5-14B-Neon-v1
+- allura-org/TQ2.5-14B-Sugarquill-v1
+Configuration
+-
+The following YAML configuration was used to produce this model:
+base_model: allura-org/TQ2.5-14B-Sugarquill-v1
+dtype: bfloat16
+merge_method: slerp
+parameters:
+  t:
+  - value: 0.7
+slices:
+- sources:
+  - layer_range: [0, 48]
+    model: allura-org/TQ2.5-14B-Neon-v1
+  - layer_range: [0, 48]
+    model: allura-org/TQ2.5-14B-Sugarquill-v1
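For readers unfamiliar with SLERP, the sketch below shows spherical linear interpolation between two small vectors in plain Python. It is not the merge tool's implementation. It assumes the usual convention that `t = 0` returns the first (base) vector and `t = 1` the second, under which `t: 0.7` would weight the result toward the non-base model. The `slerp` function and the toy vectors are hypothetical stand-ins for real weight tensors.

```python
import math
from typing import List


def slerp(t: float, v0: List[float], v1: List[float],
          eps: float = 1e-8) -> List[float]:
    """Spherical linear interpolation between v0 (t=0) and v1 (t=1)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Fall back to plain linear interpolation for zero-length inputs.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two directions, clamped against rounding error.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # Nearly parallel vectors: SLERP degenerates to linear interpolation.
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy 2-D "weights": base model along x, other model along y.
base = [1.0, 0.0]
other = [0.0, 1.0]
merged = slerp(0.7, base, other)
print(merged)
```

Unlike plain averaging, the interpolated vector follows the arc between the two inputs, so for unit-length inputs the result also stays unit length.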
+---
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)