Text Generation
Transformers
Safetensors
mixtral
Mixture of Experts
frankenmoe
Merge
mergekit
lazymergekit
Locutusque/TinyMistral-248M-v2
Locutusque/TinyMistral-248M-v2.5
Locutusque/TinyMistral-248M-v2.5-Instruct
jtatman/tinymistral-v2-pycoder-instruct-248m
Felladrin/TinyMistral-248M-SFT-v4
Locutusque/TinyMistral-248M-v2-Instruct
text-generation-inference
Inference Endpoints
Locutusque committed • 2b4c079
1 Parent(s): cd02536
Update README.md

README.md CHANGED
@@ -51,7 +51,19 @@ TinyMistral-6x248M is a Mixure of Experts (MoE) made with the following models u
 * [Felladrin/TinyMistral-248M-SFT-v4](https://huggingface.co/Felladrin/TinyMistral-248M-SFT-v4)
 * [Locutusque/TinyMistral-248M-v2-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2-Instruct)
 
-The resulting model is then pre-trained on 600,000 examples of nampdn-ai/mini-peS2o
+The resulting model is then pre-trained on 600,000 examples of nampdn-ai/mini-peS2o.
+
+We don't recommend using the Inference API as the model has serious performance degradation.
+
+### Recommended inference parameters
+
+```
+do_sample: true
+temperature: 0.2
+top_p: 0.14
+top_k: 12
+repetition_penalty: 1.15
+```
 
 ## 🧩 Configuration
 
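
For reference on the pre-training step mentioned in the diff: a slice of nampdn-ai/mini-peS2o like the 600,000 examples cited there is typically pulled from the Hub with the datasets library. The sketch below is only an assumption about how such a subset could be streamed; the split name and the "text" field are guesses, not taken from this commit.

```
# Hedged sketch: stream ~600,000 examples of nampdn-ai/mini-peS2o.
# The "train" split and the "text" column are assumptions; check the dataset card.
from datasets import load_dataset

stream = load_dataset("nampdn-ai/mini-peS2o", split="train", streaming=True)
subset = stream.take(600_000)  # lazily yields the first 600k examples

for i, example in enumerate(subset):
    print(example.get("text", "")[:200])  # peek at a few documents
    if i >= 2:
        break
```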
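
The recommended inference parameters added in this commit are ordinary generation settings, so, as a minimal sketch, they map onto a transformers text-generation pipeline as shown below. The repo id Locutusque/TinyMistral-6x248M and the prompt are assumptions for illustration, not taken from this page.

```
# Hedged sketch: apply the recommended sampling parameters via transformers.
# The model id below is assumed; substitute the actual checkpoint path.
from transformers import pipeline

generator = pipeline("text-generation", model="Locutusque/TinyMistral-6x248M")

output = generator(
    "The mitochondria is",
    max_new_tokens=128,       # not part of the recommended set; added for the example
    do_sample=True,
    temperature=0.2,
    top_p=0.14,
    top_k=12,
    repetition_penalty=1.15,
)
print(output[0]["generated_text"])
```

The same keyword arguments can also be passed directly to `model.generate()` if the pipeline wrapper isn't used.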