Sao mentioned that this model has some fixes, so please read their notes at the bottom of the model card. I ran some tests through it to be safe: it seemed to perform well at a 16k context limit, but I can't confirm anything beyond that. Unlike Stardust, it has some of the evil characters giving you second chances, so you might need to prompt that out.

You probably need a Min P of at most 0.1 with Nemo models. This of course assumes you aren't using any other sliders that modify token selection probability (e.g. Top K). Go beyond that and you seem to start getting broken words.

Anyway, another Lyra model. Hype indeed. Sao delivers once again.

This is the 8bpw EXL2 quant of this model. You can find the original model here.
You can find the 6bpw version here.
You can find the 4bpw version here.

Mistral-NeMo-12B-Lyra-v4, a variation of Lyra-v4a1, layered over Lyra-v3, which was built on top of Lyra-v2a2, which itself was built upon Lyra-v2a1.

Model Versioning

[See Previous Models]
  |
Lyra-v4a1
  |
  ------------> Lyra-v4 [Separate RL step targeting Instruct and Coherency over base NeMo instead of SFT first. The result is merged with Lyra-v4a1, which fixes most quant-based issues. Somehow.]

This uses ChatML, or any of its variants which were included in previous versions.


<|im_start|>system
This is the system prompt.<|im_end|>
<|im_start|>user
Instructions placed here.<|im_end|>
<|im_start|>assistant
The model's response will be here.<|im_end|>
--------------------------------------------------
[INST]system
This is another system prompt.[/INST]
[INST]user
Your instructions placed here.[/INST]
[INST]assistant
The model's response will be here.[/INST]
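
If you are building prompts by hand rather than through a frontend, the ChatML layout above can be sketched roughly like this. The function name `build_chatml_prompt` and the `add_generation_prompt` flag are made up for illustration and are not part of any particular library; the newline placement follows the template shown above.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML string.

    Hypothetical helper: mirrors the template shown in this card,
    not any specific frontend's implementation.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>role\ncontent<|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "This is the system prompt."},
    {"role": "user", "content": "Instructions placed here."},
])
print(prompt)
```

Comparing a string built this way against what your frontend actually sends is one way to spot a template that is off.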

Recommended Samplers:

Temperature: 0.6 - 1 # Make sure min_p is set before Temperature in Sampler Orders
min_p: 0.1 - 0.2 # Crucial for NeMo
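
To illustrate why the sampler order matters, here is a minimal sketch of min_p filtering applied before temperature. It assumes the usual definition of min_p (keep tokens whose probability is at least `min_p` times the top token's probability); the function names are hypothetical and real backends may differ in details.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_then_temperature(logits, min_p=0.1, temperature=1.0):
    """Filter with min_p on the untempered distribution, then apply temperature.

    Returns a dict mapping surviving token index -> probability.
    Hypothetical helper for illustration only.
    """
    probs = softmax(logits)
    threshold = min_p * max(probs)
    # Tokens below the threshold are dropped before temperature can boost them.
    kept = [i for i, p in enumerate(probs) if p >= threshold]
    scaled = softmax([logits[i] / temperature for i in kept])
    return dict(zip(kept, scaled))


# Toy distribution: token probabilities 0.5, 0.4, 0.07, 0.03
toy_logits = [math.log(p) for p in (0.5, 0.4, 0.07, 0.03)]
print(min_p_then_temperature(toy_logits, min_p=0.1, temperature=1.0))
print(min_p_then_temperature(toy_logits, min_p=0.2, temperature=0.8))
```

With min_p first, the unlikely tail is cut based on the model's original confidence, so a higher temperature only reshuffles probability among tokens that already passed the filter.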

Recommended Stopping Strings:

<|im_end|>
</s>
[/INST]
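
If your backend does not handle stopping strings for you, trimming the output yourself could look something like the sketch below. `truncate_at_stop` is a hypothetical helper, not a function from any specific library; it simply cuts the text at the earliest stopping string it finds.

```python
STOP_STRINGS = ["<|im_end|>", "</s>", "[/INST]"]


def truncate_at_stop(text, stop_strings=STOP_STRINGS):
    """Cut generated text at the earliest occurrence of any stopping string."""
    cut = len(text)
    for stop in stop_strings:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


raw = "Hello there.<|im_end|>\n<|im_start|>user"
print(truncate_at_stop(raw))
```

Any extra malformed strings you run into can be appended to `STOP_STRINGS`.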

Notes

- I think I fixed the extra token stuff some users seem to be facing, while retaining everything else? It's some error alright.
- If you're using XML tags, you may see weird malformed stopping strings. Just add them to your current list and move on.
- It's pretty nice, imo. I've been messing around with it a lot.
- Make sure the ChatML template is correct. I think there are some issues with the one used in SillyTavern, which might cause improper replies?