sampler, etc. settings for these MoE models

#1
by tachyphylaxis - opened

Hey,

I've been trying to run some of these, and they just seem ... bad lol. I mean, like they've been huffing model airplane glue or something. I can't imagine that they actually are that bad, or else people wouldn't be bothering with them. I've tried the various min-p settings in SillyTavern and a bunch of other ones, and the text just doesn't seem particularly ... coherent. Are they particularly sensitive to settings, or ... I mean, it's like working with a mule. I have to ban the EOS token sometimes just to make the thing keep going, and, of course, that ultimately doesn't fix it. Best case, the prose they generate seems to lack detail, and they don't like doing anything for more than a few paragraphs. I try to adjust the settings, but anything that makes them keep going just makes them way dumber haha.
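
For anyone unsure what the min-p setting mentioned above actually does, here is a rough sketch in plain Python. It is not SillyTavern's or any backend's actual code. The function name `min_p_filter`, the default values, and the choice to apply temperature before the filter are all assumptions for illustration (real backends differ on sampler order). The idea is that a token survives only if its probability is at least `min_p` times the probability of the most likely token.

```python
import math

def min_p_filter(logits, min_p=0.05, temperature=1.0):
    """Illustrative min-p filter (hypothetical helper, not a library API).

    Returns a renormalized probability list in which every token whose
    probability is below min_p * (top token probability) is set to 0.
    """
    # Temperature applied first here; this ordering is an assumption.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The cutoff scales with the model's confidence in its top choice.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]

# Example: the third token is far less likely than the top one, so it is dropped.
filtered = min_p_filter([2.0, 1.0, -5.0], min_p=0.1)
```

Because the cutoff is relative to the top token, a higher `min_p` trims more of the unlikely tail, which is why it is often paired with a higher temperature.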

These are the System Prompt and Story String I use for this model:

https://files.catbox.moe/9nhcqj.json
https://files.catbox.moe/b61tuk.json

Also, if you are using SillyTavern's Extras, here is the ChromaDB custom wrapper:
[Consider previous chats below as long-term memories for context. Analyze their relevance to the current conversation. If unrelated, disregard and focus on the present interaction.
{{memories}}
End of memories]

So, I'm no expert or anything, very novice, but I've been greatly enjoying this one and other similar models. (Noromaid 8x7B, FlatOrcamaid 13B, Noromaid 13B, etc.)
I pretty much exclusively use it for RP/Character chat stuff, so ymmv, of course. But I've never been overly disappointed. I find that sometimes you have to swipe to get a 'just right' response. Usually the swipe is for that, not because anything is gibberish or unusable. Another thing, I feel with Mixtral stuff, you really need to mind your input. I've had responses change greatly when I added more information.

For settings, I've been using a mishmash of what was originally recommended to me by someone when Noromaid first launched, tweaked over time with feedback from other people. I still don't know what some of these settings even do!
Text Completion Settings - https://files.catbox.moe/hr80n7.json
I'm on a 3090, so I have it at 24k context, and I use the 8-bit cache option so it just about fits on the card. For reference, my card is still outputting display, so the model doesn't get the entire 24 GB.
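
As a rough back-of-the-envelope check on why the 8-bit cache matters at 24k context, here is a small estimate of the KV cache size. The defaults assume the published Mixtral 8x7B shape (32 layers, 8 key/value heads, head dimension 128) and treat "24k" as 24576 tokens. Both are assumptions, the helper name `kv_cache_bytes` is made up for this sketch, and a real quantized cache may carry some extra overhead on top of this.

```python
def kv_cache_bytes(n_tokens, n_layers=32, n_kv_heads=8, head_dim=128,
                   bytes_per_value=2):
    """Estimate KV cache size in bytes (hypothetical helper).

    The factor of 2 covers keys and values; defaults assume a
    Mixtral-8x7B-like shape and a 16-bit cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_tokens

GIB = 1024 ** 3
context = 24 * 1024  # "24k" taken as 24576 tokens (assumption)

fp16_gib = kv_cache_bytes(context, bytes_per_value=2) / GIB
int8_gib = kv_cache_bytes(context, bytes_per_value=1) / GIB
print(f"16-bit cache: {fp16_gib:.2f} GiB, 8-bit cache: {int8_gib:.2f} GiB")
```

Under these assumptions the 8-bit cache takes about half the memory of the 16-bit one, which on a 24 GB card that is also driving a display can be the difference between fitting and not fitting.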

For Context Template and Instruct Mode, these are also from the original Noromaid recommended settings, but with additions. In particular, another user recommended their system prompt, though it's rather large. I've found it to be rather nice, so I use it with a lot of things now.
Context Template - https://files.catbox.moe/g9ix7q.json
Instruct Mode - https://files.catbox.moe/jaoohs.json

Again, very much not an expert, but I've at least been having fun with the model for RP/Chat. I've enjoyed it more than a lot of the Yi 34B models and some of the other Mixtral 8x7B models. Hopefully this helps!
