Best Mistral-Nemo 12B Finetune

#1 by WesPro

I really like this model; it's the best finetune of MN-12b I've tried so far, and I've tried probably over 20. Did you just switch the way you trained/merged the model, or did you also alter the datasets significantly compared to your earlier models? Are you planning to train other base models with the same datasets and a similar approach? If so, maybe the new-ish aya-expanse-32b model would be a good fit. I don't think anyone has done/released a finetune of it so far, which I don't understand, because I really like the model and it somehow feels a bit similar to this one.

Thanks for the kind words! It is really encouraging when people take the time to give feedback.

Did you just switch the way you trained/merged the model or did you also alter the datasets significantly compared to your earlier models?

The datasets are constantly evolving, but overall they're the same. I do swap things in and out as I see how they affect the model, though (e.g. a lot of horror content was removed in the Slush models, as it didn't seem to cure the positivity bias to any noticeable degree). A significant difference from earlier models is that I split training into distinct pretraining/completion phases and instruction phases rather than mixing the two together.
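To make the phase split concrete, here is a minimal sketch of what two sequential causal-LM passes could look like with Hugging Face transformers: a completion pass over raw long-form prose, followed by an instruction pass over chat-formatted text. The base checkpoint, file names and hyperparameters are placeholders for illustration, not the actual recipe behind this model.

```python
# Minimal sketch of a two-phase run with Hugging Face transformers.
# File names, hyperparameters and the base checkpoint are illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "mistralai/Mistral-Nemo-Base-2407"
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed so the collator can pad
model = AutoModelForCausalLM.from_pretrained(base)

def tokenize(batch):
    # Both phases use the same plain causal-LM objective; only the data differs.
    return tokenizer(batch["text"], truncation=True, max_length=4096)

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

def run_phase(name, data_file):
    dataset = load_dataset("json", data_files=data_file)["train"].map(tokenize, batched=True)
    Trainer(
        model=model,
        args=TrainingArguments(output_dir=name, num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=dataset,
        data_collator=collator,
    ).train()

# Phase 1: completion-style training on raw long-form prose.
run_phase("phase1-completion", "prose.jsonl")
# Phase 2: instruction tuning on chat-formatted examples (already rendered to text).
run_phase("phase2-instruct", "instruct.jsonl")
```

Keeping the phases separate also makes it easier to tell which one a change in behavior came from when swapping data in and out.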

Are you planning to train other base models with the same datasets and a similar approach?

Yeah, I am working on other Slush models at the moment. Aya-expanse-32b is an interesting idea. I haven't played with that one at all, but will look into it.

Hi! I'm just curious: when you say positivity bias, how strong is it? If we take a horror setting for novel-style roleplaying, would the model shy away from killing off NPCs if the character card emphasizes a violent killer? Or try to push for a redemption narrative for the main villain of the story? I'm asking because I'm trying to pin down what positivity bias actually means in models. I've noticed that all the MN finetunes I've tried so far (even vanilla Instruct) are capable of handling dark narratives and will portray an evil character as truly evil, without any redeeming qualities. I also don't get policed over any "toxic" behavior. So I'm trying to work out what exactly constitutes positivity bias in your model, for example.

Yes, for Nemo this is not a huge issue; for other models such as Llama 3.1 it can be. By positivity bias, I mean characters giving you challenges, making trouble for you, griefing you and so on, rather than everyone just being one happy group of people hanging out and being jolly with each other. Perhaps my use of the term positivity bias is a bit off here. I am aiming for a model whose characters will, for example, slap you if you suddenly kiss them, as would be a normal reaction in real life.

That's a fair point. I did notice in my earlier days that L3.1 is a bit too eager for "and we all held hands and skipped into the sunset". However, lately, as I've gotten better with prompting and setting instructions, I've noticed that you can steer the model into more realistic character portrayals (based on normal interactions, as you mentioned with the kissing). I haven't played with L3.1 in a while; so far I've been sticking to Mistral models and the occasional Qwen, so I'm not sure how much you can reinforce "troubled" characters in a model, even if the default behavior is to happily comply with anything.
