Best Mistral-Nemo 12B Finetune

#1 by WesPro

I really like this model; it's the best finetune of MN-12b I've tried so far, and I've tried probably over 20. Did you just switch the way you trained/merged the model, or did you also alter the datasets significantly compared to your earlier models? Are you planning to train other base models with the same datasets and a similar approach? If so, maybe the new-ish aya-expanse-32b model would be a good fit. I don't think anyone has done/released a finetune of it so far, which I don't understand, because I really like the model and it somehow feels a bit similar to this one.

Thanks for the kind words! It is really encouraging when people take the time to give feedback.

Did you just switch the way you trained/merged the model or did you also alter the datasets significantly compared to your earlier models?

The datasets are constantly evolving, but overall they're the same. I do swap things in and out as I see how they affect the model, though (e.g. a lot of horror content was removed in the Slush models, as it didn't seem to cure the positivity bias to any noticeable degree). A significant difference from earlier models is that I split training into distinct pretraining/completion phases and instruction phases rather than mixing the two together.
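To make the phase split concrete, here is a minimal sketch of what two sequential causal-LM passes could look like with Hugging Face transformers: a completion pass over raw long-form prose, followed by an instruction pass over chat-formatted text. The base checkpoint, file names and hyperparameters are placeholders for illustration, not the actual recipe behind this model.

```python
# Minimal sketch of a two-phase run with Hugging Face transformers.
# File names, hyperparameters and the base checkpoint are illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "mistralai/Mistral-Nemo-Base-2407"
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed so the collator can pad
model = AutoModelForCausalLM.from_pretrained(base)

def tokenize(batch):
    # Both phases use the same plain causal-LM objective; only the data differs.
    return tokenizer(batch["text"], truncation=True, max_length=4096)

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

def run_phase(name, data_file):
    dataset = load_dataset("json", data_files=data_file)["train"].map(tokenize, batched=True)
    Trainer(
        model=model,
        args=TrainingArguments(output_dir=name, num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=dataset,
        data_collator=collator,
    ).train()

# Phase 1: completion-style training on raw long-form prose.
run_phase("phase1-completion", "prose.jsonl")
# Phase 2: instruction tuning on chat-formatted examples (already rendered to text).
run_phase("phase2-instruct", "instruct.jsonl")
```

Keeping the phases separate also makes it easier to tell which one a change in behavior came from when swapping data in and out.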

Are you planning to train other base models with the same datasets and a similar approach?

Yeah, I am working on other Slush models at the moment. Aya-expanse-32b is an interesting idea. I haven't played with that one at all, but will look into it.

Hi! I'm just curious: when you say positivity bias, how strong is it? If we take a horror setting for novel-style roleplaying, would the model shy away from killing off NPCs if the character card emphasizes a violent killer? Or try to push for a redemption narrative for the main villain of the story? I'm asking because I'm trying to pin down what positivity bias actually means in models. I've noticed that all the MN finetunes I've tried so far (even vanilla Instruct) are capable of handling dark narratives and will portray an evil character as truly evil, without any redeeming qualities. I also don't get policed over any "toxic" behavior. So I'm trying to work out what exactly constitutes positivity bias in your model, for example.

Yes, for Nemo this is not a huge issue; for other models such as Llama 3.1 it can be. By positivity bias, I mean characters giving you challenges, making trouble for you, griefing you and so on, rather than everyone just being one happy group of people hanging out and being jolly with each other. Perhaps my use of the term positivity bias is a bit off here. I am aiming for a model whose characters will, for example, slap you if you suddenly kiss them, as would be a normal reaction in real life.

That's a fair point. I did notice in my earlier days that L3.1 is a bit too eager for "and we all held hands and skipped into the sunset". However, lately, as I've gotten better with prompting and setting instructions, I've noticed that you can steer the model into more realistic character portrayals (based on normal interactions, as you mentioned with the kissing). I haven't played with L3.1 in a while; so far I've been sticking to Mistral models and the occasional Qwen, so I'm not sure how much you can reinforce "troubled" characters in a model, even if the default behavior is to happily comply with anything.
