Is this based on 8x7B Instruct? And if so, on which version, v0.1 or v0.2?
Hi @seedboxai
Thanks a lot for the great work with KafkaLM!
I gave the model a spin with some in-depth testing and observed some weird behaviour that would make sense if it is either not using the correct settings (see the sliding-window fix in v0.2 of Mixtral-8x7B) or if it is not an instruction-tuned model. Can you help me understand whether this is supposed to be a model usable for chat interactions, and if so, which Mixtral base was used?
The behaviour observed was that the model almost always went off the rails, talking about theoretical scenarios and writing for pages. It stopped correctly, so I believe the EOS tokens are fine.
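For reference, the sliding-window setting can be checked directly from the model config; a minimal sketch, with the model id assumed (adjust to the actual repo name):

```python
from transformers import AutoConfig

# Model id is an assumption; replace with the actual KafkaLM repo name.
config = AutoConfig.from_pretrained("seedboxai/KafkaLM-8x7B-German-V0.1")

# Mixtral does not use sliding-window attention, so this should be None;
# a stale value here would hint at the config issue mentioned above.
print(config.sliding_window)
```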
Many greetings
Robert
Hi Robert,
Thanks for the feedback. The model was trained with the chat template also used for Zephyr. Can you provide an example of the prompts you used?
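For anyone reading along, here is a minimal sketch of how such a Zephyr-style template is applied via `apply_chat_template`; the model id and messages are illustrative, not taken from this thread:

```python
from transformers import AutoTokenizer

# Model id is an assumption; adjust to the actual KafkaLM repo name.
tokenizer = AutoTokenizer.from_pretrained("seedboxai/KafkaLM-8x7B-German-V0.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who was Franz Kafka?"},
]

# Renders the Zephyr-style prompt, roughly:
# <|system|>\nYou are a helpful assistant.</s>\n<|user|>\nWho was Franz Kafka?</s>\n<|assistant|>\n
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```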
I used this model last week for creating large-scale synthetic datasets with almost no weird behavior observed. Therefore, I am sure that the issue can be clarified.
Cheers,
Dennis
And I used the Mixtral 8x7B base, not the instruct model.