Question - Llama-3 configs.
I wanted to ask: is it currently necessary to replace the files in the Llama 3 model, as described in your README?
After I replaced the files and created the models, every fourth or fifth answer started repeating the same word, e.g.: "she walked along the road and found red red red red red red red red."
Now I'll test models created without replacing the files.
In short, I looked at one repository on Hugging Face, and the issue may be in the model itself that I downloaded for quantization. I'm not sure yet.
I also found another author who makes models, and his models have the same problem.
It's still necessary to set the correct EOS tokens, just to make sure it doesn't generate forever. I make and use my own quants with these and don't run into issues. Make sure you're using the correct prompting format when using the model.
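As a quick sanity check before quantizing, you can read the `eos_token_id` out of the model's config files. This is just a sketch (the helper name `read_eos_ids` is mine, not from any library); the background fact is that Llama-3-Instruct ends assistant turns with `<|eot_id|>` (ID 128009), so if `config.json` / `generation_config.json` only list `<|end_of_text|>` (128001), generation may never stop:

```python
import json
from pathlib import Path

def read_eos_ids(model_dir: str) -> dict:
    """Return {filename: eos_token_id} for the config files present in model_dir."""
    found = {}
    for name in ("config.json", "generation_config.json"):
        path = Path(model_dir) / name
        if path.exists():
            # eos_token_id may be a single int or a list of ints
            found[name] = json.loads(path.read_text()).get("eos_token_id")
    return found

# For a Llama-3-Instruct model you'd want to see 128009 included,
# e.g. eos_token_id = [128001, 128009].
```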
For SillyTavern, which I use, we use Presets; you can get them here (simple) or here (Virt's). Use the latest version of KoboldCpp.
If it doesn't work under these circumstances, the model you're quanting might be unstable.
Related Discussion: LLM-Discussions #5
This model uses the same config and quantization process; you can use it for your testing, and it's considered a good performer:
OK, thanks for the answer.