Clarify what is what, please.
If I understand correctly, all your files are quantized this way? You're saying that f16 Q5 and Q6 are better than Q8_0. You provide files called f16, then just q5 and q6 and then q8_0, q8_p. So what is what?
Are the plain q5 and q6 files as smart as the f16 file? Please just clarify what is what, since I cannot follow your naming pattern once you start distinguishing files by the particular method you're using. It might be a great improvement over other quantizations, but still: what is what?
Thx for clarification and keep up the good work!
If I understand correctly, all your files are quantized this way? You're saying that f16 Q5 and Q6 are better than Q8_0. You provide files called f16, then just q5 and q6 and then q8_0, q8_p. So what is what?
q8_0 uses f16 for the output and embedding tensors and q8_0 for all the others.
q8_p is quantized using the --pure flag of the quantization program.
Are the plain q5 and q6 files as smart as the f16 file? Please just clarify what is what, since I cannot follow your naming pattern once you start distinguishing files by the particular method you're using. It might be a great improvement over other quantizations, but still: what is what?
You are right, the naming is confusing: I changed it over time and I'm too lazy to rename all the files.
Thx for clarification and keep up the good work!
everything is explained here: https://huggingface.co/RobertSinclair
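For anyone trying to map the q8_0 and q8_p names above onto actual commands, here is a rough sketch. It assumes the quantization program is llama.cpp's quantize tool, whose `--pure`, `--output-tensor-type` and `--token-embedding-type` options match the descriptions above. The helper name `build_quantize_cmd`, the binary name `llama-quantize` and the file names are hypothetical, and these are not necessarily the exact commands used for the files in this repo.

```python
import shlex


def build_quantize_cmd(src, dst, qtype, *, f16_output_embed=False, pure=False,
                       binary="llama-quantize"):
    """Build an argument list for a llama.cpp-style quantize call.

    Hypothetical helper: the binary name and file names are assumptions.
    """
    cmd = [binary]
    if pure:
        # --pure: quantize every tensor to the same type, no mixtures
        cmd.append("--pure")
    if f16_output_embed:
        # keep the output and token-embedding tensors at f16
        cmd += ["--output-tensor-type", "f16",
                "--token-embedding-type", "f16"]
    # positional arguments: input file, output file, target type
    cmd += [src, dst, qtype]
    return cmd


# "q8_0" as described above: f16 output/embed tensors, q8_0 for the rest
mixed_cmd = build_quantize_cmd("model.f16.gguf", "model.q8_0.gguf", "q8_0",
                               f16_output_embed=True)

# "q8_p" as described above: plain q8_0 with --pure
pure_cmd = build_quantize_cmd("model.f16.gguf", "model.q8_p.gguf", "q8_0",
                              pure=True)

print(shlex.join(mixed_cmd))
print(shlex.join(pure_cmd))
```

The script only prints the two command lines, so it is safe to run without a model; pass the lists to `subprocess.run` if you actually want to quantize something.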
Ok! Thx. Great job, again - and thank you.