Using the new gguf quant method may result in a worse overall performance than that of the old gguf quants.

#2
by TheYuriLover - opened

Source: https://github.com/ggerganov/llama.cpp/discussions/5006
The problem with using a calibration dataset is overfitting to a certain style, which in consequence makes the model worse in other respects.
Supposedly, the fix is to use a calibration dataset composed of random tokens instead.

NousResearch org

Thank you, we reverted to the old llama.cpp and it fixed it, afaik.
