Using the new gguf quant method may result in a worse overall performance than that of the old gguf quants.
#2 opened by TheYuriLover
Source: https://github.com/ggerganov/llama.cpp/discussions/5006
The problem with using a calibration dataset is that the quantization overfits to a particular style and, as a consequence, makes the model worse in other respects.
Supposedly, the suggested fix is to use a calibration dataset composed of random tokens instead.
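For reference, here is a minimal sketch of how such a random-token calibration file could be generated. This assumes the Hugging Face `transformers` tokenizer API; the tokenizer id, token count, and file name are illustrative (in practice you would use the tokenizer of the model being quantized):

```python
import random
from transformers import AutoTokenizer

# Illustrative tokenizer; swap in the tokenizer of the model you are quantizing.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

vocab_size = tokenizer.vocab_size
num_tokens = 50_000  # size of the calibration corpus; arbitrary choice

# Sample uniformly random token ids and decode them back to text,
# so the calibration data carries no stylistic bias toward any domain.
random_ids = [random.randrange(vocab_size) for _ in range(num_tokens)]
text = tokenizer.decode(random_ids)

with open("random_calibration.txt", "w", encoding="utf-8") as f:
    f.write(text)
```

The resulting file could then be fed to llama.cpp's `imatrix` tool to compute the importance matrix (e.g. `./imatrix -m model.gguf -f random_calibration.txt -o imatrix.dat`, though exact flag names may vary across versions).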
Thank you, we reverted to the old llama.cpp build and that fixed it, as far as I can tell.