IQ1 quants

#2
by OrangeApples - opened

Hello @Artefact2 . I noticed some new IQ1 quants on Nexesenex's repo. Would you be willing to upload some of those quants for this 103B model?

On a related note, I'm also curious to see how well IQ1 quants of Goliath 120B perform. If they perform even at least as well as the Q2_K gguf, then it would open up a whole new world of models for 24GB VRAM users (assuming they're small enough to be fully offloaded).
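
For a rough sense of the "small enough to be fully offloaded" question, here is a back-of-the-envelope sketch. It assumes the weight size is roughly parameters × bits-per-weight / 8. The bits-per-weight figures (about 1.56 for an IQ1-style quant, about 2.06 for IQ2_XXS), the nominal parameter counts taken from the model names, and the 2 GiB reserve for context and buffers are all illustrative assumptions, not measured values. The helper names are hypothetical and not part of any library.

```python
# Rough estimate only: real GGUF files mix tensor types, so actual sizes differ.

GIB = 2**30  # bytes per GiB


def quant_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    return n_params * bits_per_weight / 8 / GIB


def fits_in_vram(n_params: float, bits_per_weight: float,
                 vram_gib: float = 24.0, reserve_gib: float = 2.0) -> bool:
    """True if the weights plus an assumed reserve fit in the given VRAM."""
    return quant_size_gib(n_params, bits_per_weight) + reserve_gib <= vram_gib


if __name__ == "__main__":
    # Assumed nominal bits per weight; the draft IQ1 format may differ.
    formats = {"IQ1 (assumed ~1.56 bpw)": 1.5625, "IQ2_XXS (assumed ~2.06 bpw)": 2.0625}
    # Nominal parameter counts read off the model names.
    models = {"103B": 103e9, "120B": 120e9}
    for model, n_params in models.items():
        for fmt, bpw in formats.items():
            size = quant_size_gib(n_params, bpw)
            print(f"{model} {fmt}: {size:.1f} GiB, fits in 24 GiB: {fits_in_vram(n_params, bpw)}")
```

Under these assumptions, size alone says nothing about quality; it only shows which quants could plausibly be fully offloaded.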

No, that quant format is still a draft/WIP. Maybe after it gets merged into mainline.

If they perform even at least as well as the Q2_K gguf

They don't. IQ1 is already much worse than IQ2_XXS.

That's unfortunate. Was hoping for more but I guess my expectations were too high for a 1-bit quant. Thanks anyway!

OrangeApples changed discussion status to closed
