Exllamav2 Quants
#2 opened by llmixer
Thanks! I added a link in the model card.
Wish there was something above 4.0 bpw. 6 bpw doesn't fit in 48GB.
I added a 4.6 bpw quant, loads ok for me at 16K context in 48GB with auto-split.
alchemonaut/QuartetAnemoi-70B-t0.0001-b4.6-h8-exl2
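For anyone sizing quants for their card: a rough back-of-envelope check is weights ≈ parameters × bpw / 8, ignoring KV cache and activation overhead (so the real footprint is somewhat higher). This is just a sketch, not an exact measurement:

```python
# Rough VRAM estimate for EXL2 quants: weight footprint only,
# ignoring KV cache and activation overhead (assumption).
def weight_gb(params_b: float, bpw: float) -> float:
    """Approximate weight size in GB for a model with `params_b`
    billion parameters quantized at `bpw` bits per weight."""
    return params_b * bpw / 8  # bits -> bytes, billions -> GB

for bpw in (6.0, 4.6, 3.75):
    print(f"{bpw} bpw -> ~{weight_gb(70, bpw):.2f} GB weights")
```

For a 70B model this gives ~52.5 GB at 6 bpw (over 48GB before any cache), ~40.25 GB at 4.6 bpw, and ~32.8 GB at 3.75 bpw, which matches why 4.6 fits in 48GB and 3.75 fits in 40GB with 16K context.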
Here is 3.75 bpw if anyone is interested. Fits nicely with 16K context on 40GB.
Great model!