Is there a working/quantized/exl2 (etc.) version that will fit on a single 24GB video card (4090)?
#170, opened by cleverest
...or am I just dreaming? I'm using Text-Generation-WebUI in Windows 11. Thank you.
@cleverest yeah, just search for "mixtral 8x7b exl2" and you should get a lot of results. Find something that's below 4 bpw and it will fit. If it still somehow doesn't fit, try using the 8-bit cache or even the 4-bit cache.
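To see roughly why "below 4 bpw" is the cutoff, here is a back-of-the-envelope sketch. It assumes the published Mixtral 8x7B figures (about 46.7B total parameters, 32 layers, 8 KV heads, head dimension 128) and treats 1 GB as 1024^3 bytes. The function names are made up for this example, and it ignores activation buffers, loader overhead, and whatever VRAM the Windows desktop is already using, so treat the numbers as a lower bound rather than a guarantee.

```python
# Rough VRAM estimate for a quantized Mixtral 8x7B. All constants are
# assumptions taken from the published model config, not measured values.
TOTAL_PARAMS = 46.7e9   # approximate total parameter count (all experts)
NUM_LAYERS = 32
NUM_KV_HEADS = 8        # grouped-query attention
HEAD_DIM = 128
GIB = 1024 ** 3


def weights_bytes(bpw: float) -> float:
    """Size of the quantized weights at `bpw` bits per weight."""
    return TOTAL_PARAMS * bpw / 8


def kv_cache_bytes(context_len: int, cache_bits: int = 16) -> float:
    """Size of the key/value cache for `context_len` tokens.

    The factor 2 covers keys and values; cache_bits is 16 for an FP16
    cache, 8 or 4 for the quantized cache options.
    """
    elements_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM
    return elements_per_token * context_len * cache_bits / 8


def estimate_vram_gib(bpw: float, context_len: int, cache_bits: int = 16) -> float:
    """Weights plus KV cache in GiB (excludes activations and overhead)."""
    return (weights_bytes(bpw) + kv_cache_bytes(context_len, cache_bits)) / GIB


if __name__ == "__main__":
    for bpw in (3.0, 3.5, 4.0):
        for cache_bits in (16, 8, 4):
            est = estimate_vram_gib(bpw, context_len=8192, cache_bits=cache_bits)
            print(f"{bpw} bpw, {cache_bits}-bit cache, 8k context: ~{est:.1f} GiB")
```

The weights dominate, so dropping the bpw frees far more memory than shrinking the cache. The cache options mostly matter when you are already close to the limit or want a longer context.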
Get yourself LM Studio. I found it the easiest way.