Quantization

#3
by setareh1 - opened

I'm wondering whether MiniChat-3B can be quantized, for example with ExLlamaV2.

Yes, it can, since it uses the same architecture as Llama. I'm not very familiar with exllama, though, so you may want to give it a try yourself.

It works very well with exllama: no issues with the quantization script or with using the quantized models.
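
For anyone who wants to try this, here is a minimal sketch of how the ExLlamaV2 conversion script might be invoked. It only builds the command line and does not run anything. The flags (`-i` input model directory, `-o` working directory, `-cf` output directory for the compiled model, `-b` target bits per weight) are my understanding of ExLlamaV2's `convert.py`, so check them against the version you have installed. The helper name `build_exl2_convert_command`, the directory paths, and the 4.0 bits-per-weight value are all hypothetical and not taken from this thread.

```python
import shlex


def build_exl2_convert_command(model_dir, work_dir, out_dir, bits_per_weight=4.0):
    """Return the argument list for an ExLlamaV2 convert.py run.

    Hypothetical helper: it assumes convert.py accepts -i, -o, -cf and -b,
    and that it is run from the root of the ExLlamaV2 repository.
    """
    return [
        "python", "convert.py",
        "-i", model_dir,             # unquantized model directory (assumed local copy)
        "-o", work_dir,              # scratch directory for intermediate files
        "-cf", out_dir,              # directory for the finished quantized model
        "-b", str(bits_per_weight),  # target average bits per weight
    ]


if __name__ == "__main__":
    # All paths below are placeholders; replace them with your own.
    cmd = build_exl2_convert_command(
        "models/MiniChat-3B",
        "work/minichat-exl2",
        "models/MiniChat-3B-exl2",
        bits_per_weight=4.0,
    )
    # Print a shell-safe version of the command so it can be inspected first.
    print(shlex.join(cmd))
```

Printing the command instead of executing it makes it easy to inspect and adjust before starting a long conversion.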
