Quantization
#3, opened by setareh1
I'm wondering whether MiniChat-3B can be quantized, for example with ExLlamaV2?
Yes, it can, since it uses the same architecture as Llama. I am not very familiar with ExLlama, though, so you may want to give it a try.
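If you want to confirm the architecture point before running a quantizer, one rough way is to look at the model's `config.json`. The sketch below assumes the usual Hugging Face layout, where `config.json` carries `architectures` and `model_type` fields. The helper names `is_llama_architecture` and `load_config` are made up for illustration, and the example values are placeholders rather than values copied from MiniChat-3B's actual config.

```python
import json
from pathlib import Path


def is_llama_architecture(config: dict) -> bool:
    """Return True if a Hugging Face style config dict declares a Llama model."""
    # "architectures" is normally a list such as ["LlamaForCausalLM"];
    # fall back to an empty list if the field is missing or null.
    architectures = config.get("architectures") or []
    return config.get("model_type") == "llama" or any(
        name.startswith("Llama") for name in architectures
    )


def load_config(model_dir: str) -> dict:
    """Read config.json from a local model directory (hypothetical helper)."""
    return json.loads((Path(model_dir) / "config.json").read_text(encoding="utf-8"))


# Placeholder config for illustration only, not MiniChat-3B's real file.
example_config = {"architectures": ["LlamaForCausalLM"], "model_type": "llama"}
print(is_llama_architecture(example_config))
```

For a downloaded copy you would call `is_llama_architecture(load_config("path/to/model"))`. A `True` result only suggests that Llama-oriented tooling is worth trying; it does not guarantee the quantization will succeed.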
It works very well with ExLlama. I had no issues with the quantization script or with using the quantized models.