EXL2 Quantizations
This model is based on the Qwen2.5-0.5B-Instruct model and is quantized to 4 bits in the EXL2 format using the AutoQuant notebook: https://colab.research.google.com/drive/1b6nqC7UZVt8bx4MksX7s656GXPM-eWw4
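For reference, a 4-bit EXL2 quantization like this one can be produced with exllamav2's own `convert.py` script. The sketch below is a hypothetical reproduction run (the AutoQuant notebook wraps a similar invocation); the directory names are placeholders and the command must be run from a checkout of the exllamav2 repository.

```python
# Hypothetical sketch: reproducing a 4-bit EXL2 quantization with exllamav2's convert.py.
# Assumes the current working directory is an exllamav2 checkout and the input
# directory holds the unquantized Hugging Face model.
import subprocess

subprocess.run(
    [
        "python", "convert.py",
        "-i", "Qwen2.5-0.5B-Instruct",              # unquantized HF model directory (placeholder)
        "-o", "work",                               # temporary working directory (placeholder)
        "-cf", "Qwen2.5-0.5B-Instruct-exl2-4bpw",   # output directory for the quantized model (placeholder)
        "-b", "4.0",                                # target bits per weight
    ],
    check=True,
)
```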
You can learn more about the EXL2 format here: https://github.com/turboderp/exllamav2. Feel free to use this model however you want.
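Below is a minimal inference sketch using the exllamav2 Python API. The class names follow recent exllamav2 releases and the model directory path is a placeholder for wherever you downloaded the quantized weights; treat it as an illustration rather than the canonical loading code for this model.

```python
# Minimal sketch: loading an EXL2-quantized model with exllamav2 and generating text.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "Qwen2.5-0.5B-Instruct-exl2-4bpw"  # local EXL2 model directory (placeholder)
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)               # load weights, splitting across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7

print(generator.generate_simple("Hello, my name is", settings, num_tokens=64))
```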
Base model: Qwen/Qwen2.5-0.5B