Official AQLM quantization of microsoft/Phi-3-mini-4k-instruct.
For this quantization, we used 1 codebook of 16 bits (the 1x16 configuration).
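A minimal loading sketch, assuming the checkpoint is loaded through the standard `transformers` API with the `aqlm` and `accelerate` packages installed. The helper name `load_aqlm_model` and its `repo_id` parameter are hypothetical and used only for illustration; this card does not state the repository id, so pass the id of this repository yourself.

```python
def load_aqlm_model(repo_id: str):
    """Load a quantized checkpoint and its tokenizer (illustrative helper).

    Assumptions, not taken from this card:
      * `transformers`, `torch`, `accelerate` and `aqlm` are installed;
      * `repo_id` is the Hugging Face id of this quantized repository.
    """
    # Imported lazily so that defining the helper does not pull in heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint config
        device_map="auto",    # let accelerate place the weights on available devices
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_aqlm_model("<this-repo-id>")` should then give objects usable with `model.generate`. Older `transformers` releases may additionally need `trust_remote_code=True` for Phi-3.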
Results:
| Model | Quantization | MMLU (5-shot) | ArcC | ArcE | Hellaswag | Winogrande | PiQA | Model size, GB |
|---|---|---|---|---|---|---|---|---|
| microsoft/Phi-3-mini-4k-instruct | None | 0.6949 | 0.5529 | 0.8325 | 0.6055 | 0.8020 | 0.7364 | 7.6 |
| microsoft/Phi-3-mini-4k-instruct | 1x16 | 0.5818 | 0.4642 | 0.7807 | 0.5311 | 0.7715 | 0.7072 | 1.4 |
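To make the trade-off in the table easier to read, the sketch below computes the size reduction and the fraction of each baseline score that the 1x16 model retains. The numbers are copied from the table; the helper names `retention` and `compression_ratio` are illustrative and not part of any library.

```python
# Scores copied from the results table (unquantized model vs. AQLM 1x16).
BASELINE = {
    "MMLU (5-shot)": 0.6949, "ArcC": 0.5529, "ArcE": 0.8325,
    "Hellaswag": 0.6055, "Winogrande": 0.8020, "PiQA": 0.7364,
}
AQLM_1X16 = {
    "MMLU (5-shot)": 0.5818, "ArcC": 0.4642, "ArcE": 0.7807,
    "Hellaswag": 0.5311, "Winogrande": 0.7715, "PiQA": 0.7072,
}
# Sizes from the model size column of the same table.
BASELINE_SIZE = 7.6
AQLM_SIZE = 1.4


def retention(baseline: dict, quantized: dict) -> dict:
    """Fraction of the baseline score kept by the quantized model, per task."""
    return {task: quantized[task] / baseline[task] for task in baseline}


def compression_ratio(original_size: float, quantized_size: float) -> float:
    """How many times smaller the quantized checkpoint is."""
    return original_size / quantized_size


if __name__ == "__main__":
    ratio = compression_ratio(BASELINE_SIZE, AQLM_SIZE)
    print(f"size reduction: {ratio:.2f}x")
    for task, kept in retention(BASELINE, AQLM_1X16).items():
        print(f"{task}: {kept:.1%} of baseline")
```

On these figures the checkpoint is a little over five times smaller, while every benchmark keeps more than 80% of its baseline score.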