Official AQLM quantization of microsoft/Phi-3-mini-128k-instruct.
For this quantization, we used 1 codebook of 16 bits.
Results:
| Model | Quantization | MMLU (5-shot) | ArcC | ArcE | Hellaswag | Winogrande | PiQA | Model size, GB |
|---|---|---|---|---|---|---|---|---|
| microsoft/Phi-3-mini-128k-instruct | None | 0.6881 | 0.5418 | 0.8127 | 0.5980 | 0.7873 | 0.7340 | 7.6 |
| microsoft/Phi-3-mini-128k-instruct | 1x16 | 0.5815 | 0.4599 | 0.7845 | 0.5235 | 0.7666 | 0.6930 | 1.4 |
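A minimal loading sketch with `transformers`, assuming the `aqlm` inference kernels are installed (e.g. `pip install aqlm[gpu]`). The card does not state this repository's id, so `repo_id` is a placeholder to be replaced with the actual checkpoint name:

```python
# Sketch: load an AQLM-quantized checkpoint via transformers.
# Assumes `aqlm` kernels are installed; repo_id is a placeholder, not this card's real id.
from transformers import AutoModelForCausalLM, AutoTokenizer


def load_quantized(repo_id: str):
    """Load an AQLM checkpoint; quantized weights are decoded by aqlm kernels at runtime."""
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype="auto",   # keep the dtype stored in the checkpoint
        device_map="auto",    # spread layers across available devices
    )
    return model, tokenizer
```

After loading, the model is used like any other causal LM (`model.generate(...)`); the 1x16 scheme is what shrinks the on-disk size from 7.6 GB to 1.4 GB in the table above.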