phi-3-mini-4k-instruct-gptq-4bit
phi-3-mini-4k-instruct-gptq-4bit is a 4-bit version of the Microsoft Phi 3 mini 4k Instruct model, quantized using the GPTQ method developed by Frantar et al. (2023).
Please refer to the original Phi 3 mini model card for details about the model preparation and training processes.
Dependencies
auto-gptq
– AutoGPTQ was used to quantize the phi-3 model.
vllm==0.4.2
– vLLM was used to host models for benchmarking.
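
As a rough illustration of how a GPTQ checkpoint like this one might be hosted with vLLM, the sketch below builds the command line for vLLM's OpenAI-compatible API server. It is not the exact setup used for the benchmarks: the repository id is a placeholder (this card does not state the full Hub path), and the helper name `build_vllm_server_command`, the port, and the context-length value are assumptions chosen for the example. Depending on your environment, additional flags may be needed.

```python
import shlex

# Placeholder: this card does not give the full Hub repository id,
# so replace "<org>" with the actual organization or user name.
MODEL_ID = "<org>/phi-3-mini-4k-instruct-gptq-4bit"


def build_vllm_server_command(model_id, max_model_len=4096, port=8000):
    """Return the argv list for vLLM's OpenAI-compatible API server.

    Hypothetical helper for illustration only. The default of 4096
    tokens is an assumption based on the "4k" in the model name.
    """
    return [
        "python", "-m", "vllm.entrypoints.openai.api_server",
        "--model", model_id,
        "--quantization", "gptq",  # tell vLLM the weights are GPTQ-quantized
        "--max-model-len", str(max_model_len),
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Print a shell-ready command; run it in an environment with vLLM installed.
    print(shlex.join(build_vllm_server_command(MODEL_ID)))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be passed directly to `subprocess.run` without shell quoting issues.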