This is a 4-bit GPTQ-quantized version of Qwen2.5-14B-Instruct.
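
As a rough illustration of what 4-bit quantization means for storage, the sketch below estimates weight size for a model of this scale. The parameter count (about 14.7 billion), the group size of 128, and the per-group 16-bit scale plus 4-bit zero point are assumptions chosen for illustration, not values read from this repository. Real file sizes also depend on which layers are left unquantized.

```python
def estimate_weight_gib(n_params, weight_bits, group_size=None,
                        scale_bits=16, zero_bits=4):
    """Rough weight-storage estimate in GiB.

    Ignores unquantized layers, file metadata, and runtime memory such as
    the KV cache. The per-group overhead layout is an assumption.
    """
    total_bits = n_params * weight_bits
    if group_size is not None:
        # Assumed layout: each group of weights stores one scale and one zero point.
        n_groups = n_params / group_size
        total_bits += n_groups * (scale_bits + zero_bits)
    return total_bits / 8 / 2**30


N_PARAMS = 14.7e9  # assumed approximate parameter count, for illustration only

fp16_gib = estimate_weight_gib(N_PARAMS, 16)
int4_gib = estimate_weight_gib(N_PARAMS, 4, group_size=128)
print(f"fp16 weights:  {fp16_gib:.1f} GiB")
print(f"4-bit weights: {int4_gib:.1f} GiB")
```

Under these assumptions the quantized weights take a little over a quarter of the 16-bit size, since the per-group scales and zero points add a small overhead on top of the 4 bits per weight.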

Inference API (serverless) does not yet support mlc-llm models for this pipeline type.

Model tree for Jeethu/Qwen2.5-14B-Instruct-PLLM

Base model: Qwen/Qwen2.5-14B
Finetuned
Quantized (77): this model