---
tags:
  - uqff
  - mistral.rs
base_model: meta-llama/Llama-3.2-11B-Vision-Instruct
base_model_relation: quantized
---

# meta-llama/Llama-3.2-11B-Vision-Instruct, UQFF quantization

Run with [mistral.rs](https://github.com/EricLBuehler/mistral.rs). Documentation: [UQFF docs](https://github.com/EricLBuehler/mistral.rs/blob/master/docs/UQFF.md).

1. Flexible 🌀: Multiple quantization formats in one file format with one framework to run them all.
2. Reliable 🔒: Compatibility ensured with embedded and checked semantic versioning information from day 1.
3. Easy 🤗: Download UQFF models easily and quickly from Hugging Face, or use a local file.
4. Customizable 🛠️: Make and publish your own UQFF files in minutes.

## Files

| Name | Quantization type(s) | Example |
| -- | -- | -- |
| llama-3.2-11b-vision-q4k.uqff | Q4K | `./mistralrs-server -i vision-plain -m meta-llama/Llama-3.2-11B-Vision-Instruct -a vllama --from-uqff EricB/Llama-3.2-11B-Vision-Instruct-UQFF/llama-3.2-11b-vision-q4k.uqff` |
| llama-3.2-11b-vision-q8_0.uqff | Q8_0 | `./mistralrs-server -i vision-plain -m meta-llama/Llama-3.2-11B-Vision-Instruct -a vllama --from-uqff EricB/Llama-3.2-11B-Vision-Instruct-UQFF/llama-3.2-11b-vision-q8_0.uqff` |
| llama-3.2-11b-vision-hqq4.uqff | HQQ4 | `./mistralrs-server -i vision-plain -m meta-llama/Llama-3.2-11B-Vision-Instruct -a vllama --from-uqff EricB/Llama-3.2-11B-Vision-Instruct-UQFF/llama-3.2-11b-vision-hqq4.uqff` |
| llama-3.2-11b-vision-hqq8.uqff | HQQ8 | `./mistralrs-server -i vision-plain -m meta-llama/Llama-3.2-11B-Vision-Instruct -a vllama --from-uqff EricB/Llama-3.2-11B-Vision-Instruct-UQFF/llama-3.2-11b-vision-hqq8.uqff` |
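
The commands above use interactive mode (`-i`). `mistralrs-server` can also expose an OpenAI-compatible HTTP API; below is a minimal sketch of querying such a server from Python. It assumes the server was started with a port argument (e.g. `--port 1234`) instead of `-i`, that the `openai` Python package is installed, and the model name and image URL shown are placeholders, not values defined by this repository.

```python
# Minimal sketch: query a locally running mistralrs-server over its
# OpenAI-compatible API. Assumes the server listens on port 1234.
from openai import OpenAI

# No real API key is required for a local server; the value is a placeholder.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",  # illustrative model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {
                    "type": "image_url",
                    # Placeholder image URL; replace with your own.
                    "image_url": {"url": "https://example.com/image.jpg"},
                },
            ],
        }
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```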