---
tags:
- uqff
- mistral.rs
base_model: google/gemma-1.1-7b-it
base_model_relation: quantized
---

<!-- Autogenerated from user input. -->

# `google/gemma-1.1-7b-it`, UQFF quantization

Run with [mistral.rs](https://github.com/EricLBuehler/mistral.rs). Documentation: [UQFF docs](https://github.com/EricLBuehler/mistral.rs/blob/master/docs/UQFF.md).

1) **Flexible**: Multiple quantization formats in *one* file format with *one* framework to run them all.
2) **Reliable**: Compatibility ensured with *embedded* and *checked* semantic versioning information from day 1.
3) **Easy**: Download UQFF models *easily* and *quickly* from Hugging Face, or use a local file.
4) **Customizable**: Make and publish your own UQFF files in minutes.

## Files

|Quantization type(s)|Example|
|--|--|
|FP8|`./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff gemma1.1-7b-instruct-f8e4m3.uqff`|
|HQQ4|`./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff gemma1.1-7b-instruct-hqq4.uqff`|
|HQQ8|`./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff gemma1.1-7b-instruct-hqq8.uqff`|
|Q3K|`./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff gemma1.1-7b-instruct-q3k.uqff`|
|Q4K|`./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff gemma1.1-7b-instruct-q4k.uqff`|
|Q5K|`./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff gemma1.1-7b-instruct-q5k.uqff`|
|Q8_0|`./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff gemma1.1-7b-instruct-q8_0.uqff`|
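
The commands in the table differ only in the `.uqff` file name, so the mapping can be scripted. The following is a minimal sketch, not part of mistral.rs: `uqff_file` and `uqff_command` are hypothetical helper names, and the file names and flags are copied from the table above rather than looked up from the repository.

```shell
# Hypothetical helper (not part of mistral.rs): maps a quantization label
# from the table to the matching UQFF file name.
uqff_file() {
  case "$1" in
    FP8)  suffix="f8e4m3" ;;
    HQQ4) suffix="hqq4" ;;
    HQQ8) suffix="hqq8" ;;
    Q3K)  suffix="q3k" ;;
    Q4K)  suffix="q4k" ;;
    Q5K)  suffix="q5k" ;;
    Q8_0) suffix="q8_0" ;;
    *)    echo "unknown quantization: $1" >&2; return 1 ;;
  esac
  echo "gemma1.1-7b-instruct-${suffix}.uqff"
}

# Hypothetical helper: prints (does not run) the table's command
# for the given quantization label.
uqff_command() {
  file="$(uqff_file "$1")" || return 1
  echo "./mistralrs-server -i plain -m EricB/gemma-1.1-7b-it-UQFF --from-uqff ${file}"
}
```

After sourcing these functions, `uqff_command Q4K` prints the Q4K row's command so it can be inspected before being run. An unknown label returns a non-zero status.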