Mixtral HQQ Quantized Models - a mobiuslabsgmbh Collection

updated Mar 29

4-bit and 2-bit Mixtral models quantized with HQQ (https://github.com/mobiusml/hqq).
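The model names below carry suffixes such as 4bit_g64 and 2bit_g16. Assuming these denote the bit width and the quantization group size (an interpretation of the names, not something this page states), the following minimal sketch shows what group-wise low-bit quantization means. It uses plain min-max round-to-nearest in NumPy; it is not the HQQ algorithm and does not use the hqq library, and the function names are hypothetical.

```python
import numpy as np

def quantize_groupwise(w, n_bits=4, group_size=64):
    """Min-max affine quantization with one scale and offset per group.

    Illustrative only: this is round-to-nearest, not HQQ's method.
    The number of elements in `w` must be divisible by `group_size`.
    """
    flat = w.reshape(-1, group_size).astype(np.float32)
    w_min = flat.min(axis=1, keepdims=True)
    w_max = flat.max(axis=1, keepdims=True)
    levels = 2 ** n_bits - 1                   # e.g. 15 for 4-bit, 3 for 2-bit
    scale = (w_max - w_min) / levels
    scale = np.where(scale == 0, 1.0, scale)   # guard against constant groups
    q = np.clip(np.round((flat - w_min) / scale), 0, levels).astype(np.uint8)
    return q, scale, w_min

def dequantize_groupwise(q, scale, w_min, shape):
    """Map integer codes back to approximate float weights."""
    return (q.astype(np.float32) * scale + w_min).reshape(shape)

# Hypothetical weight matrix standing in for one linear layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((128, 256)).astype(np.float32)

q4, s4, z4 = quantize_groupwise(w, n_bits=4, group_size=64)
q2, s2, z2 = quantize_groupwise(w, n_bits=2, group_size=16)

err4 = np.abs(w - dequantize_groupwise(q4, s4, z4, w.shape)).mean()
err2 = np.abs(w - dequantize_groupwise(q2, s2, z2, w.shape)).mean()
print(f"mean abs error, 4-bit g64: {err4:.4f}")
print(f"mean abs error, 2-bit g16: {err2:.4f}")
```

Smaller groups store more scale and offset values per weight, which is one reason a 2-bit variant might pair a low bit width with a small group size: the extra metadata buys back some accuracy at the cost of memory.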

mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-4bit_g64-HQQ

Text Generation • Updated Jan 8 • 13 • 9
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-2bit_g16_s128-HQQ

Text Generation • Updated Jan 8 • 4 • 9
mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-2bit_g16_s128-HQQ

Text Generation • Updated Jan 8 • 10 • 4
mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-4bit_g64-HQQ

Text Generation • Updated Dec 11, 2023 • 13 • 1
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bit-HQQ

Text Generation • Updated Jan 8 • 7 • 38

Note: If you are considering a 2-bit instruct model, use this one.
mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-attn-4bit-moe-2bit-HQQ

Text Generation • Updated Jan 8 • 16 • 6

Note: If you are considering a 2-bit base model, use this one.
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bit-metaoffload-HQQ

Text Generation • Updated Feb 29 • 25 • 15

Note: If you are considering a 2-bit model but are GPU-poor, this is a good option. It requires 13GB of RAM, but it will be slower.
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-3bit-metaoffload-HQQ

Text Generation • Updated Feb 29 • 5 • 13
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bitgs8-metaoffload-HQQ

Text Generation • Updated Feb 29 • 7 • 20