Alternative quantizations.
#7
by ZeroWw
https://huggingface.co/ZeroWw/Meta-Llama-3-8B-Instruct-abliterated-v3-GGUF
My own (ZeroWw) quantizations: output and embed tensors quantized to f16, all other tensors quantized to q5_k or q6_k.
Result: both f16.q6 and f16.q5 are smaller than the standard q8_0 quantization, and they perform as well as the pure f16.
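For anyone who wants to reproduce this mix: llama.cpp's `llama-quantize` tool accepts `--output-tensor-type` and `--token-embedding-type` overrides, which keep those tensors at f16 while the rest are quantized to the chosen k-quant. Below is a minimal Python sketch; the input/output file names and the local binary path are assumptions for illustration, not taken from the repo.

```python
# Minimal sketch: produce the f16.q5_k / f16.q6_k mixes described above,
# assuming a local llama.cpp build providing the `llama-quantize` binary.
# File names are hypothetical placeholders.
import subprocess

SRC = "Meta-Llama-3-8B-Instruct-abliterated-v3.f16.gguf"  # full-precision GGUF input

for quant in ("q5_k", "q6_k"):
    subprocess.run(
        [
            "./llama-quantize",
            "--output-tensor-type", "f16",    # keep the output tensor at f16
            "--token-embedding-type", "f16",  # keep the embedding tensor at f16
            SRC,
            f"Meta-Llama-3-8B-Instruct-abliterated-v3.f16.{quant}.gguf",
            quant,  # quantization type for all remaining tensors
        ],
        check=True,
    )
```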
@failspy hello! Thanks for the abliterated versions. Could you please also do this one: https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-4194k
And Mistral Instruct v0.3?