GGUF quantized model, for use with llama.cpp once the following issue is resolved:
https://github.com/ggerganov/llama.cpp/issues/6747
Quantization: 4-bit