GGUF Quantizations

Volko76's Collections

updated 2 days ago

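GGUF is the model file format used by llama.cpp. Quantized GGUF models can run on the CPU, on the GPU, or split across both. It is currently one of the most widely used quantization formats for running models locally. Read more here: https://github.com/ggerganov/llama.cpp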
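
For readers new to the idea, here is a minimal sketch of what block-wise quantization does: weights are split into small blocks, and each block is stored as one floating-point scale plus low-bit integer codes. This is a simplified illustration, not llama.cpp's actual code. The function names and the block size of 32 are assumptions chosen for the example.

```python
# Simplified block-wise 8-bit quantization sketch.
# Illustrative only: names and BLOCK_SIZE are assumptions, not llama.cpp's API.
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed number of weights per block in this sketch


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Map one block of floats to a scale plus integer codes in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        # An all-zero block needs no scale; every code is zero.
        return 0.0, [0] * len(values)
    scale = max_abs / 127.0
    # Round to the nearest code and clamp to guard against float drift.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate floats from a scale and its integer codes."""
    return [scale * c for c in codes]


def quantize(weights: List[float]) -> List[Tuple[float, List[int]]]:
    """Quantize a flat list of weights block by block."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks: List[Tuple[float, List[int]]]) -> List[float]:
    """Concatenate the dequantized blocks back into one flat list."""
    restored: List[float] = []
    for scale, codes in blocks:
        restored.extend(dequantize_block(scale, codes))
    return restored
```

Each block keeps its own scale, so a few large weights in one block do not reduce the precision of the others. The lower-bit variants offered as GGUF files follow the same general idea with fewer bits per code and more elaborate block layouts.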