TheBloke/koala-7B-GGML

TheBloke committed on May 12, 2023

Commit cd66b60 • 1 Parent(s): 92cfc84

Update README.md

Files changed (1)

README.md +10 -2

README.md CHANGED

@@ -23,13 +23,21 @@ I have the following Koala model repositories available:
 **13B models:**
 * [Unquantized 13B model in HF format](https://huggingface.co/TheBloke/koala-13B-HF)
 * [GPTQ quantized 4bit 13B model in `pt` and `safetensors` formats](https://huggingface.co/TheBloke/koala-13B-GPTQ-4bit-128g)
-* [GPTQ quantized 4bit 13B model in GGML format for `llama.cpp`](https://huggingface.co/TheBloke/koala-13B-GPTQ-4bit-128g-GGML)
 **7B models:**
 * [Unquantized 7B model in HF format](https://huggingface.co/TheBloke/koala-7B-HF)
 * [Unquantized 7B model in GGML format for llama.cpp](https://huggingface.co/TheBloke/koala-7b-ggml-unquantized)
 * [GPTQ quantized 4bit 7B model in `pt` and `safetensors` formats](https://huggingface.co/TheBloke/koala-7B-GPTQ-4bit-128g)
-* [GPTQ quantized 4bit 7B model in GGML format for `llama.cpp`](https://huggingface.co/TheBloke/koala-7B-GPTQ-4bit-128g-GGML)
 ## How to run in `llama.cpp`

 **13B models:**
 * [Unquantized 13B model in HF format](https://huggingface.co/TheBloke/koala-13B-HF)
 * [GPTQ quantized 4bit 13B model in `pt` and `safetensors` formats](https://huggingface.co/TheBloke/koala-13B-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 13B model in GGML format for `llama.cpp`](https://huggingface.co/TheBloke/koala-13B-GGML)
 **7B models:**
 * [Unquantized 7B model in HF format](https://huggingface.co/TheBloke/koala-7B-HF)
 * [Unquantized 7B model in GGML format for llama.cpp](https://huggingface.co/TheBloke/koala-7b-ggml-unquantized)
 * [GPTQ quantized 4bit 7B model in `pt` and `safetensors` formats](https://huggingface.co/TheBloke/koala-7B-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 7B model in GGML format for `llama.cpp`](https://huggingface.co/TheBloke/koala-7B-GGML)
+## REQUIRES LATEST LLAMA.CPP (May 12th 2023 - commit b9fd7ee)!
+llama.cpp recently made a breaking change to its quantisation methods.
+I have re-quantised the GGML files in this repo. Therefore you will require llama.cpp compiled on May 12th or later (commit `b9fd7ee` or later) to use them.
+The previous files, which will still work in older versions of llama.cpp, can be found in branch `previous_llama`.
 ## How to run in `llama.cpp`