## What?
It's the model from https://huggingface.co/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g, converted for use in the latest llama.cpp release.
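As a rough usage sketch once you have the quantized file: it can be run with llama.cpp's `main` binary from the repo root. The model filename, thread count, and prompt below are assumptions, and exact flag names can vary between llama.cpp releases (check `./main --help`).

```bash
# Hedged example: run the quantized model with llama.cpp's `main` binary.
# The model path, -t (threads), -n (tokens to generate) and prompt are
# placeholders; adjust them to your setup and release.
./main \
  -m gpt4-x-alpaca-13b-ggml-q4_0.bin \
  -t 8 \
  -n 256 \
  -p "Below is an instruction that describes a task. Write a response that completes the request."
```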
## Why?
Update: They made yet another breaking change of the same nature here, so I repeated the same procedure and reuploaded the result.
Starting with this PR, the llama.cpp team made a breaking change to the GGML file format, so all GGML versions of models created before it are no longer supported. I simply redid the conversion from the non-GGML model using the latest conversion scripts and posted the result here.
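If you are unsure which format revision a given `.bin` file uses, one quick way to check is to dump its first bytes and compare the header magic with what your llama.cpp build expects. The filename below is just a placeholder for whichever file you want to inspect.

```bash
# Dump the first 8 bytes of a GGML model file; the leading magic value
# differs between llama.cpp format revisions, so an old file will show a
# different header than one produced by the current conversion scripts.
head -c 8 gpt4-x-alpaca-13b-ggml-q4_0.bin | xxd
```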
## How?
- Clone the llama.cpp repo and `cd` into that folder;
- Download the latest release from the same page and extract it there;
- Take this file and put it into the `models/gpt-x-alpaca-13b-native-4bit-128g` folder;
- Add this in the same folder;
- Add all the other small files from the repository to the same folder;
- Run `python3 ./convert-pth-to-ggml.py models/gpt-x-alpaca-13b-native-4bit-128g 1`;
- Run `./quantize models/gpt-x-alpaca-13b-native-4bit-128g/ggml-model-f16.bin gpt4-x-alpaca-13b-ggml-q4_0.bin q4_0`.
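The same procedure, consolidated into a single shell sketch. The repository URL, folder layout, and file locations are assumptions you will likely need to adjust to where you actually downloaded the weights and companion files.

```bash
#!/usr/bin/env bash
# Sketch of the conversion steps above, under assumed paths.
set -euo pipefail

MODEL_DIR=models/gpt-x-alpaca-13b-native-4bit-128g

# 1. Clone llama.cpp and enter it. The prebuilt release binaries (or a local
#    build of `quantize`) are assumed to be extracted/built in this folder.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# 2. Place the model weights plus the small companion files (tokenizer,
#    params, etc.) into the model folder before converting.
mkdir -p "$MODEL_DIR"
# cp /path/to/downloaded/files/* "$MODEL_DIR"/   # adjust to your download location

# 3. Convert the PyTorch checkpoint to GGML f16 (the trailing "1" selects f16).
python3 ./convert-pth-to-ggml.py "$MODEL_DIR" 1

# 4. Quantize to 4-bit. The convert script writes its f16 output inside the
#    model folder, so point quantize there.
./quantize "$MODEL_DIR"/ggml-model-f16.bin gpt4-x-alpaca-13b-ggml-q4_0.bin q4_0
```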