Possible Loading Error with GPT4All
It may just be on my end, but this model doesn't load for me in GPT4All, which reports an "invalid format" error. I've tried about a dozen other recent GGUF models and they all work.
Note: I tried both the Q4_0 and Q4_K_M versions.
Same problem here. candle tells me it's in GGUF version 3, but I assume version 2 is expected.
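For anyone who wants to double-check without candle: the GGUF header is just a 4-byte magic (`GGUF`) followed by a little-endian uint32 version, so a few lines of Python can read it. A minimal sketch, assuming a standard GGUF file; `model.gguf` is a placeholder path:

```python
import struct

# Minimal sketch: read the GGUF magic and version from a file header.
# "model.gguf" is a placeholder path for whatever file you downloaded.
with open("model.gguf", "rb") as f:
    magic = f.read(4)                            # should be b"GGUF"
    (version,) = struct.unpack("<I", f.read(4))  # little-endian uint32

print(magic, version)  # e.g. b'GGUF' 3 for the newer files
```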
Bit off topic: what's the performance difference compared to llama.cpp? @Phil337
I don't know much about how all this works. From what I've read, GPT4All uses a fork of llama.cpp. Whenever I come across a bad response I test it against the full unquantized version online, and it gives comparable output and makes the same mistakes, so there's little difference in terms of hallucinations and other errors. As for speed, I get about 3 tokens/second using 3 cores at 2.7 GHz with AVX2 CPU instructions. @qm9
Update: It seems all the new TheBloke GGUF models are now incompatible with GPT4All. Earlier releases all worked (such as Dolphin 2.1 and Open Hermes 2), but subsequent releases don't (I just tested AshhLimaRP). Perhaps there was a recent change to the standard, and the newer models will work with a future version of GPT4All.
There's actually no real difference in the file format except the version bump as far as I know (unless you happen to be on a big-endian platform, which is unlikely), so you can just change the 5th byte (index 4) of the file from `0x03` to `0x02` to turn the GGUFv3 file into a GGUFv2 one. Some examples of how to edit a byte like that: https://stackoverflow.com/a/34524796
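For example, a minimal in-place patch in Python (a sketch, not battle-tested; `model.gguf` is a placeholder path, and it's safest to run it on a copy of the file):

```python
# Minimal sketch: rewrite the GGUF version field from 3 to 2 in place.
# Layout: bytes 0-3 are the magic b"GGUF", bytes 4-7 are a little-endian
# uint32 version, so the version's low byte sits at offset 4.
with open("model.gguf", "r+b") as f:
    assert f.read(4) == b"GGUF", "not a GGUF file"
    f.seek(4)
    version = f.read(1)[0]
    assert version == 3, f"expected version 3, found {version}"
    f.seek(4)
    f.write(b"\x02")  # bump version 3 down to 2
```

Writing `\x03` back at the same offset restores the original file, since nothing else changes.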
Thanks @KerfuffleV2, I bet you're right, since this LLM works with other apps, earlier Mistrals all worked in GPT4All, and it fails immediately, before even trying to load the LLM into RAM. So detecting an unsupported format (GGUFv3) and refusing to load it adds up.
"I bet you're right since this LLM works with other apps"
I actually checked the patch where the GGUF version changed to 3 to make sure as well, so I was pretty confident about it.
I just looked at the GPT4All project and it seems like they've already pulled in the changes for supporting GGUFv3 files, so if you build from the main branch, GGUFv3 models should work. Otherwise, wait for them to cut a release (or use the version-byte workaround I suggested above). The good news is it shouldn't be a pain point for much longer.
What version of GPT4All are you using?
OK, thanks. Good to know, because I was just about to download the model and I'm on the same version of GPT4All. :)