Uploaded GGUF and exl2 as Phi 3.1

#80 opened by bartowski

The change in performance is so huge that you're really doing yourselves a disservice by not renaming it! It may get swept under the rug because people will assume you just updated the README.

I've uploaded GGUF and EXL2 here as Phi 3.1:

https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF

https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-exl2
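
For anyone who wants to try one of the GGUFs quickly, here's a minimal sketch using huggingface_hub and llama-cpp-python. The exact .gguf filename is an assumption on my part; check the repo for the quant level you actually want:

```python
# Minimal sketch: pull one quant from the repo above and run a chat prompt.
# The filename below is an assumption; substitute whichever quant you prefer.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="bartowski/Phi-3.1-mini-4k-instruct-GGUF",
    filename="Phi-3.1-mini-4k-instruct-Q4_K_M.gguf",  # assumed filename
)

# 4k context, matching the model name.
llm = Llama(model_path=model_path, n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what changed in the Phi-3 mini update."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```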

Looks like they bumped the mini-128k too.

Yeah, sadly 128k still isn't supported in llama.cpp :(

NotImplementedError: The rope scaling type longrope is not supported yet

It's possible to create them, but in practice they'd behave the same as the 4k model, since the long-context rope scaling wouldn't actually be applied.
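
For context, the conversion trips over the rope_scaling block in the 128k model's config.json. A rough sketch of the kind of check involved, with the supported-types set being my assumption about what the converter handled at the time (only the field names come from the Phi-3 configs):

```python
# Illustrative sketch of why converting the 128k model fails:
# the config now declares a rope scaling type the converter doesn't know.
import json

with open("Phi-3-mini-128k-instruct/config.json") as f:  # assumed local path
    config = json.load(f)

rope_scaling = config.get("rope_scaling") or {}
rope_type = rope_scaling.get("type")

SUPPORTED = {"linear", "yarn"}  # assumption: types handled at the time
if rope_type not in SUPPORTED:
    raise NotImplementedError(f"The rope scaling type {rope_type} is not supported yet")
```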

Thought this had been sorted... https://github.com/ggerganov/llama.cpp/pull/7225

See, I thought it had too, thank you for finding that. Looking at the changelog, they may have changed it to a new rope method :') It used to be a regular rope with a short factor and a long factor; now it's their new longrope...
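
To make the before/after a bit more concrete, here's a rough sketch of how the short-factor/long-factor scheme picks its per-dimension scaling, based on my reading of the transformers Phi-3 code rather than anything authoritative; the 4k/128k context sizes are taken from the model cards and everything else is illustrative:

```python
# Rough sketch (my understanding, not the exact transformers code) of the
# short_factor/long_factor rope scheme the 128k model uses. This selection
# is exactly the part a plain-rope GGUF conversion would not apply.
import math

def pick_rope_factors(seq_len, rope_scaling, original_max_pos=4096, max_pos=131072):
    # Within the original 4k window, use the "short" factors;
    # beyond it, switch to the "long" factors.
    if seq_len <= original_max_pos:
        factors = rope_scaling["short_factor"]
    else:
        factors = rope_scaling["long_factor"]

    # Attention is additionally rescaled so logits stay well-behaved
    # when the context is stretched from 4k out to 128k.
    scale = max_pos / original_max_pos
    if scale <= 1.0:
        attn_scale = 1.0
    else:
        attn_scale = math.sqrt(1 + math.log(scale) / math.log(original_max_pos))
    return factors, attn_scale
```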

All the more important to distinguish between the versions, then.
