internlm2-limarp-chat-20b.Q4_K_S_imx.gguf vs internlm2-limarp-chat-20b.Q4_K_S.gguf
Hello!
Really like this model!
Can you please explain the difference between the IMX and regular GGUFs?
I've googled it, but couldn't find anything useful.
Thank you in advance!
Hello! I'm glad you like my model.
IMX here refers to quantizations done using the recent imatrix (importance matrix) feature from llama.cpp. They should perform slightly better than the regular quantizations while staying the same size; I've also put a rough sketch of how they're made below the links.
You can read more about this feature in these pull requests:
https://github.com/ggerganov/llama.cpp/pull/4861
https://github.com/ggerganov/llama.cpp/pull/4930
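For reference, the imatrix quants are made in two steps: first compute an importance matrix from the FP16 model over some calibration text, then pass that matrix to the quantizer. A minimal sketch using the llama.cpp CLI tools; the model/file names and calibration text here are just placeholders, not the exact files I used:

```sh
# 1) Compute an importance matrix from the FP16 model and a calibration text
#    (calibration.txt is a placeholder; any representative plain-text corpus works)
./imatrix -m internlm2-limarp-chat-20b.f16.gguf -f calibration.txt -o imatrix.dat

# 2) Quantize to Q4_K_S, passing the importance matrix to guide the quantization
./quantize --imatrix imatrix.dat \
    internlm2-limarp-chat-20b.f16.gguf \
    internlm2-limarp-chat-20b.Q4_K_S_imx.gguf Q4_K_S

# The regular Q4_K_S is the same command without --imatrix
./quantize internlm2-limarp-chat-20b.f16.gguf \
    internlm2-limarp-chat-20b.Q4_K_S.gguf Q4_K_S
```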
Thank you for the answer! Got it! Sounds like a nice improvement over the old GGUFs!