a few interesting models
would you consider making 2 bit and 1.5 bit quants of:
https://huggingface.co/deepnight-research/Saily_220B
https://huggingface.co/quantumaikr/falcon-180B-WizardLM_Orca
I'm doing Saily 100B now, but that's pretty much the last one I'm gonna do.
@dranger003
, please make the GGUF version of this model. It seems so promising. https://huggingface.co/Wtzwho/Prometh-222B
As it's a 222B model, please share the IQ1_S GGUF of it too.
An importance-matrix (imatrix) quantization of Prometh-222B would be fantastic. As a noob (this would be my first imatrix quantization), I gave it a shot on a RunPod pod, but I kept running into the error "ggml_new_object: not enough space in the context's memory pool", even though my pod had 400GB of VRAM and 880GB of RAM. Maybe @dranger003 can pull it off.
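For anyone else trying this, the usual llama.cpp imatrix workflow looks roughly like this. This is a sketch, not the exact commands used here: the file names are placeholders, and depending on the llama.cpp build the binaries may be named `imatrix`/`quantize` or `llama-imatrix`/`llama-quantize`:

```shell
# Sketch of the llama.cpp importance-matrix workflow. Assumes llama.cpp is
# already built; GGUF and calibration file names below are placeholders.

# 1. Compute the importance matrix from an F16 GGUF and a calibration text.
#    -ngl offloads layers to GPU to speed things up, but activation stats
#    are still accumulated host-side, so lots of system RAM is needed too.
./imatrix -m prometh-222b-f16.gguf -f calibration.txt -o imatrix.dat -ngl 40

# 2. Quantize with the importance matrix, e.g. down to IQ1_S.
./quantize --imatrix imatrix.dat prometh-222b-f16.gguf prometh-222b-iq1_s.gguf IQ1_S
```

The "not enough space in the context's memory pool" error tends to come from the tooling's context allocation rather than from total VRAM/RAM, so throwing more hardware at it doesn't necessarily help.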
A 222B model, really? Sounds like I need to get a new mortgage... let me see what I can do, no promises.