4bit safetensor file triton or cuda?

#1
by RiggityWrckd - opened

Hi, I was wondering whether anyone has tried the 4bit safetensor file on triton yet. Was it quantized with triton or with cuda? My setup is all triton right now. This looks like a really cool model. Thanks for putting it together :)

Edit: I answered my own question. I downloaded the 4bit.safetensor, and it works on my triton branch textgen install with no groupsize set. Hope this helps someone out there.
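In case anyone wants to peek inside a file like this before loading it, here is a rough sketch that reads only the JSON header of a safetensors file (an 8-byte little-endian length followed by the JSON) and counts tensor names by suffix. The function names are made up for illustration, and the suffixes `qweight`, `qzeros`, `scales` and `g_idx` are an assumption based on what GPTQ-style checkpoints commonly contain. The counts alone won't tell you for certain whether a file was quantized with triton or cuda, so treat it as a sanity check and nothing more.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict (no tensor data is loaded)."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian unsigned 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_quantized_tensors(header):
    """Count tensor names ending in suffixes that GPTQ-style checkpoints often use (assumed)."""
    # "__metadata__" is an optional free-form entry, not a tensor.
    names = [k for k in header if k != "__metadata__"]
    suffixes = ("qweight", "qzeros", "scales", "g_idx")
    return {s: sum(1 for n in names if n.endswith("." + s)) for s in suffixes}
```

Calling `summarize_quantized_tensors(read_safetensors_header("4bit.safetensor"))` on the downloaded file would give a small dict of counts. If every count is zero, the tensors are probably named differently than assumed here.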

Apologies for the late reply; I'm glad you found the answer, thank you for sharing!
