4-bit GPTQ quant of https://huggingface.co/chavinlo/gpt4-x-alpaca.

There is already one at https://huggingface.co/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g, but neither the triton nor the cuda version uploaded there seems to work on older versions of GPTQ-for-LLaMA, such as the one currently used for 4-bit support in KoboldAI via 0cc4m's fork.

This model was quantized with the cuda branch, not triton.
```sh
python llama.py ./gpt4-x-alpaca c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors gpt-x-alpaca-13b-native-4bit-128g-cuda.safetensors
```
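
As a quick sanity check after quantization, the resulting .safetensors file can be inspected with the `safetensors` library. This is only a minimal sketch: the file name is taken from the command above, and the `qweight`/`qzeros`/`scales` tensor names are the layout GPTQ-for-LLaMA normally writes for quantized linear layers, so adjust if your checkout differs.

```python
# Minimal sketch: peek inside the quantized checkpoint produced above.
# Assumes the output file name from the quantization command and the usual
# GPTQ-for-LLaMA layout (packed qweight/qzeros/scales tensors per linear layer).
from safetensors import safe_open

path = "gpt-x-alpaca-13b-native-4bit-128g-cuda.safetensors"

with safe_open(path, framework="pt", device="cpu") as f:
    qweights = [k for k in f.keys() if k.endswith("qweight")]
    print(f"{len(qweights)} quantized linear layers found")
    for name in qweights[:3]:
        t = f.get_tensor(name)
        # 4-bit weights are packed 8 per int32, so one dimension is 1/8 of the full size.
        print(name, tuple(t.shape), t.dtype)
```

If the keys load cleanly here, the file itself is intact; whether it loads in a given GPTQ-for-LLaMA commit still depends on the branch (cuda vs. triton) it was quantized with, as noted above.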