Quantized this model to 4-bit with group size 128 using the following command:
CUDA_VISIBLE_DEVICES=0 python llama.py ehartford/WizardLM-13B-Uncensored c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors 4bit-128g.safetensors
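
For readers unfamiliar with the `--wbits 4` and `--groupsize 128` flags, the sketch below illustrates what group-wise 4-bit quantization means: every block of 128 weights gets its own scale and offset, and each weight is stored as an integer code from 0 to 15. This is an assumption-laden illustration only. It uses plain round-to-nearest, not the error-compensating procedure the `llama.py` script presumably applies, and the function names `quantize_groups` and `dequantize_groups` are hypothetical, not part of that script.

```python
import math


def quantize_groups(weights, wbits=4, groupsize=128):
    """Round-to-nearest quantization with one (scale, offset) pair per group.

    Illustrative only: not the algorithm used by the quantization script.
    """
    levels = 2 ** wbits - 1  # 15 distinct steps for 4-bit codes
    groups = []
    for start in range(0, len(weights), groupsize):
        chunk = weights[start:start + groupsize]
        lo, hi = min(chunk), max(chunk)
        # Fall back to 1.0 so a constant group does not divide by zero.
        scale = (hi - lo) / levels or 1.0
        codes = [round((w - lo) / scale) for w in chunk]
        groups.append((scale, lo, codes))
    return groups


def dequantize_groups(groups):
    """Rebuild approximate float weights from the per-group codes."""
    return [lo + scale * c for scale, lo, codes in groups for c in codes]


# Toy data standing in for one row of a weight matrix.
weights = [math.sin(i) for i in range(256)]
groups = quantize_groups(weights)
restored = dequantize_groups(groups)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"{len(groups)} groups, max reconstruction error {max_error:.4f}")
```

A smaller group size gives each scale fewer weights to cover, which generally lowers the reconstruction error at the cost of storing more scales and offsets.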