Quantized version of ausboss/llama-30b-supercot: https://huggingface.co/ausboss/llama-30b-supercot
Quantized with GPTQ using https://github.com/0cc4m/GPTQ-for-LLaMa for compatibility with 0cc4m's fork of KoboldAI.
This version is quantized without groupsize to save VRAM, so with 24GB of VRAM you can use the full 2048-token max context (or at least get a lot closer to it than with the groupsize version).
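To make the VRAM trade-off concrete, here is a rough back-of-the-envelope sketch of how much extra space per-group quantization metadata can take. Everything numeric in it is an assumption for illustration, not a measurement of this model: the parameter count (about 32.5 billion for LLaMA "30B"), a 16-bit scale and a 4-bit zero point stored per group, and the simplification that every weight is quantized. The function name `quantized_weight_gib` is hypothetical.

```python
def quantized_weight_gib(n_params, wbits=4, groupsize=None,
                         scale_bits=16, zero_bits=4):
    """Rough size of quantized weights in GiB.

    Assumes one scale and one zero point per group of `groupsize` weights.
    Without groupsize, the per-row metadata is treated as negligible.
    """
    bits_per_weight = wbits
    if groupsize is not None:
        # Amortize the per-group metadata over the weights in the group.
        bits_per_weight += (scale_bits + zero_bits) / groupsize
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 32.5e9  # assumed parameter count, not taken from the model card

no_group = quantized_weight_gib(N_PARAMS)
group_128 = quantized_weight_gib(N_PARAMS, groupsize=128)

print(f"no groupsize : {no_group:.2f} GiB")
print(f"groupsize 128: {group_128:.2f} GiB")
print(f"difference   : {group_128 - no_group:.2f} GiB")
```

Under these assumptions the groupsize version needs a bit more than half a GiB of additional weight storage. On a 24GB card that is memory no longer available for the context cache, which is consistent with the reasoning above, though real usage depends on the loader and runtime overhead.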
Command used to quantize:
python llama.py c:\llama-30b-supercot c4 --wbits 4 --act-order --true-sequential --save_safetensors 4bit.safetensors
Evaluation scores (lower is better):
- WikiText2: 4.66
- PTB: 17.64
- C4: 6.50
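The card does not name the metric, but scores like these from GPTQ evaluation scripts are usually perplexities, which would explain why lower is better. Assuming that is the case here, the sketch below shows how perplexity is derived from per-token log-probabilities. The `perplexity` function and the sample inputs are hypothetical and are not taken from the quantization tooling.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that spreads probability evenly over 4 choices at every step.
uniform4 = [math.log(0.25)] * 4

# A model that assigns probability 0.5 to each correct token.
more_confident = [math.log(0.5)] * 4

print(perplexity(uniform4))
print(perplexity(more_confident))
```

A model that gives the correct tokens higher probability gets a lower perplexity. Under this reading, a score of 4.66 would mean the model is on average about as uncertain as if it were choosing uniformly among roughly 4.66 tokens at each step.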
Groupsize version is here: https://huggingface.co/tsumeone/llama-30b-supercot-4bit-128g-cuda