Information

OpenAssistant-Llama-13B quantized to 4-bit, compatible with the GPTQ implementations used in Oobabooga's Text Generation WebUI and KoboldAI.
This was made using Serpdotai's Open Assistant 13B LoRA, trained for 4 epochs on Open Assistant's dataset.

python llama.py /KoboldAI/repos/gptq/llama13b-oasst-4-epochs-lora c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors llama13b-oasst-4-epochs-lora-4bit-128g.safetensors
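The command above produces the 4-bit, group size 128 safetensors file. A sketch of how that file might then be loaded in Text Generation WebUI; the folder layout and the `--wbits`/`--groupsize` launch flags are assumptions based on common GPTQ usage, not something this card specifies:

```shell
# Assumption: the .safetensors file sits in its own model folder alongside
# the original tokenizer/config files, e.g.
#   models/llama13b-oasst-4-epochs-lora-4bit-128g/
# Launch the WebUI telling it the quantization parameters used above:
python server.py \
  --model llama13b-oasst-4-epochs-lora-4bit-128g \
  --wbits 4 \
  --groupsize 128
```

The `--wbits 4 --groupsize 128` values must match the flags used during quantization, otherwise the weights will not load correctly.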

Benchmarks

--true-sequential --groupsize 128

WikiText-2: 5.380471229553223

PTB-New: 31.921072006225586

C4-New: 7.140256881713867

Note: This version uses --groupsize 128, which yields better evaluation scores (lower perplexity). However, it consumes more VRAM than a version quantized without group size.
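The scores above are perplexities (lower is better). As a reminder of what that metric means, here is a minimal sketch of how perplexity is derived from per-token negative log-likelihoods; the numbers in the demo are toy values for illustration only, not from this model:

```python
import math

def perplexity(nlls, n_tokens):
    # Perplexity is the exponential of the average per-token
    # negative log-likelihood over the evaluation set.
    return math.exp(sum(nlls) / n_tokens)

# Toy per-token NLL values, purely illustrative:
nlls = [1.2, 1.7, 1.5, 1.8]
ppl = perplexity(nlls, len(nlls))
print(round(ppl, 3))
```

In a real evaluation, the NLL values come from running the model over a held-out corpus such as WikiText-2, so a score like 5.38 means the model is, on average, about as uncertain as a uniform choice over ~5.4 tokens at each step.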
