
Exl2 version of Undi95/Nethena-MLewd-Xwin-23B

branch

main : 3.8bpw h8

I checked that the main branch runs on a 24 GB GPU (tested on a Runpod 3090 server).
This time I quantized it with the pippa parquet made by Undi95, to test for differences between this dataset and wikitext.
I hope this version gives better results than the wikitext version I did before.
Quantization settings : python convert.py -i models/Undi95_Nethena-MLewd-Xwin-23B -o Nethena-MLewd-Xwin-23B-temp -cf Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2 -c pippa.parquet -l 4096 -b 3.8 -hb 8 -ml 4096
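
As a rough sanity check on the 24 GB figure above, the size of the quantized weights can be estimated from the parameter count and the bits per weight. This is only a back-of-the-envelope sketch: the 23e9 parameter count is assumed from the "23B" in the model name, the head layer (quantized at 8 bits here) is ignored, and the KV cache and activations for a 4096-token context need additional memory on top.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    total_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 2**30                    # bytes -> GiB


# Assumption: nominal 23e9 parameters, taken from the model name.
weights_gib = estimate_weight_gib(23e9, 3.8)
print(f"approx. weight size: {weights_gib:.1f} GiB")
```

Under these assumptions the weights come to roughly 10 GiB, which leaves room for the cache on a 24 GB card.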

Below this line is the original README.

Undi doing chemistry again.

Layers of Xwin-Mlewd were added in a different way than I did before. The result seems good, but I'm a VRAMlet so I can only run the Q2 at 2k context for now.

Need to see if it really works well or if I was just lucky with my prompt.

OG model : NeverSleep/Nethena-13B

Prompt template: Alpaca

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
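
A minimal sketch of filling in the Alpaca template above from Python. The names `ALPACA_TEMPLATE` and `build_prompt` are hypothetical helpers for illustration, not part of any library, and the exact whitespace around the section headers is an assumption based on the template as shown.

```python
# Alpaca-style template as shown above; {prompt} is the user instruction.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Insert the instruction into the template, trimming stray whitespace."""
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


text = build_prompt("Describe the tavern the party walks into.")
print(text)
```

The resulting string ends right after `### Response:`, so the model's generation continues from there.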

LimaRP is always kicking in, and thus this can be used to have more control over the size of the output.


Thanks Ikari.
