Exl2 version of Undi95/Nethena-MLewd-Xwin-23B
Branch
main: 3.8bpw h8
I verified that the main branch runs on a 24 GB GPU (tested on a RunPod 3090 server).
This time I quantized it with the PIPPA parquet file made by Undi95, to test for differences between this dataset and wikitext.
I hope this version gives better results than the wikitext version I made before.
Quantization settings: `python convert.py -i models/Undi95_Nethena-MLewd-Xwin-23B -o Nethena-MLewd-Xwin-23B-temp -cf Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2 -c pippa.parquet -l 4096 -b 3.8 -hb 8 -ml 4096`
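For reference, here is a minimal sketch of loading the resulting quant with the exllamav2 Python package; the local directory path and sampling values are placeholders I chose for illustration, not part of this repo.

```python
# Minimal loading sketch using the exllamav2 Python package.
# model_dir is assumed to be a local clone of this repo's main branch.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2"  # local path to the quant
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # cache allocated as layers load
model.load_autosplit(cache)                # split across available GPU memory

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8                 # example values, tune to taste
settings.top_p = 0.9

print(generator.generate_simple("Hello,", settings, 64))
```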
Below this line is the original readme.
Undi doing chemistry again.
The Xwin-MLewd layers were added in a different way than I did before. The results seem good, but I'm a VRAMlet, so I can only run the Q2 at 2k context for now.
Need to see if it really works well or if I was just lucky with my prompt.
OG model: NeverSleep/Nethena-13B
Prompt template: Alpaca
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{prompt}
### Response:
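For illustration, a small helper that assembles the Alpaca prompt shown above; the constant and function names are mine, not part of this repo.

```python
# Hypothetical helper that fills the Alpaca template from this card.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)

def build_prompt(prompt: str) -> str:
    """Return the full Alpaca-formatted prompt for a user instruction."""
    return ALPACA_TEMPLATE.format(prompt=prompt)

print(build_prompt("Write a short greeting."))
```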
LimaRP always kicks in, so it can be used to get more control over the length of the output.
Thanks Ikari.