Exl2 version of Undi95/Nethena-MLewd-Xwin-23B
Branch
main: 3.8bpw h8
I verified that the main branch runs on a 24 GB GPU (tested on a RunPod 3090 server).
This time I quantized it with the PIPPA parquet file made by Undi95, to test for differences between this dataset and wikitext.
I hope this version gives better results than the wikitext version I made before.
Quantization settings: `python convert.py -i models/Undi95_Nethena-MLewd-Xwin-23B -o Nethena-MLewd-Xwin-23B-temp -cf Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2 -c pippa.parquet -l 4096 -b 3.8 -hb 8 -ml 4096`
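For reference, here is a minimal sketch of loading the resulting quant with the exllamav2 Python package; the local directory path and sampling values are placeholders I chose for illustration, not part of this repo.

```python
# Minimal loading sketch using the exllamav2 Python package.
# model_dir is assumed to be a local clone of this repo's main branch.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2"  # local path to the quant
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # cache allocated as layers load
model.load_autosplit(cache)                # split across available GPU memory

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8                 # example values, tune to taste
settings.top_p = 0.9

print(generator.generate_simple("Hello,", settings, 64))
```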
Below this line is the original readme.
Undi doing chemistry again.
The Xwin-MLewd layers were added in a different way than I did before. The results seem good, but I'm a VRAMlet, so I can only run the Q2 at 2k context for now.
Need to see if it really works well or if I was just lucky with my prompt.
OG model: NeverSleep/Nethena-13B
Prompt template: Alpaca
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{prompt}
### Response:
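For illustration, a small helper that assembles the Alpaca prompt shown above; the constant and function names are mine, not part of this repo.

```python
# Hypothetical helper that fills the Alpaca template from this card.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)

def build_prompt(prompt: str) -> str:
    """Return the full Alpaca-formatted prompt for a user instruction."""
    return ALPACA_TEMPLATE.format(prompt=prompt)

print(build_prompt("Write a short greeting."))
```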
LimaRP always kicks in, so it can be used to get more control over the length of the output.
Thanks Ikari.