exl2 quants for ReMM V2.2

This repository contains exl2 quants of the ReMM V2.2 model by Undi. ReMM is a model merge that attempts to recreate MythoMax using the SLERP merging method and newer models.

Current models

exl2 Quant	Model Branch	Model Size	Minimum Recommended VRAM (4096 Context, fp16 cache)	BPW
3-Bit	main	5.44 GB	8GB GPU	3.14
3-Bit	3bit	6.36 GB	10GB GPU	3.72
4-Bit	4bit	7.13 GB	12GB GPU (10GB with swap)	4.2
4-Bit	4.6bit	7.81 GB	12GB GPU	4.63
5-Bit	R136a1's Repo	8.96 GB	16GB GPU (12GB with swap)	5.33
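As a rough aid for reading the table, the lookup below picks the highest-BPW quant whose recommended VRAM fits a given card. This is only a sketch: `QUANTS` and `pick_quant` are made-up names, the numbers are copied from the table above (4096 context, fp16 cache), the "with swap" options are ignored, and the 5-Bit branch is left as `None` because only the repo is listed.

```python
# Rows copied from the table above: (branch, repo, min recommended VRAM in GB, BPW).
# The names and tuple layout are illustrative, not part of any tool.
OWN_REPO = "Anthonyg5005/ReMM-v2.2-L2-13B-exl2"
QUANTS = [
    ("main", OWN_REPO, 8, 3.14),
    ("3bit", OWN_REPO, 10, 3.72),
    ("4bit", OWN_REPO, 12, 4.2),
    ("4.6bit", OWN_REPO, 12, 4.63),
    (None, "R136a1/ReMM-v2.2-L2-13B-exl2", 16, 5.33),  # branch not listed in the table
]


def pick_quant(vram_gb):
    """Return the highest-BPW row whose recommended VRAM fits, or None if nothing fits."""
    fitting = [row for row in QUANTS if row[2] <= vram_gb]
    return max(fitting, key=lambda row: row[3]) if fitting else None


print(pick_quant(12))  # the 4.6bit row
print(pick_quant(6))   # None: below the smallest recommendation
```

Treat the result as a starting point; longer contexts or other cache settings will change how much VRAM is actually needed.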

Where to use

An exl2 model can be used in several places. Here are a few:

tabbyAPI
Aphrodite Engine
ExUI
oobabooga's Text Gen Webui
- When using the downloader, format the model name like this: Anthonyg5005/ReMM-v2.2-L2-13B-exl2:QuantBranch
- For the 5-Bit quant, download from: R136a1/ReMM-v2.2-L2-13B-exl2
KoboldAI (clone the repo, don't use a snapshot)
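The `repo:QuantBranch` format mentioned for the webui downloader is just a repo name and a branch joined by a colon. The snippet below illustrates how such a string breaks apart. `split_spec` is a hypothetical helper written for this README, not the downloader's own parsing code.

```python
def split_spec(spec):
    """Split 'user/repo:branch' into (repo_id, branch); branch is None when absent."""
    repo_id, sep, branch = spec.partition(":")
    return repo_id, (branch if sep and branch else None)


# A quant branch from this repo, and the 5-Bit repo with no branch given.
print(split_spec("Anthonyg5005/ReMM-v2.2-L2-13B-exl2:3bit"))
print(split_spec("R136a1/ReMM-v2.2-L2-13B-exl2"))
```

The branch part is the same value that goes in the Model Branch column of the table.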

How to download:

oobabooga's downloader

Use something like download-model.py, which downloads with Python requests.
Install the requirements:

pip install requests tqdm

Example for downloading 3bpw:

python download-model.py Anthonyg5005/ReMM-v2.2-L2-13B-exl2:3bit

huggingface-cli

You may also use huggingface-cli.
To install it, install the huggingface-hub Python package:

pip install huggingface-hub

Example for 3bpw:

huggingface-cli download Anthonyg5005/ReMM-v2.2-L2-13B-exl2 --local-dir ReMM-v2.2-L2-13B-exl2-3bpw --revision 3bit
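If you would rather stay in Python, the same package exposes `snapshot_download`, which can fetch a single branch. The sketch below mirrors the CLI example. `download_kwargs` and `download_quant` are names invented here, and the folder naming is just one possible choice.

```python
REPO_ID = "Anthonyg5005/ReMM-v2.2-L2-13B-exl2"


def download_kwargs(branch):
    """Build arguments for snapshot_download that mirror the CLI example above."""
    return {
        "repo_id": REPO_ID,
        "revision": branch,  # same role as --revision
        "local_dir": f"ReMM-v2.2-L2-13B-exl2-{branch}",  # same role as --local-dir
    }


def download_quant(branch):
    """Download one quant branch and return the path of the local folder."""
    # Imported here so this file still loads without huggingface-hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(branch))


print(download_kwargs("3bit"))
```

Calling `download_quant("3bit")` would then start the actual download of several GB, so it is left out of the snippet.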

Git LFS (not recommended)

I would recommend the HTTP downloaders over git. They can resume failed downloads and are much easier to work with.
Make sure to have git and git LFS installed.
Example for 3bpw download with git:

Make sure LFS file skipping is disabled:

# windows (cmd)
set GIT_LFS_SKIP_SMUDGE=0
# linux
export GIT_LFS_SKIP_SMUDGE=0

Clone the repo branch:

git clone https://huggingface.co/Anthonyg5005/ReMM-v2.2-L2-13B-exl2 -b 3bit
