What Works

| Loader | Loading 1 LoRA | Loading 2 or more LoRAs | Training LoRAs | Multimodal extension | Perplexity evaluation |
|----------------|----|----|----|----|----|
| Transformers | ✅ | ✅*** | ✅* | ✅ | ✅ |
| ExLlama_HF | ✅ | ❌ | ❌ | ❌ | ✅ |
| ExLlamav2_HF | ✅ | ✅ | ❌ | ❌ | ✅ |
| ExLlama | ✅ | ❌ | ❌ | ❌ | use ExLlama_HF |
| ExLlamav2 | ✅ | ✅ | ❌ | ❌ | use ExLlamav2_HF |
| AutoGPTQ | ✅ | ❌ | ❌ | ✅ | ✅ |
| GPTQ-for-LLaMa | ✅** | ✅*** | ✅ | ✅ | ✅ |
| llama.cpp | ❌ | ❌ | ❌ | ❌ | use llamacpp_HF |
| llamacpp_HF | ❌ | ❌ | ❌ | ❌ | ✅ |
| ctransformers | ❌ | ❌ | ❌ | ❌ | ❌ |
| AutoAWQ | ? | ❌ | ? | ? | ✅ |

❌ = not implemented

✅ = implemented

* Training LoRAs with GPTQ models also works with the Transformers loader. Make sure to check "auto-devices" and "disable_exllama" before loading the model.

** Requires the monkey-patch. The instructions can be found here.

*** Multi-LoRA in PEFT is tricky and the current implementation does not work reliably in all cases.
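
For scripted checks, the table above can be transcribed into a small lookup structure. The following is only an illustrative sketch: `FEATURES`, `SUPPORT`, and `loaders_supporting` are hypothetical names that are not part of the web UI, and the footnote caveats (\*, \*\*, \*\*\*) are not encoded, so a `True` entry may still carry the conditions listed above.

```python
# Illustrative transcription of the support table; not part of the project.
FEATURES = (
    "Loading 1 LoRA",
    "Loading 2 or more LoRAs",
    "Training LoRAs",
    "Multimodal extension",
    "Perplexity evaluation",
)

# True = implemented, False = not implemented, None = marked "?" in the table.
# A string means the table points to another loader ("use ...").
# Footnote markers from the table are intentionally dropped here.
SUPPORT = {
    "Transformers":   (True, True, True, True, True),
    "ExLlama_HF":     (True, False, False, False, True),
    "ExLlamav2_HF":   (True, True, False, False, True),
    "ExLlama":        (True, False, False, False, "ExLlama_HF"),
    "ExLlamav2":      (True, True, False, False, "ExLlamav2_HF"),
    "AutoGPTQ":       (True, False, False, True, True),
    "GPTQ-for-LLaMa": (True, True, True, True, True),
    "llama.cpp":      (False, False, False, False, "llamacpp_HF"),
    "llamacpp_HF":    (False, False, False, False, True),
    "ctransformers":  (False, False, False, False, False),
    "AutoAWQ":        (None, False, None, None, True),
}


def loaders_supporting(feature):
    """Return the loaders whose cell for `feature` is a plain checkmark."""
    col = FEATURES.index(feature)
    return [name for name, row in SUPPORT.items() if row[col] is True]


if __name__ == "__main__":
    for feature in FEATURES:
        print(f"{feature}: {', '.join(loaders_supporting(feature)) or 'none'}")
```

Entries marked "?" or redirected to another loader are deliberately excluded from the result, since the table does not state that they work.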