Update README.md #2
opened by ShleemLeemTeem
README.md CHANGED

```diff
@@ -125,7 +125,7 @@ It was created without group_size to lower VRAM requirements, and with --act-ord
 
 * `wizardlm-33b-v1.0-uncensored-GPTQ-4bit--1g.act.order.safetensors`
 * Works with AutoGPTQ in CUDA or Triton modes.
-* LLaMa models also work with [ExLlama](https://github.com/turboderp/exllama
+* LLaMa models also work with [ExLlama](https://github.com/turboderp/exllama), which usually provides much higher performance, and uses less VRAM, than AutoGPTQ.
 * Works with GPTQ-for-LLaMa in CUDA mode. May have issues with GPTQ-for-LLaMa Triton mode.
 * Works with text-generation-webui, including one-click-installers.
 * Parameters: Groupsize = -1. Act Order / desc_act = True.
```