Update README.md
README.md CHANGED
@@ -114,15 +114,14 @@ print(tokenizer.decode(output[0]))
 
 **gptq_model-4bit-64g.safetensors**
 
-This will work with AutoGPTQ
+This will work with AutoGPTQ 0.2.0 and later.
 
 It was created with groupsize 64 to give higher inference quality, and without `desc_act` (act-order) to increase inference speed.
 
 * `gptq_model-4bit-64g.safetensors`
-  * Works
+  * Works with AutoGPTQ CUDA 0.2.0 and later.
   * At this time it does not work with AutoGPTQ Triton, but support will hopefully be added in time.
-  * Works with text-generation-webui using `--
-  * At this time it does NOT work with one-click-installers
+  * Works with text-generation-webui using `--trust-remote-code`
   * Does not work with any version of GPTQ-for-LLaMa
   * Parameters: Groupsize = 64. No act-order.
 