AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct-4bit-gptq



Ekgren committed on Dec 1, 2023

Commit 6f72481 • 1 Parent(s): cde3013

Create quantize_config.json

Files changed (1)

quantize_config.json +22 -0

quantize_config.json ADDED

	@@ -0,0 +1,22 @@

+{
+  "bits": 4,
+  "block_name_to_quantize": "transformer.h",
+  "damp_percent": 0.1,
+  "dataset": "c4",
+  "desc_act": false,
+  "disable_exllama": true,
+  "group_size": 128,
+  "max_input_length": null,
+  "model_seqlen": 2048,
+  "module_name_preceding_first_block": [
+    "transformer.wte",
+    "transformer.wpe",
+    "transformer.drop"
+  ],
+  "pad_token_id": null,
+  "quant_method": "gptq",
+  "sym": true,
+  "tokenizer": null,
+  "true_sequential": true,
+  "use_cuda_fp16": true
+}
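
For readers who want to sanity-check the values above, here is a minimal sketch that parses the same JSON with Python's standard library and estimates the storage cost per weight implied by `bits` and `group_size`. The helper `effective_bits_per_weight` is hypothetical and not part of this commit or of any library. It assumes each group of `group_size` weights also stores one 16-bit scale and one zero point of `bits` bits, which is a common GPTQ layout but is not stated in the config itself.

```python
import json

# Contents of quantize_config.json as added in this commit.
QUANTIZE_CONFIG = """
{
  "bits": 4,
  "block_name_to_quantize": "transformer.h",
  "damp_percent": 0.1,
  "dataset": "c4",
  "desc_act": false,
  "disable_exllama": true,
  "group_size": 128,
  "max_input_length": null,
  "model_seqlen": 2048,
  "module_name_preceding_first_block": [
    "transformer.wte",
    "transformer.wpe",
    "transformer.drop"
  ],
  "pad_token_id": null,
  "quant_method": "gptq",
  "sym": true,
  "tokenizer": null,
  "true_sequential": true,
  "use_cuda_fp16": true
}
"""


def effective_bits_per_weight(cfg, scale_bits=16, zero_bits=None):
    """Estimate bits stored per weight, including per-group metadata.

    Assumption (not stated in the commit): every group of `group_size`
    weights carries one scale of `scale_bits` bits and one zero point of
    `zero_bits` bits (defaults to the quantization width, cfg["bits"]).
    """
    bits = cfg["bits"]
    group_size = cfg["group_size"]
    if zero_bits is None:
        zero_bits = bits
    return bits + (scale_bits + zero_bits) / group_size


cfg = json.loads(QUANTIZE_CONFIG)

# JSON false/null map to Python False/None.
print(cfg["quant_method"], cfg["bits"], cfg["group_size"], cfg["desc_act"])

# Under the stated assumptions: 4 + (16 + 4) / 128 bits per weight.
print(effective_bits_per_weight(cfg))
```

Under these assumptions the per-group overhead is small (20 extra bits spread over 128 weights), which is one reason a group size of 128 is a frequent choice. A smaller group size would raise the overhead in exchange for finer-grained scales.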