dranger003/dbrx-instruct-iMat.GGUF

dranger003 committed on Apr 14

Commit f85aad8 (1 parent: f80614d)

Update README.md

Files changed (1)

README.md +1 -1

@@ -23,7 +23,7 @@ The quants here are meant to test imatrix quantized weights.
 Quants in this repo are tested running the following command (quants under IQ3 are very sensitive and unreliable so far - the imatrix may require to be trained on FP16 weights rather than Q8_0 and for longer than 200 chunks):
 ```
-./build/bin/main -ngl 41 -s 0 -e -p "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nWrite an essay about AI.<|im_end|>\n<|im_start|>assistant\n" -m ggml-dbrx-instruct-16x12b-<<quant-to-test>>.gguf
+./build/bin/main -ngl 41 -c 4096 -s 0 -e -p "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nWrite an essay about AI.<|im_end|>\n<|im_start|>assistant\n" -m ggml-dbrx-instruct-16x12b-<<quant-to-test>>.gguf
 ```
 * GGUF importance matrix (imatrix) quants for https://huggingface.co/databricks/dbrx-instruct
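
The only change in this commit is the added `-c 4096` flag, which sets the context size for the test run. For running the updated command against more than one quant, a small wrapper along the following lines may help. This is a minimal sketch, not something from the repo: the function name `run_quant`, the `DRY_RUN` switch, and the example quant name `iq4_xs` are assumptions made for illustration. The flags and prompt are copied from the new version of the command.

```shell
#!/bin/sh
# Prompt copied verbatim from the README command; single quotes keep the
# backslash-n sequences literal so that `-e` can expand them in main.
PROMPT='<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nWrite an essay about AI.<|im_end|>\n<|im_start|>assistant\n'

# Hypothetical helper: assemble the test command for one quant name.
# With DRY_RUN=1 (the default here) it only prints the command line.
run_quant() {
  quant="$1"
  set -- ./build/bin/main -ngl 41 -c 4096 -s 0 -e -p "$PROMPT" \
    -m "ggml-dbrx-instruct-16x12b-${quant}.gguf"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Example: print the command for an assumed quant name.
run_quant iq4_xs
```

Setting `DRY_RUN=0` executes the command instead of printing it. The printed form is unquoted, so it is meant for inspection rather than for pasting back into a shell.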