DiscoResearch/Llama3-DiscoLeo-8B-DARE-Experimental-4bit-awq


bjoernp committed on May 25

Commit 122a032 • 1 Parent(s): ff7a696

Update README.md

Files changed (1)

README.md +4 -1


@@ -7,7 +7,10 @@ library_name: transformers
 # Llama3_DiscoLeo_8B_DARE_Experimental_4bit_awq_glc
-This model is a 4 bit quantization of [DiscoResearch/Llama3_DiscoLeo_8B_DARE_Experimental](https://huggingface.co/DiscoResearch/Llama3_DiscoLeo_8B_DARE_Experimental).
 Copy of the original model card:

 # Llama3_DiscoLeo_8B_DARE_Experimental_4bit_awq_glc
+This model is a 4 bit quantization of [DiscoResearch/Llama3_DiscoLeo_8B_DARE_Experimental](https://huggingface.co/DiscoResearch/Llama3_DiscoLeo_8B_DARE_Experimental)
+created using [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) with a custom bilingual calibration dataset and `quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}`.
 Copy of the original model card:
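
To make the `quant_config` values in this change concrete, here is a minimal pure-Python sketch of what `w_bit: 4`, `q_group_size: 128`, and `zero_point: True` mean for a single weight row. This is an illustration under stated assumptions, not AutoAWQ's implementation: AWQ additionally rescales weights using activation statistics from the calibration dataset before quantizing, and the `"version": "GEMM"` kernel choice is not modeled here. The helper names (`quantize_group`, `dequantize_group`, `quantize_row`) and the random test weights are hypothetical.

```python
import random

# Values taken from the quant_config in the commit above.
W_BIT = 4
Q_GROUP_SIZE = 128


def quantize_group(weights, w_bit=W_BIT):
    """Asymmetric (zero-point) quantization of one group of float weights."""
    qmax = 2 ** w_bit - 1  # 15 for 4-bit weights
    w_min, w_max = min(weights), max(weights)
    # One scale per group; fall back to 1.0 if the group is constant.
    scale = (w_max - w_min) / qmax or 1.0
    # The zero point shifts the integer range so that w_min maps near 0.
    zero = max(0, min(qmax, round(-w_min / scale)))
    q = [max(0, min(qmax, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize_group(q, scale, zero):
    """Map 4-bit integers back to approximate float weights."""
    return [(v - zero) * scale for v in q]


def quantize_row(row, group_size=Q_GROUP_SIZE):
    """Split a row into groups, each with its own scale and zero point."""
    return [
        quantize_group(row[start:start + group_size])
        for start in range(0, len(row), group_size)
    ]


# Hypothetical weights: 256 values, so two groups of 128.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(256)]

groups = quantize_row(row)
restored = [x for q, s, z in groups for x in dequantize_group(q, s, z)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
max_scale = max(s for _, s, _ in groups)

print(f"groups: {len(groups)}, max abs error: {max_err:.6f}")
```

Each group of 128 weights stores its own scale and zero point, so the rounding error for a weight is bounded by about half of its group's scale. A smaller `q_group_size` would tighten that bound at the cost of storing more scales and zero points.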