Kameshr/LLAMA-3-Quantized

Text Generation

text-generation-inference

8-bit precision


Kameshr committed on May 3

Commit 2e22d0a • 1 Parent(s): a6a7513

Update README.md


README.md +1 -1


@@ -5,7 +5,7 @@ inference: false
 ## Model Details
 **Model Description:**
-This model is a 4-bit quantized version of the Meta Llama 3 - 8B Instruct large language model (LLM). Quantization reduces the model size and improves inference speed, making it suitable for deployment on devices with limited computational resources. The original LLAma3-Instruct 8B model is an autoregressive transformer-based LLM, trained on a massive dataset of text and code. It is fine-tuned for instruction following and excels in dialogue tasks.
+This model is a 8-bit quantized version of the Meta Llama 3 - 8B Instruct large language model (LLM). Quantization reduces the model size and improves inference speed, making it suitable for deployment on devices with limited computational resources. The original LLAma3-Instruct 8B model is an autoregressive transformer-based LLM, trained on a massive dataset of text and code. It is fine-tuned for instruction following and excels in dialogue tasks.
 ---
 ## Original README