Directly converted and quantized to GGUF with llama.cpp (release tag: b2843) from Meta's 'Meta-Llama-3' repo on Hugging Face.
This repo also includes the original LLaMA 3 model files cloned from the Meta HF repo (https://huggingface.co/meta-llama/Meta-Llama-3-8B).
If you have trouble downloading the models from Meta or converting them for llama.cpp, feel free to download this one!
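For reference, the conversion pipeline looks roughly like the sketch below. It assumes a clone of llama.cpp checked out at tag b2843 and already built, with the Meta HF repo downloaded locally; all paths and output file names here are illustrative.

```sh
# 1. Convert the HF checkpoint to an F16 GGUF file.
#    (./Meta-Llama-3-8B is an assumed local download of the Meta HF repo.)
python convert-hf-to-gguf.py ./Meta-Llama-3-8B \
    --outfile Meta-Llama-3-8B-F16.gguf --outtype f16

# 2. Quantize the F16 GGUF down to a smaller type, e.g. Q4_K_M.
./quantize Meta-Llama-3-8B-F16.gguf Meta-Llama-3-8B-Q4_K_M.gguf Q4_K_M
```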
Perplexity table for LLaMA 3 70B
Lower perplexity is better. (Credit: dranger003.)
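The Delta column appears to be the relative perplexity increase over the F16 baseline:

```latex
% Delta vs. F16: relative perplexity increase over the F16 baseline.
\[
\Delta = \frac{\mathrm{PPL}_{\text{quant}} - \mathrm{PPL}_{\text{F16}}}{\mathrm{PPL}_{\text{F16}}} \times 100\%
\]
% Example (Q4_K_M): (2.9674 - 2.8308) / 2.8308 \approx 4.83\%
```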
Quantization | Size (GiB) | Perplexity (wiki.test) | Delta vs. F16 |
---|---|---|---|
IQ1_S | 14.29 | 9.8655 +/- 0.0625 | 248.51% |
IQ1_M | 15.60 | 8.5193 +/- 0.0530 | 201.94% |
IQ2_XXS | 17.79 | 6.6705 +/- 0.0405 | 135.64% |
IQ2_XS | 19.69 | 5.7486 +/- 0.0345 | 103.07% |
IQ2_S | 20.71 | 5.5215 +/- 0.0318 | 95.05% |
Q2_K_S | 22.79 | 5.4334 +/- 0.0325 | 91.94% |
IQ2_M | 22.46 | 4.8959 +/- 0.0276 | 72.35% |
Q2_K | 24.56 | 4.7763 +/- 0.0274 | 68.73% |
IQ3_XXS | 25.58 | 3.9671 +/- 0.0211 | 40.14% |
IQ3_XS | 27.29 | 3.7210 +/- 0.0191 | 31.45% |
Q3_K_S | 28.79 | 3.6502 +/- 0.0192 | 28.95% |
IQ3_S | 28.79 | 3.4698 +/- 0.0174 | 22.57% |
IQ3_M | 29.74 | 3.4402 +/- 0.0171 | 21.53% |
Q3_K_M | 31.91 | 3.3617 +/- 0.0172 | 18.75% |
Q3_K_L | 34.59 | 3.3016 +/- 0.0168 | 16.63% |
IQ4_XS | 35.30 | 3.0310 +/- 0.0149 | 7.07% |
IQ4_NL | 37.30 | 3.0261 +/- 0.0149 | 6.90% |
Q4_K_S | 37.58 | 3.0050 +/- 0.0148 | 6.15% |
Q4_K_M | 39.60 | 2.9674 +/- 0.0146 | 4.83% |
Q5_K_S | 45.32 | 2.8843 +/- 0.0141 | 1.89% |
Q5_K_M | 46.52 | 2.8656 +/- 0.0139 | 1.23% |
Q6_K | 53.91 | 2.8441 +/- 0.0138 | 0.47% |
Q8_0 | 69.83 | 2.8316 +/- 0.0138 | 0.03% |
F16 | 131.43 | 2.8308 +/- 0.0138 | 0.00% |
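Numbers like these can be reproduced with llama.cpp's perplexity tool over the wikitext-2 test set; a minimal invocation (file names illustrative) looks like:

```sh
# Compute perplexity on wikitext-2; -ngl offloads layers to the GPU if available.
./perplexity -m Meta-Llama-3-70B-Q4_K_M.gguf -f wiki.test.raw -ngl 99
```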
Where to send questions or comments about the model
Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here.
License
See the License file for Meta Llama 3 here and the Acceptable Use Policy here.