Update README.md

Based on [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
For more details, please refer to [our blog post](https://note.com/elyza/n/n360b6084fdbd).
## Quantization
This model is quantized using the [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) library; a sketch of the typical AutoAWQ workflow appears after the table below.

We have prepared two quantized model options, GGUF and AWQ. The table below shows the performance degradation due to quantization.

| Model | ELYZA-tasks-100 GPT4 score |
| :-------------------------------- | ---: |
| Llama-3-ELYZA-JP-8B | 3.655 |
| Llama-3-ELYZA-JP-8B-GGUF (Q4_K_M) | 3.57 |
| Llama-3-ELYZA-JP-8B-AWQ | 3.39 |
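For reference, the sketch below shows the usual AutoAWQ quantization workflow. It is illustrative only: the base-model repository id and the `quant_config` values (4-bit weights, group size 128, zero-point, GEMM kernels) are AutoAWQ's common defaults, not the confirmed settings used to produce this model.

```python
# Illustrative AutoAWQ workflow (assumed defaults, not the exact recipe used here).
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "elyza/Llama-3-ELYZA-JP-8B"  # assumed base-model repository id
quant_path = "Llama-3-ELYZA-JP-8B-AWQ"    # local output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run activation-aware quantization; AutoAWQ calibrates on a default dataset.
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized weights and tokenizer for later loading.
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```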
## Use with vLLM
Install vLLM.
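A typical install is via pip (recent vLLM releases include AWQ kernel support):

```
pip install vllm
```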
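After installation, the AWQ checkpoint can be loaded with vLLM's offline `LLM` API. A minimal sketch, assuming the model is published as `elyza/Llama-3-ELYZA-JP-8B-AWQ` (check the model card for the actual repository id); the prompt and sampling settings are illustrative:

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Assumed repository id for the AWQ checkpoint.
model_id = "elyza/Llama-3-ELYZA-JP-8B-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
llm = LLM(model=model_id, quantization="awq")

# Format a single user turn with the model's chat template (Llama 3 instruct format).
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "仕事の熱意を取り戻すためのアイデアを5つ挙げてください。"}],
    tokenize=False,
    add_generation_prompt=True,
)

# Illustrative sampling settings.
sampling_params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=512)

outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```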