tyoyo committed d6ac9a3 (1 parent: 0796d05)

Update README.md

Files changed (1): README.md (+12 −0)
README.md CHANGED
@@ -19,6 +19,18 @@ Based on [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama
For more details, please refer to [our blog post](https://note.com/elyza/n/n360b6084fdbd).

## Quantization

We performed quantization using [llama.cpp](https://github.com/ggerganov/llama.cpp) and converted the model to GGUF format. Currently, we only offer a quantized model in the Q4_K_M format.

We provide two quantized variants, GGUF and AWQ. The table below shows the performance degradation caused by quantization, as measured by the ELYZA-tasks-100 GPT-4 score.

| Model                             | ELYZA-tasks-100 GPT-4 score |
| :-------------------------------- | --------------------------: |
| Llama-3-ELYZA-JP-8B               |                       3.655 |
| Llama-3-ELYZA-JP-8B-GGUF (Q4_K_M) |                        3.57 |
| Llama-3-ELYZA-JP-8B-AWQ           |                        3.39 |
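As a rough back-of-the-envelope sketch of what Q4_K_M quantization buys in memory, the snippet below compares an FP16 baseline against a ~4.85 bits/weight estimate for Q4_K_M. Both the parameter count (~8.03B, typical for Llama 3 8B) and the effective bit width are assumptions for illustration, not official figures from this repository.

```python
def gguf_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk model size in GiB for a given bit width."""
    return n_params * bits_per_weight / 8 / 1024**3

# Assumed values: ~8.03B parameters and ~4.85 effective bits/weight
# for Q4_K_M -- rough estimates, not official numbers.
N_PARAMS = 8.03e9

fp16_gib = gguf_size_gib(N_PARAMS, 16.0)   # unquantized baseline
q4km_gib = gguf_size_gib(N_PARAMS, 4.85)   # Q4_K_M estimate

print(f"FP16   ~ {fp16_gib:.1f} GiB")
print(f"Q4_K_M ~ {q4km_gib:.1f} GiB ({fp16_gib / q4km_gib:.1f}x smaller)")
```

Under these assumptions the quantized file comes out around 4.5 GiB versus roughly 15 GiB for FP16, about a 3.3x reduction, which is why a Q4_K_M GGUF fits comfortably on consumer hardware.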

## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)
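A minimal usage sketch: install llama.cpp via Homebrew, then let `llama-cli` pull the GGUF directly from the Hugging Face Hub. The `--hf-repo` and `--hf-file` values below follow ELYZA's Hugging Face naming and are assumptions, not verified against the repository's actual file listing.

```shell
# Install llama.cpp (macOS and Linux)
brew install llama.cpp

# Run the Q4_K_M model straight from the Hugging Face Hub.
# Repo and file names are assumptions based on ELYZA's naming scheme.
llama-cli \
  --hf-repo elyza/Llama-3-ELYZA-JP-8B-GGUF \
  --hf-file Llama-3-ELYZA-JP-8B-q4_k_m.gguf \
  -p "Hello" -n 128
```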