elyza/Llama-3-ELYZA-JP-8B-AWQ


passaglia committed on Jun 25

Commit 8ea5584 (1 parent: 52f5b9f)

Update README.md

Files changed (1)

README.md +5 -8


@@ -21,7 +21,7 @@ For more details, please refer to [our blog post](https://note.com/elyza/n/n360b
 We have prepared two quantized model options, GGUF and AWQ. This is the [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) model.
-Here is a table showing the performance degradation due to quantization.
+The following table shows the performance degradation due to quantization:
 | Model | ELYZA-tasks-100 GPT4 score |
 | :-------------------------------- | ---: |
@@ -31,7 +31,7 @@ Here is a table showing the performance degradation due to quantization.
 ## Use with vLLM
-Install vLLM.
+Install vLLM:
 ```bash
 pip install vllm
@@ -74,8 +74,7 @@ for output in outputs:
 ### vLLM OpenAI Compatible Server
-Start the API server.
+Start the API server:
 ```bash
 python -m vllm.entrypoints.openai.api_server \
 --model elyza/Llama-3-ELYZA-JP-8B-AWQ \
@@ -85,8 +84,7 @@ python -m vllm.entrypoints.openai.api_server \
 ```
-Call the API using curl.
+Call the API using curl:
 ```bash
 curl http://localhost:8000/v1/chat/completions \
 -H "Content-Type: application/json" \
@@ -102,8 +100,7 @@ curl http://localhost:8000/v1/chat/completions \
 }'
 ```
-Call the API using Python.
+Call the API using Python:
 ```python
 import openai