Update README.md
README.md
CHANGED
@@ -21,7 +21,7 @@ For more details, please refer to [our blog post](https://note.com/elyza/n/n360b
 
 We have prepared two quantized model options, GGUF and AWQ. This is the [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) model.
 
-Here is a table showing the performance degradation due to quantization.
+The following table shows the performance degradation due to quantization:
 
 | Model | ELYZA-tasks-100 GPT4 score |
 | :-------------------------------- | ---: |
@@ -31,7 +31,7 @@ Here is a table showing the performance degradation due to quantization.
 
 ## Use with vLLM
 
-Install vLLM
+Install vLLM:
 
 ```bash
 pip install vllm
@@ -74,8 +74,7 @@ for output in outputs:
 
 ### vLLM OpenAI Compatible Server
 
-Start the API server
-
+Start the API server:
 ```bash
 python -m vllm.entrypoints.openai.api_server \
 --model elyza/Llama-3-ELYZA-JP-8B-AWQ \
@@ -85,8 +84,7 @@ python -m vllm.entrypoints.openai.api_server \
 ```
 
 
-Call the API using curl
-
+Call the API using curl:
 ```bash
 curl http://localhost:8000/v1/chat/completions \
 -H "Content-Type: application/json" \
@@ -102,8 +100,7 @@ curl http://localhost:8000/v1/chat/completions \
 }'
 ```
 
-Call the API using Python
-
+Call the API using Python:
 ```python
 import openai
 