passaglia committed on
Commit
8ea5584
1 Parent(s): 52f5b9f

Update README.md

  1. README.md +5 -8
README.md CHANGED
@@ -21,7 +21,7 @@ For more details, please refer to [our blog post](https://note.com/elyza/n/n360b
 
  We have prepared two quantized model options, GGUF and AWQ. This is the [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) model.
 
- Here is a table showing the performance degradation due to quantization.
+ The following table shows the performance degradation due to quantization:
 
  | Model | ELYZA-tasks-100 GPT4 score |
  | :-------------------------------- | ---: |
@@ -31,7 +31,7 @@ Here is a table showing the performance degradation due to quantization.
 
  ## Use with vLLM
 
- Install vLLM.
+ Install vLLM:
 
  ```bash
  pip install vllm
@@ -74,8 +74,7 @@ for output in outputs:
 
  ### vLLM OpenAI Compatible Server
 
- Start the API server.
-
+ Start the API server:
  ```bash
  python -m vllm.entrypoints.openai.api_server \
  --model elyza/Llama-3-ELYZA-JP-8B-AWQ \
@@ -85,8 +84,7 @@ python -m vllm.entrypoints.openai.api_server \
  ```
 
 
- Call the API using curl.
-
+ Call the API using curl:
  ```bash
  curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
@@ -102,8 +100,7 @@ curl http://localhost:8000/v1/chat/completions \
  }'
  ```
 
- Call the API using Python.
-
+ Call the API using Python:
  ```python
  import openai
 