Update README.md
We have tested (and thus recommend) running this model on vLLM. We suggest serving it with vLLM's OpenAI-compatible API server, using the following commands:

```bash
pip install vllm
python -m vllm.entrypoints.openai.api_server --model lightblue/Karasu-Mixtral-8x22B-v0.1 --tensor-parallel-size 4 --gpu-memory-utilization 0.95 --max-model-len 1024
```

which is how we ran it on a 4 x A100 (80GB) machine.
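
Once the server is up, it exposes an OpenAI-compatible API (on port 8000 by default), so any OpenAI client can query it. Below is a minimal sketch using the official `openai` Python package (v1+); the base URL, the placeholder API key, and the sampling parameters are assumptions rather than part of the instructions above, and prompt plus completion must stay within the `--max-model-len 1024` limit set when launching the server.

```python
from openai import OpenAI

# Point the client at the local vLLM server (default port 8000 is an assumption;
# the API key is a placeholder unless the server was started with --api-key).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="lightblue/Karasu-Mixtral-8x22B-v0.1",
    messages=[{"role": "user", "content": "Hello! Please introduce yourself."}],
    max_tokens=256,   # keep prompt + completion under the 1024-token context limit
    temperature=0.7,
)

print(response.choices[0].message.content)
```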