Request for support on faster inference engine

#10
by solankibhargav - opened

This model is really great and I would love to use it. However, it's really slow with Transformers inference. Could you please add support, or a guide, for using it with lmdeploy, vllm, sglang, or mistral.rs (or any other faster inference engine)?

OpenGVLab org

Yes, you can use LMDeploy now. We also plan to support vLLM, but due to a shortage of personnel, it may not be available in the near future.
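
For anyone landing here, a minimal sketch of running a vision-language model with LMDeploy's `pipeline` API; the model ID `OpenGVLab/InternVL2-8B` and the image URL below are placeholders, so substitute the repo this discussion belongs to and your own input:

```python
# Assumes: pip install lmdeploy
# "OpenGVLab/InternVL2-8B" is a placeholder model ID; replace with this repo's ID.
from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Build an inference pipeline; LMDeploy selects a fast backend (e.g. TurboMind)
# when the model architecture is supported.
pipe = pipeline('OpenGVLab/InternVL2-8B')

# Load an image from a URL or local path (placeholder URL here).
image = load_image('https://example.com/image.jpg')

# Run a single image-text query and print the generated text.
response = pipe(('Describe this image.', image))
print(response.text)
```

If you prefer HTTP serving, LMDeploy also ships an OpenAI-compatible server via `lmdeploy serve api_server <model>`.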

czczup changed discussion status to closed
