Locally deployed models have poor performance. model:CodeLlama-34b-Instruct-hf
#18 · by nstl · opened
When CodeLlama-34b-Instruct-hf was deployed locally and inference was run with the default parameters, the results were noticeably worse than those from the Hugging Face online interface. Does the online inference apply any optimizations? How are the model parameters and prompt set?
Online interface: https://huggingface.co/chat/
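For context, a minimal local-inference sketch with `transformers` is shown below. The `[INST]`/`<<SYS>>` formatting follows the Llama-2 chat template that CodeLlama-Instruct was trained with; the sampling values (`temperature`, `top_p`) are assumptions for illustration, since the online interface's exact settings are not published here:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-34b-Instruct-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# CodeLlama-Instruct uses the Llama-2 chat format:
# [INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
# (the tokenizer prepends the BOS token automatically)
system = "Provide answers in Python."
user = "Write a function that reverses a linked list."
prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.2,  # assumed value, not the online interface's setting
    top_p=0.95,       # assumed value, not the online interface's setting
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(
    output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
))
```

Getting this template wrong (e.g. sending the raw question without the `[INST]` wrapper) is a common cause of locally deployed instruct models performing much worse than the hosted chat interface.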
nstl changed discussion title from "On-premises models perform poorly. model:CodeLlama-34b-Instruct-hf" to "Locally deployed models have poor performance. model:CodeLlama-34b-Instruct-hf"