How to get a GGML model running using FastChat on an M1 Mac?
#7 · opened by kkostecky
Hi there, can someone give me directions on how to get a GGML model like this running using FastChat on an M1 Mac? I have the regular Vicuna 7B and 13B models running, but these are not PyTorch files. Thanks!
FastChat doesn't support GGML as far as I know. You're gonna have to use either oobabooga or llama.cpp.
Also, since you're on an M1, make sure to get the q4_2 models. They're great on Apple Silicon.
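If you'd rather drive it from a script than the CLI, the llama-cpp-python bindings can load GGML files directly. Here's a minimal sketch, assuming you've installed the package with `pip install llama-cpp-python`; the model path and the prompt format are placeholders you'd swap for your own:

```python
# Minimal sketch using the llama-cpp-python bindings.
# The model path is a placeholder; point it at your own GGML file.
from llama_cpp import Llama

llm = Llama(model_path="./models/ggml-vicuna-13b-q4_2.bin")

# Run one completion. Vicuna-style models were tuned on the
# "### Human: / ### Assistant:" format, so we stop before the next turn.
output = llm(
    "### Human: What is the capital of France?\n### Assistant:",
    max_tokens=64,
    stop=["### Human:"],
)
print(output["choices"][0]["text"])
```

This is just one way to do it; running the llama.cpp `main` binary directly works the same way under the hood.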
Thanks. Yeah, I already have it running in llama.cpp with the q4_2 model.
OK, fair enough regarding FastChat. Thank you!
This might be of interest
https://github.com/oobabooga/text-generation-webui/wiki/llama.cpp-models