Please add to llama.cpp and ollama

#21
by KeilahElla - opened

As the title says: it would be great to use this with ollama/llama.cpp, which are usually much faster than transformers.

Not sure that claim is super accurate :) torch.compile can get you pretty far with transformers.
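For context, a minimal sketch of what torch.compile does, assuming PyTorch >= 2.0. The toy module here is a stand-in for a full transformers model (loading one would require a download); `backend="eager"` is used only to keep the sketch portable, drop it to enable real kernel fusion:

```python
import torch

# Toy module standing in for a transformers model (assumption for illustration).
model = torch.nn.Sequential(
    torch.nn.Linear(8, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 8),
).eval()

# torch.compile traces and optimizes the forward pass; with the default
# inductor backend this can significantly speed up inference on supported
# hardware. backend="eager" skips codegen so the sketch runs anywhere.
compiled = torch.compile(model, backend="eager")

x = torch.randn(2, 8)
with torch.no_grad():
    eager_out = model(x)
    compiled_out = compiled(x)

# Compilation should not change the numerical result.
print(torch.allclose(eager_out, compiled_out, atol=1e-5))
```

Whether this closes the gap with llama.cpp depends heavily on hardware and quantization, so both claims in this thread can be true in different setups.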

It would be more convenient to use in Ollama. Requesting Ollama support.

@ArthurZ you are right, but what about when we run it on CPU? Maybe llama.cpp would work very well there, what do you think?
