Running this model with llama.cpp gives gibberish output
#12 opened by xiaojinchuan
I converted this model to GGML and quantized it to 4-bit using https://github.com/ggerganov/llama.cpp/blob/master/convert.py, then ran the quantized model with llama-cpp-python, but the output is gibberish, as shown below.
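For reference, a minimal sketch of how a quantized model is typically loaded and run with llama-cpp-python; the model path and prompt here are placeholders, not the exact values from the original run:

```python
# Minimal llama-cpp-python repro sketch. The model path and prompt
# are placeholders, not the exact values used in the original report.
from llama_cpp import Llama

# Load the 4-bit quantized GGML model produced by the conversion step.
llm = Llama(model_path="./models/ggml-model-q4_0.bin")

# Generate a short completion; gibberish on a simple prompt like this
# usually points at a conversion/quantization problem rather than sampling.
output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```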