llama-cpp-python 'AssertionError' when loading model
#1 · opened by LeMoussel
```python
# https://llama-cpp-python.readthedocs.io/en/latest/
from llama_cpp import Llama

# https://huggingface.co/TheBloke/Vigogne-2-7B-Chat-GGML/tree/main
MODEL_Q8_0 = Llama(model_path="./models/vigogne-2-7b-chat.ggmlv3.q8_0.bin", n_ctx=512)
```
I get this error:
```
AssertionError                            Traceback (most recent call last)
<ipython-input-4-ad0fd4308e15> in <cell line: 5>()
      4 # https://huggingface.co/TheBloke/Vigogne-2-7B-Chat-GGML/tree/main
----> 5 MODEL_Q8_0 = Llama(model_path="./models/vigogne-2-7b-chat.ggmlv3.q8_0.bin", n_ctx=512)

/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py in __init__(self, model_path, n_ctx, n_parts, n_gpu_layers, seed, f16_kv, logits_all, vocab_only, use_mmap, use_mlock, embedding, n_threads, n_batch, last_n_tokens_size, lora_base, lora_path, low_vram, tensor_split, rope_freq_base, rope_freq_scale, n_gqa, rms_norm_eps, mul_mat_q, verbose)
    321                 self.model_path.encode("utf-8"), self.params
    322             )
--> 323         assert self.model is not None
    324
    325         if verbose:

AssertionError:
```
It seems that `llama_cpp.llama_load_model_from_file(...)` failed, so `self.model` is `None`.
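One thing worth ruling out before digging deeper (this is an assumption on my part, since the assertion itself carries no message): recent llama-cpp-python releases load GGUF model files, while `vigogne-2-7b-chat.ggmlv3.q8_0.bin` is in the older GGML (`ggjt`) container, and a format mismatch makes the C loader return a null pointer exactly like this. A quick sniff of the file's first four bytes tells you which container you actually have — GGUF files start with the ASCII bytes `GGUF`, and ggjt-v3 GGML files store the little-endian magic `0x67676a74`, which appears on disk as `tjgg`:

```python
# Minimal sketch for checking a llama.cpp model file's container format.
# Assumption: magic values taken from llama.cpp's format definitions —
# GGUF files begin with b"GGUF"; ggjt (GGML v3) files begin with the
# little-endian uint32 0x67676a74, i.e. the on-disk bytes b"tjgg".
GGUF_MAGIC = b"GGUF"
GGJT_MAGIC = b"tjgg"


def sniff_model_format(model_path: str) -> str:
    """Return a best-guess container name by reading the file's magic bytes."""
    with open(model_path, "rb") as f:
        magic = f.read(4)
    if magic == GGUF_MAGIC:
        return "gguf"
    if magic == GGJT_MAGIC:
        return "ggml (ggjt)"
    return "unknown"
```

If this reports `ggml (ggjt)` and your installed llama-cpp-python is a GGUF-era build, the fix is either to pin an older llama-cpp-python that still reads GGML, or to grab/convert a GGUF copy of the model.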
Is there any known solution?
Thanks!