Number of tokens exceeded maximum context length
#22 · opened by praxis-dev
Hi, I'm getting this error from time to time with Llama-2-13B-Chat-GGML. Can I change the maximum context length?
from langchain.llms import LlamaCpp

def load_llm():
    llm = LlamaCpp(
        model_path=model_path,  # path to the local GGML model file
        n_gpu_layers=5,
        n_batch=128,
        verbose=True,
        f16_kv=True,
        n_ctx=2048,  # maximum context length in tokens
    )
    return llm
I think using LlamaCpp (like so) instead of CTransformers might help.
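If you do stay on CTransformers, here's a minimal sketch of raising the context window there as well, assuming the LangChain CTransformers wrapper and the ctransformers "context_length" / "max_new_tokens" config keys; the model file name below is just a placeholder, not the exact file from this issue.

from langchain.llms import CTransformers

def load_llm_ctransformers():
    # Sketch: the context window is passed through the ctransformers config dict.
    llm = CTransformers(
        model="llama-2-13b-chat.ggmlv3.q4_0.bin",  # placeholder GGML file name
        model_type="llama",
        config={
            "context_length": 2048,  # maximum context length in tokens
            "max_new_tokens": 256,   # cap on generated tokens per call
        },
    )
    return llm

Either way, the "number of tokens exceeded maximum context length" error usually means prompt plus generated tokens went past the configured window, so raising n_ctx / context_length (or trimming the prompt) is the usual fix.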