Number of tokens exceeded maximum context length
#22 · opened by praxis-dev
Hi, I'm getting this error from time to time with Llama-2-13B-Chat-GGML. Can I change the maximum context length?
from langchain.llms import LlamaCpp

def load_llm():
    llm = LlamaCpp(
        model_path=model_path,  # path to the local GGML model file
        n_gpu_layers=5,
        n_batch=128,
        verbose=True,
        f16_kv=True,
        n_ctx=2048,  # maximum context length in tokens
    )
    return llm
I think using LlamaCpp (like so) instead of CTransformers might help.
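If you do stay on CTransformers, here's a minimal sketch of raising the context window there as well, assuming the LangChain CTransformers wrapper and the ctransformers "context_length" / "max_new_tokens" config keys; the model file name below is just a placeholder, not the exact file from this issue.

from langchain.llms import CTransformers

def load_llm_ctransformers():
    # Sketch: the context window is passed through the ctransformers config dict.
    llm = CTransformers(
        model="llama-2-13b-chat.ggmlv3.q4_0.bin",  # placeholder GGML file name
        model_type="llama",
        config={
            "context_length": 2048,  # maximum context length in tokens
            "max_new_tokens": 256,   # cap on generated tokens per call
        },
    )
    return llm

Either way, the "number of tokens exceeded maximum context length" error usually means prompt plus generated tokens went past the configured window, so raising n_ctx / context_length (or trimming the prompt) is the usual fix.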