What are the limits on prompt length and on the number of output tokens? I am finding the model takes an inordinately long time to respond. Are there other memory-related limitations we should know about?
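For context, this is roughly how I've been trying to read those limits myself, assuming the model loads through the `transformers` library (the model ID below is just a placeholder, not this repo's actual ID), though I'm not sure these config fields are authoritative for every model:

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "gpt2"  # placeholder; substitute the actual model ID

config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Maximum context window (prompt + generated tokens), when exposed in the config
print("max_position_embeddings:", getattr(config, "max_position_embeddings", "not specified"))

# Tokenizer-side limit, which often mirrors the context window
print("tokenizer.model_max_length:", tokenizer.model_max_length)
```

If the values reported here don't reflect the real constraints, a pointer to the right place to look would be appreciated.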