The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens

#175
by gebaltso - opened

How can I overcome this? Perhaps by changing the text encoder? If so, is there an example of how to do it? Thanks in advance.

*Edit: using both prompt and prompt_3 (the T5 encoder) works:
image = pipe(
    prompt=prompt,            # encoded by the CLIP text encoders (truncated at 77 tokens)
    prompt_3=prompt_3,        # encoded by T5, which accepts up to max_sequence_length tokens
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=4.5,
    max_sequence_length=512,
).images[0]
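For reference, here is a minimal self-contained sketch of the same workaround, assuming the diffusers StableDiffusion3Pipeline; the model id, prompt strings, and output filename are placeholders, not taken from the original post. The idea is to keep a short prompt for the CLIP encoders and pass the full, long prompt as prompt_3 so it is handled by T5 instead.

import torch
from diffusers import StableDiffusion3Pipeline

# Example checkpoint; substitute the one you are actually using.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# Short prompt that fits within CLIP's 77-token limit.
short_prompt = "a cozy cabin in a snowy forest at dusk"
# Full detailed description; T5 can take up to max_sequence_length tokens.
long_prompt = short_prompt + ", warm light in the windows, smoke rising from the chimney"

image = pipe(
    prompt=short_prompt,      # CLIP encoders
    prompt_3=long_prompt,     # T5 encoder
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=4.5,
    max_sequence_length=512,  # raise the T5 limit from the default 256
).images[0]
image.save("output.png")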
