Why does the model only output one token each time?

#3
by luhuitong - opened

During inference, only one token is output each time.
I use the official weights and llava/serve/cli.py from the llava-med source repo to run inference. The weights load successfully, and no other parameters were changed.

[Screenshot attached: image.png]
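For anyone hitting the same behavior: a minimal sketch of the kind of `generate()` call that underlies such serving scripts, assuming the standard Hugging Face API (not the LLaVA-Med serving code itself; the model path and prompt are hypothetical placeholders). It shows the parameter that usually caps output length — if `max_new_tokens` (or `max_length`) is effectively set to 1 somewhere, generation stops after a single token:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical placeholder path; substitute the actual weight directory.
model_path = "path/to/llava-med-weights"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

inputs = tokenizer("Describe the findings in this image.", return_tensors="pt")

# max_new_tokens bounds how many tokens generation may produce;
# a value of 1 would reproduce the single-token symptom described above.
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```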

luhuitong changed discussion status to closed
