Why does the model only output one token each time?

#3
by luhuitong - opened

During inference, only one token is output each time.
I use the official weights and llava/serve/cli.py from the llava-med source repo to run inference. The weights load successfully, and no other parameters were changed.

[Screenshot attached: image.png]
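For anyone hitting the same behavior: a minimal sketch of the kind of `generate()` call that underlies such serving scripts, assuming the standard Hugging Face API (not the LLaVA-Med serving code itself; the model path and prompt are hypothetical placeholders). It shows the parameter that usually caps output length — if `max_new_tokens` (or `max_length`) is effectively set to 1 somewhere, generation stops after a single token:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical placeholder path; substitute the actual weight directory.
model_path = "path/to/llava-med-weights"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

inputs = tokenizer("Describe the findings in this image.", return_tensors="pt")

# max_new_tokens bounds how many tokens generation may produce;
# a value of 1 would reproduce the single-token symptom described above.
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```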

luhuitong changed discussion status to closed
