During inference, only one token is output at a time. I use the official weights and `llava/serve/cli.py` from the llava-med source repo to run inference. The weights load successfully, and no other parameters were changed.
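For reference, the invocation looks roughly like the sketch below. The argument names (`--model-path`, `--image-file`) are assumed from the upstream LLaVA CLI; the llava-med fork may use different flag names, and the paths are placeholders.

```bash
# Assumed invocation, following the upstream LLaVA CLI conventions;
# adjust flag names and paths to match the llava-med repo.
python -m llava.serve.cli \
    --model-path /path/to/llava-med-weights \
    --image-file /path/to/example.jpg
```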