Support for multiple images

by wamozart - opened

I'm trying to pass multiple images in the prompt and ask the model to find the differences between these two images.
image1 = Image.open(requests.get("...", stream=True).raw)
image2 = Image.open(requests.get("...", stream=True).raw)
images = (image1, image2)
prompt = """
[INST] \nYou are giving two images of , determine if these are the same image or not and the reason. [/INST]
"""
It seems to ignore the second image. Any suggestion?

Llava Hugging Face org


Yes, LLaVa-NeXT can accept multiple images as input as shown here. But since the model was not pre-trained with several images interleaved in one prompt, it might not perform well.

I recommend fine-tuning it for your use case if you want decent quality when generating from several images.
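For reference, here is a minimal sketch of what a two-image prompt might look like. It assumes the Mistral-style `[INST] ... [/INST]` format used in the question, and that the processor expects one `<image>` placeholder per image passed in. The helper `build_multi_image_prompt` is hypothetical and not part of transformers.

```python
def build_multi_image_prompt(question: str, num_images: int) -> str:
    """Build a Mistral-style prompt with one <image> placeholder per image.

    Assumption: the processor replaces each <image> placeholder with the
    features of the corresponding image, in order.
    """
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    placeholders = "<image>\n" * num_images
    return f"[INST] {placeholders}{question} [/INST]"


prompt = build_multi_image_prompt(
    "Are these two images the same? Explain why or why not.",
    num_images=2,
)
print(prompt)
```

The resulting string would then be passed to the processor together with the images in the same order, for example `processor(images=[image1, image2], text=prompt, return_tensors="pt")`. If the prompt contains fewer placeholders than images, that mismatch is one possible reason for the second image appearing to be ignored.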

How should I use this model to generate captions for 3 million images? What resources should I use (and where should I run it)? What will the compute cost be? What kind of parallelization should I use?

Llava Hugging Face org

@LBS-LENKA you can use TGI to serve it, which comes with many optimizations under the hood:
I'm also building this project to optimize vision/multimodal models; it includes recipes you can pick from depending on your hardware:
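Independent of the serving stack, one common way to parallelize an offline captioning job is plain data parallelism: split the file list across workers (one per GPU or per server replica) and batch within each shard. The sketch below shows only that bookkeeping plus a back-of-the-envelope GPU-hours estimate. All helper names are hypothetical, and the throughput figure is a placeholder to be replaced with a number measured on your own hardware.

```python
from itertools import islice
from typing import Iterator, List, Sequence


def shard(items: Sequence[str], num_workers: int, worker_id: int) -> List[str]:
    """Return the strided slice of items owned by one worker."""
    return list(items[worker_id::num_workers])


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most batch_size items."""
    it = iter(items)
    while batch := list(islice(it, batch_size)):
        yield batch


def estimate_gpu_hours(num_images: int, images_per_second_per_gpu: float) -> float:
    """Total GPU-hours for the job, given a measured per-GPU throughput."""
    return num_images / images_per_second_per_gpu / 3600


# Toy file list standing in for the real 3 million paths.
paths = [f"img_{i:07d}.jpg" for i in range(10)]
my_paths = shard(paths, num_workers=4, worker_id=1)
my_batches = list(batched(my_paths, batch_size=2))

# The throughput of 2.0 images/s is a made-up placeholder, not a benchmark.
hours = estimate_gpu_hours(3_000_000, images_per_second_per_gpu=2.0)
print(my_batches, round(hours, 1))
```

Total GPU-hours stay the same regardless of the number of workers; adding workers only shortens the wall-clock time, so the cost estimate reduces to measured throughput times the hourly price of the chosen GPU.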

Hi, I was trying out the example given here:

But I am getting an error while trying to apply the chat template. Below are the code and the error:


import torch
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration, AutoProcessor, AutoTokenizer
from PIL import Image
import requests

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
processor = LlavaNextProcessor.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf")
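model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-mistral-7b-hf"  # assumed: same checkpoint as the processor (original arguments are missing)
)

# Download the three example images (the URLs are missing from the post)
url = ""
image_stop = Image.open(requests.get(url, stream=True).raw)

url = ""
image_cats = Image.open(requests.get(url, stream=True).raw)

url = ""
image_snowman = Image.open(requests.get(url, stream=True).raw)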

# Prepare a batch of two prompts, where the first one is a multi-turn conversation and the second is not
conversation_1 = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    },
    {
        "role": "assistant",
        "content": [
            {"type": "text", "text": "There is a red stop sign in the image."},
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What about this image? How many cats do you see?"},
        ],
    },
]

conversation_2 = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    },
]

prompt_1 = processor.apply_chat_template(conversation_1, add_generation_prompt=True)
prompt_2 = processor.apply_chat_template(conversation_2, add_generation_prompt=True)
prompts = [prompt_1, prompt_2]


ValueError                                Traceback (most recent call last)
Cell In[17], line 46
     13 conversation_1 = [
     14     {
     15         "role": "user",
     33     },
     34 ]
     36 conversation_2 = [
     37     {
     38         "role": "user",
     43     },
     44 ]
---> 46 prompt_1 = processor.apply_chat_template(conversation_1, add_generation_prompt=True)
     47 prompt_2 = processor.apply_chat_template(conversation_2, add_generation_prompt=True)
     48 prompts = [prompt_1, prompt_2]

File /opt/conda/lib/python3.10/site-packages/transformers/, in ProcessorMixin.apply_chat_template(self, conversation, chat_template, tokenize, **kwargs)
    924         chat_template = self.default_chat_template
    925     else:
--> 926         raise ValueError(
    927             "No chat template is set for this processor. Please either set the `chat_template` attribute, "
    928             "or provide a chat template as an argument. See "
    929             " for more information."
    930         )
    931 return self.tokenizer.apply_chat_template(
    932     conversation, chat_template=chat_template, tokenize=tokenize, **kwargs
    933 )

ValueError: No chat template is set for this processor. Please either set the `chat_template` attribute, or provide a chat template as an argument. See for more information.
Llava Hugging Face org

@biswadeep49 which version of transformers do you have? You need at least v4.43 for chat templates; that is when we added support for them.
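A quick way to check this from Python is sketched below, assuming the v4.43 minimum mentioned above. The helpers `parse_release` and `supports_chat_templates` are hypothetical names; the comparison only looks at the numeric release part, so a development build such as `4.43.0.dev0` is treated like `4.43.0`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple:
    """Extract the leading numeric release, e.g. '4.45.0.dev0' -> (4, 45, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", v)
    if match is None:
        raise ValueError(f"unrecognized version string: {v!r}")
    return tuple(int(part) for part in match.group().split("."))


def supports_chat_templates(installed: str, minimum: str = "4.43") -> bool:
    """True if the installed release is at least the assumed minimum."""
    return parse_release(installed) >= parse_release(minimum)


try:
    installed = version("transformers")
    print(installed, supports_chat_templates(installed))
except PackageNotFoundError:
    print("transformers is not installed in this environment")
```

This reads the version of the package installed on disk. In a notebook, the already-imported module can be older than that, so comparing against `transformers.__version__` in the running kernel is a useful second check.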

Hi, I also ran into the same error: ValueError: No chat template is set for this processor. Please either set the chat_template attribute, or provide a chat template as an argument. See for more information.
I use transformers 4.45.0. Any suggestion?
Thank you.

Llava Hugging Face org

@zcchen I just verified that the templates work in the latest version from main and in the latest patch release. If you're in a Jupyter notebook, you might need to restart the kernel. It sometimes happens that the package isn't updated until the kernel restarts.

Also, I recommend using v4.44.2 for now, as the version on the main branch is being refactored and might give some errors. I am working on it, but the PR is not merged yet.
