Multi-image
#2, opened by pbarker
Can this support multi-image?
The model was not trained on any multi-image data, and the preprocessor in this codebase does not currently support interleaved image/text messages.
The model's design does, in principle, allow multiple images as input by concatenating them into one very long input sequence, so it is still possible to try multi-image input, although that would require tweaking the preprocessor. However, we have not experimented with this ourselves.
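For anyone who wants to try this, here is a rough sketch of what "concatenating them into a very long input sequence" could look like at the token level. It is not this codebase's preprocessor: the `Image` class, the `flatten_interleaved` function, the `<im_start>`/`<im_patch>`/`<im_end>` placeholder strings, and the fixed patch count per image are all hypothetical names chosen for illustration. A real change would need to use the model's actual special tokens and patch layout.

```python
from typing import List, Tuple, Union

# Hypothetical placeholder tokens; the real model's special tokens may differ.
IMG_START, IMG_PATCH, IMG_END = "<im_start>", "<im_patch>", "<im_end>"


class Image:
    """Stand-in for an image; only carries a name for this sketch."""

    def __init__(self, name: str) -> None:
        self.name = name


def flatten_interleaved(
    parts: List[Union[str, Image]],
    patches_per_image: int = 4,  # assumed fixed size, for illustration only
) -> Tuple[List[str], List[Tuple[str, int, int]]]:
    """Flatten interleaved text/image parts into one token sequence.

    Returns the tokens plus, for each image, the half-open index range
    [start, end) of its patch placeholders, so that image features could
    later be written into those positions.
    """
    tokens: List[str] = []
    image_spans: List[Tuple[str, int, int]] = []
    for part in parts:
        if isinstance(part, Image):
            tokens.append(IMG_START)
            start = len(tokens)
            tokens.extend([IMG_PATCH] * patches_per_image)
            image_spans.append((part.name, start, len(tokens)))
            tokens.append(IMG_END)
        else:
            # Whitespace split stands in for a real tokenizer.
            tokens.extend(part.split())
    return tokens, image_spans


tokens, spans = flatten_interleaved(
    ["Compare", Image("a"), "with", Image("b")]
)
```

Because every image adds its full set of patch tokens, the sequence length grows linearly with the number of images, which is why the input gets very long. Since the model was not trained on multi-image data, the quality of the output with such an input is untested.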
Would be nice to have such a feature (especially for a multimodal RAG scenario...)
chrisc36 changed discussion status to closed