Is it possible to perform a search using an image as the input query?
Hello, I need help with conducting a search using an image as the query input. I’ve checked various sources but couldn’t find clear information. Is this possible?
Yes, simply pass the image embeddings as the query set. This example uses the older version of the library, from before the API refactor. Create the embeddings as usual:
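import torch
from torch.utils.data import DataLoader

# process_images is the image-processing helper from the pre-refactor colpali_engine utilities.
def page_embeddings(images, colpali_model, colpali_processor):
    dataloader = DataLoader(
        images,
        batch_size=4,
        shuffle=False,
        collate_fn=lambda x: process_images(colpali_processor, x),
    )
    ds = []
    for batch_doc in dataloader:
        with torch.no_grad():
            # Move the processed batch to the model's device before the forward pass
            batch_doc = {k: v.to(colpali_model.device) for k, v in batch_doc.items()}
            embeddings_doc = colpali_model(**batch_doc)
        # Keep one multi-vector embedding per page, on the CPU
        ds.extend(list(torch.unbind(embeddings_doc.to("cpu"))))
    return ds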
These embeddings can now be used as the queries against the rest of the document data set. Again, the API has since changed, so you will need to adapt this call to the refactored API:
scores = retriever_evaluator.evaluate(query_set, data_set)
We use this for document similarity search, and it works very well:
https://huggingface.co/blog/fsommers/document-similarity-colpali
Thanks Frank, perfect answer!
But yeah, definitely possible. The only thing is that the score matrix might get large, so you might want to adjust the batch size argument in processor.get_scores().
FYI, the updated API is just as it is in the quickstart:
processor.process_images(x)
processor.get_scores(qs, ds)
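To illustrate why the batch size matters, here is a rough sketch of image-to-image scoring in plain PyTorch. It assumes the score is the usual late-interaction (MaxSim) sum used by ColPali-style models, and it is not the library's own implementation: the function name maxsim_scores, its batch_size default, and the random stand-in embeddings are all hypothetical.

```python
import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence


def maxsim_scores(queries, docs, batch_size=2):
    """Hypothetical helper: MaxSim score for every (query, document) pair.

    queries, docs: lists of 2-D tensors shaped (n_tokens, dim), such as the
    per-page list returned by an embedding loop.
    Returns a tensor shaped (len(queries), len(docs)).
    """
    scores = torch.empty(len(queries), len(docs))
    for qi, q in enumerate(queries):
        # Only `batch_size` documents are padded and compared at once,
        # which bounds the size of the intermediate similarity tensor.
        for start in range(0, len(docs), batch_size):
            chunk = docs[start:start + batch_size]
            padded = pad_sequence(chunk, batch_first=True)  # (b, n_max, dim)
            lengths = torch.tensor([d.shape[0] for d in chunk])
            # True where a position is padding rather than a real token
            pad_mask = torch.arange(padded.shape[1])[None, :] >= lengths[:, None]
            sim = torch.einsum("qd,bnd->bqn", q, padded)  # (b, n_q, n_max)
            sim = sim.masked_fill(pad_mask[:, None, :], float("-inf"))
            # Best document token per query token, summed over query tokens
            scores[qi, start:start + len(chunk)] = sim.max(dim=2).values.sum(dim=1)
    return scores


# Stand-in data: three "pages" with different token counts, L2-normalized
torch.manual_seed(0)
pages = [F.normalize(torch.randn(n, 8), dim=-1) for n in (5, 7, 6)]

# Use the first page as the image query against all pages
scores = maxsim_scores(pages[:1], pages)
best_match = scores.argmax(dim=1)  # index of the closest page per query
```

With normalized embeddings, a page compared with itself reaches the highest possible score, so best_match points back at the query page here. A smaller batch_size lowers peak memory at the cost of more loop iterations.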
Cheers,
Manu