Fork of salesforce/BLIP for an image-captioning
task on 🤗 Inference Endpoints.
This repository implements a custom
task for image-captioning
for 🤗 Inference Endpoints. The code for the customized pipeline is in pipeline.py.
To deploy this model as an Inference Endpoint, you have to select Custom
as the task so that the pipeline.py
file is used. Double-check that it is selected.
Expected request payload
{
"image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC....", // base64 image as bytes
}
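As a minimal sketch of how such a payload could be built on the client side, the snippet below base64-encodes raw image bytes and wraps them in a JSON-serializable dictionary. The helper names `encode_image` and `build_payload` are hypothetical and not part of this repository. The default `"image"` key follows the example payload above, while the request example in this README uses `"inputs"`, so the key is left configurable.

```python
import base64
import json


def encode_image(image_bytes: bytes) -> str:
    """Return the image bytes as a base64-encoded ASCII string."""
    return base64.b64encode(image_bytes).decode("utf-8")


def build_payload(image_bytes: bytes, key: str = "image") -> dict:
    """Wrap the encoded image in a JSON-serializable dict (hypothetical helper)."""
    return {key: encode_image(image_bytes)}


# JPEG files start with the bytes FF D8 FF, which base64-encode to the
# "/9j/" prefix seen in the example payload.
payload = build_payload(b"\xff\xd8\xff\xe0")
print(json.dumps(payload))
```

Decoding with `base64.b64decode` on the receiving side recovers the original bytes. Which key the deployed handler actually reads should be checked against pipeline.py.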
Below is an example of how to run a request using Python and requests.
Run Request
1. Prepare an image.
!wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
2. Run the request.
import base64
import requests as r

ENDPOINT_URL = ""
HF_TOKEN = ""

def predict(path_to_image: str):
    # Read the image and base64-encode it so it can be sent as JSON
    # (raw bytes are not JSON-serializable).
    with open(path_to_image, "rb") as i:
        image = base64.b64encode(i.read()).decode("utf-8")
    payload = {
        "inputs": [image],
        "parameters": {
            "do_sample": True,
            "top_p": 0.9,
            "min_length": 5,
            "max_length": 20
        }
    }
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
    )
    return response.json()

prediction = predict(
    path_to_image="palace.jpg"
)
Example parameters depending on the decoding strategy:
- Beam search
"parameters": {
    "num_beams": 5,
    "max_length": 20
}
- Nucleus sampling
"parameters": {
    "num_beams": 1,
    "max_length": 20,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.95
}
- Contrastive search
"parameters": {
    "penalty_alpha": 0.6,
    "top_k": 4,
    "max_length": 512
}
See the generate() documentation for additional details.
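To switch between the decoding strategies listed above without retyping the values, the presets could be kept in a small lookup table. This is only a sketch: the strategy labels and the `build_parameters` helper are hypothetical, and the values are copied from the examples in this README rather than from any tuned configuration.

```python
from typing import Any, Dict

# Presets copied from the examples above; the keys are hypothetical labels.
DECODING_PRESETS: Dict[str, Dict[str, Any]] = {
    "beam_search": {"num_beams": 5, "max_length": 20},
    "nucleus_sampling": {
        "num_beams": 1,
        "max_length": 20,
        "do_sample": True,
        "top_k": 50,
        "top_p": 0.95,
    },
    "contrastive_search": {"penalty_alpha": 0.6, "top_k": 4, "max_length": 512},
}


def build_parameters(strategy: str, **overrides: Any) -> Dict[str, Any]:
    """Return a copy of the preset for `strategy`, with optional overrides applied."""
    if strategy not in DECODING_PRESETS:
        raise ValueError(f"Unknown decoding strategy: {strategy!r}")
    params = dict(DECODING_PRESETS[strategy])  # copy so the preset stays unchanged
    params.update(overrides)
    return params


parameters = build_parameters("beam_search", max_length=30)
print(parameters)
```

The returned dictionary can be placed under the `"parameters"` key of the request payload.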
Expected output
['buckingham palace with flower beds and red flowers']
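Since the endpoint returns a list of caption strings, the decoded response can be validated before it is used. The sketch below assumes that failures come back as a dictionary with an `"error"` key, which is an assumption about the endpoint's behavior and not something this README states. `extract_captions` is a hypothetical helper.

```python
from typing import Any, List


def extract_captions(result: Any) -> List[str]:
    """Return the captions from a decoded JSON response (hypothetical helper)."""
    # Assumption: errors are reported as a dict containing an "error" key.
    if isinstance(result, dict) and "error" in result:
        raise RuntimeError(f"Endpoint returned an error: {result['error']}")
    if not isinstance(result, list) or not all(isinstance(c, str) for c in result):
        raise ValueError(f"Unexpected response format: {result!r}")
    return result


captions = extract_captions(["buckingham palace with flower beds and red flowers"])
print(captions[0])
```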