	---
	license: apache-2.0
	---

	# OFA-tiny

	## Introduction
	This is the tiny version of the OFA pretrained model, fine-tuned on VQAv2 for visual question answering.

	The directory includes four files: `config.json`, which contains the model configuration; `vocab.json` and `merge.txt`, which are used by the OFA tokenizer; and `pytorch_model.bin`, which contains the model weights.
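
As a quick sanity check after downloading, the file list above can be verified with a few lines of Python. This is a minimal sketch: the helper `missing_checkpoint_files` is a hypothetical name introduced here for illustration, and the file names are copied verbatim from the list above, so adjust them if the repository layout differs.

```python
from pathlib import Path

# File names as listed in this README (assumption: they match the repository exactly).
EXPECTED_FILES = ("config.json", "vocab.json", "merge.txt", "pytorch_model.bin")


def missing_checkpoint_files(ckpt_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_checkpoint_files("./ofa-vqa-tiny")` returns an empty list when all four files are present, and otherwise the names that still need to be downloaded.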


	## How to use
	Clone the code repository and the model checkpoint as shown below.
	```bash
	git clone https://github.com/sohananisetty/OFA_VQA.git
	git clone https://huggingface.co/SohanAnisetty/ofa-vqa-tiny
	```

	Then set `ckpt_dir` to the path of the cloned `ofa-vqa-tiny` directory, and prepare an image for the test example below.

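	```python
	>>> from PIL import Image
	>>> from torchvision import transforms
	>>> # OFATokenizer and OFAModelForVQA are not part of stock transformers; they are
	>>> # assumed to come from the OFA-enabled code in the cloned OFA_VQA repository.
	>>> from transformers import OFATokenizer, OFAModelForVQA

	>>> ckpt_dir = "./ofa-vqa-tiny"  # path to the cloned checkpoint directory
	>>> path_to_image = "path/to/image.jpg"  # placeholder: replace with your own image

	>>> # Resize to a square input and normalize each channel from [0, 1] to [-1, 1].
	>>> mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
	>>> resolution = 256
	>>> patch_resize_transform = transforms.Compose([
	...     lambda image: image.convert("RGB"),
	...     transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
	...     transforms.ToTensor(),
	...     transforms.Normalize(mean=mean, std=std),
	... ])

	>>> tokenizer = OFATokenizer.from_pretrained(ckpt_dir)

	>>> # Tokenize the question and preprocess the image into a batch of one.
	>>> txt = " what does the image describe?"
	>>> inputs = tokenizer([txt], return_tensors="pt").input_ids
	>>> img = Image.open(path_to_image)
	>>> patch_img = patch_resize_transform(img).unsqueeze(0)

	>>> # Load the model with the same class that is imported above.
	>>> model = OFAModelForVQA.from_pretrained(ckpt_dir, use_cache=False)
	>>> gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)

	>>> print(tokenizer.batch_decode(gen, skip_special_tokens=True))
	```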
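
To make the preprocessing in the example concrete: `transforms.ToTensor()` scales 8-bit pixel values to `[0, 1]`, and `transforms.Normalize` with a mean and standard deviation of 0.5 then maps them to `[-1, 1]`. The sketch below reproduces that arithmetic for a single channel value in plain Python. The helper `normalize_pixel` is a hypothetical name used only for illustration, not part of the OFA code.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Map an 8-bit channel value to the range the example feeds the model."""
    scaled = value / 255.0        # ToTensor: [0, 255] -> [0.0, 1.0]
    return (scaled - mean) / std  # Normalize: (x - mean) / std


print(normalize_pixel(0))    # -1.0
print(normalize_pixel(255))  # 1.0
```

Black pixels therefore become -1.0 and white pixels 1.0, so an image preprocessed with a different mean or standard deviation would not match the input range used in the example.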