---
license: apache-2.0
---

# OFA-tiny

## Introduction
This is the **tiny** version of the OFA pretrained model, fine-tuned on VQAv2. The directory includes 4 files: `config.json`, which holds the model configuration; `vocab.json` and `merges.txt` for the OFA tokenizer; and `pytorch_model.bin`, which holds the model weights.

## How to use
Download the code and the checkpoint as shown below.

```bash
git clone https://github.com/sohananisetty/OFA_VQA.git
git clone https://huggingface.co/SohanAnisetty/ofa-vqa-tiny
```

Then assign the path of the cloned `ofa-vqa-tiny` directory to `ckpt_dir`, and prepare an image for the example below.

```python
>>> from PIL import Image
>>> from torchvision import transforms
>>> from transformers import OFATokenizer, OFAModelForVQA

>>> # Resizing and normalization expected by OFA's image encoder
>>> mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
>>> resolution = 256
>>> patch_resize_transform = transforms.Compose([
        lambda image: image.convert("RGB"),
        transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

>>> tokenizer = OFATokenizer.from_pretrained(ckpt_dir)

>>> txt = " what does the image describe?"
>>> inputs = tokenizer([txt], return_tensors="pt").input_ids
>>> img = Image.open(path_to_image)
>>> patch_img = patch_resize_transform(img).unsqueeze(0)

>>> model = OFAModelForVQA.from_pretrained(ckpt_dir, use_cache=False)
>>> gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)
>>> print(tokenizer.batch_decode(gen, skip_special_tokens=True))
```
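
Since this checkpoint is fine-tuned on VQAv2, the prompt can also be phrased as a question about the image rather than the generic captioning prompt above. A minimal sketch, reusing the tokenizer, model, and `patch_img` from the example; the question text is only an illustrative placeholder.

```python
>>> # Hypothetical question; replace with one relevant to your image
>>> question = " what is the color of the car?"
>>> inputs = tokenizer([question], return_tensors="pt").input_ids
>>> gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)
>>> print(tokenizer.batch_decode(gen, skip_special_tokens=True))
```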