anonymoussubmission2024/vlrm-blip2-opt-2.7b

anonymoussubmission2024 committed on May 20

Commit efde4fc • 1 Parent(s): 858da5f

Update README.md

Files changed (1)

README.md +3 -1

README.md CHANGED

@@ -11,10 +11,12 @@ pipeline_tag: image-to-text
 base_model: Salesforce/blip2-opt-2.7b
 ---
 # VLRM
-This repository contains the weights of BLIP-2 OPT-2.7B model fine-tuned by reinforcement learning method introduced in the paper VLRM: Vision-Language Models act as Reward Models for Image Captioning.
+This repository contains the weights of BLIP-2 OPT-2.7B model fine-tuned by reinforcement learning method introduced in the paper VLRM: Vision-Language Models Act as Reward Models for Image Captioning.
 The RL-tuned model is able to generate longer and more comprehensive descriptions with zero computational overhead compared to the original model.
+# CLIP Recall
+CLIP Recall calculation scripts are provided in `validate` folder together with `README.md` and `captions.txt`.
 # Running the model
 ## Option 1
 <details>