q-future/co-instruct


teowu committed on Jan 26

Commit 174048a • 1 Parent(s): 281b093

Update README.md

Files changed (1)

README.md +20 -4


@@ -1,9 +1,25 @@
-Use `transformers==4.36.1`. Preview version only.
-This model has reached 75.99\% accuracy on Q-Bench A1 *test* (multi-choice questions), notably superior than GPT-4V (73.44\%).
-### Load Model
 ```python
 import torch
@@ -16,7 +32,7 @@ model = AutoModelForCausalLM.from_pretrained("q-future/co-instruct-preview",
                                              device_map={"":"cuda:0"})
 ```
-### Chat
 ```python
 import requests

+## Performance
+This model has reached 75.12\% (*12\% better than the previous version*) / 74.98\% (*8.5\% better than the previous version*) accuracy on Q-Bench A1 *dev/test* (multi-choice questions).
+It also outperforms the following closed-source models with much larger model capacities:
+| Model | *dev* | *test* |
+| ---- | ---- | ---- |
+| Co-Instruct-Preview (mPLUG-Owl2) | **75.12\%** | **74.98\%** |
+| \*GPT-4V-Turbo | 74.41\% | 74.10\% |
+| \*Qwen-VL-**Max** | 73.63\%  | 73.90\% |
+| \*GPT-4V (Nov. 2023) | 71.78\% | 73.44\% |
+| \*Gemini-Pro | 68.16\% | 69.46\% |
+| Q-Instruct (mPLUG-Owl2, Nov. 2023) | 67.42\% | 70.43\% |
+| \*Qwen-VL-Plus | 66.01\% | 68.93\% |
+| mPLUG-Owl2 | 62.14\% | 62.68\% |
+\*: Proprietary Models.
+We are also constructing multi-image benchmark sets (image pairs and groups of three or four images); results on these multi-image benchmarks will be released soon!
+## Load Model
 ```python
 import torch
                                              device_map={"":"cuda:0"})
 ```
+## Chat
 ```python
 import requests