Update README.md
README.md CHANGED
@@ -11,7 +11,7 @@ license: apache-2.0

📖 [Technical report](https://arxiv.org/abs/2402.11530) | 🏠 [Code](https://github.com/BAAI-DCAI/Bunny) | 🐰 [Demo](https://wisemodel.cn/space/baai/Bunny)

-Bunny is a family of lightweight but powerful multimodal models. It offers multiple plug-and-play vision encoders, such as EVA-CLIP and SigLIP, and language backbones, including Phi-1.5, StableLM-2 and Phi-2. To compensate for the decrease in model size, we construct more informative training data through curated selection from a broader data source. Remarkably, our Bunny-3B model, built upon SigLIP and Phi-2, outperforms state-of-the-art MLLMs, not only in comparison with models of similar size but also against larger MLLM frameworks (7B), and even achieves performance on par with 13B models.
+Bunny is a family of lightweight but powerful multimodal models. It offers multiple plug-and-play vision encoders, such as EVA-CLIP and SigLIP, and language backbones, including Phi-1.5, StableLM-2, Qwen1.5 and Phi-2. To compensate for the decrease in model size, we construct more informative training data through curated selection from a broader data source. Remarkably, our Bunny-3B model, built upon SigLIP and Phi-2, outperforms state-of-the-art MLLMs, not only in comparison with models of similar size but also against larger MLLM frameworks (7B), and even achieves performance on par with 13B models.

The model is pretrained on LAION-2M and finetuned on Bunny-695K.
More details about this model can be found in [GitHub](https://github.com/BAAI-DCAI/Bunny).

@@ -21,7 +21,13 @@ More details about this model can be found in [GitHub](https://github.com/BAAI-D
# Quickstart

The merged weights can be found in [Bunny-v1_0-3B](https://huggingface.co/BAAI/Bunny-v1_0-3B).
-To use the model with transformers, use the merged weights instead of the LoRA weights
+To use the model with transformers, use the merged weights instead of the LoRA weights.
+
+Before running the snippet, you need to install the following dependencies:
+
+```shell
+pip install torch transformers accelerate pillow
+```

```python
import torch

@@ -68,12 +74,6 @@ output_ids = model.generate(
print(tokenizer.decode(output_ids[input_ids.shape[1]:], skip_special_tokens=True).strip())
```

-Before running the snippet, you need to install the following dependencies:
-
-```shell
-pip install torch transformers accelerate pillow
-```
-
# License
This project utilizes certain datasets and checkpoints that are subject to their respective original licenses. Users must comply with all terms and conditions of these original licenses.
The content of this project itself is licensed under the Apache License 2.0.
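The diff viewer collapses the unchanged body of the Python snippet between `import torch` and the `model.generate(` fragment. For orientation only, here is a minimal sketch of the load-and-generate flow those fragments imply. It is not the diffed snippet itself: the prompt template, the `-200` image-token id, and the `model.process_images` helper are assumptions drawn from Bunny's remote code and should be verified against the GitHub repo.

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged Bunny-v1_0-3B weights; the repo ships custom modeling code,
# so trust_remote_code=True is required.
model = AutoModelForCausalLM.from_pretrained(
    'BAAI/Bunny-v1_0-3B',
    torch_dtype=torch.float16,
    device_map='auto',
    trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('BAAI/Bunny-v1_0-3B', trust_remote_code=True)

# Tokenize the prompt around an <image> placeholder and splice in the image
# token id (-200 here -- an assumption taken from Bunny's remote code).
prompt = 'Why is the image funny?'
text = ("A chat between a curious user and an artificial intelligence assistant. "
        "The assistant gives helpful, detailed, and polite answers to the user's "
        f"questions. USER: <image>\n{prompt} ASSISTANT:")
text_chunks = [tokenizer(chunk).input_ids for chunk in text.split('<image>')]
input_ids = torch.tensor(text_chunks[0] + [-200] + text_chunks[1],
                         dtype=torch.long).unsqueeze(0).to(model.device)

# Preprocess the image with the helper the remote code exposes
# (model.process_images -- also an assumption to verify).
image = Image.open('example.png')
image_tensor = model.process_images([image], model.config).to(dtype=model.dtype,
                                                              device=model.device)

# Generate, then decode only the tokens produced after the prompt,
# exactly as the visible print(...) fragment of the snippet does.
output_ids = model.generate(
    input_ids,
    images=image_tensor,
    max_new_tokens=100,
    use_cache=True)[0]
print(tokenizer.decode(output_ids[input_ids.shape[1]:], skip_special_tokens=True).strip())
```

Only `import torch`, the `model.generate(` call, and the final decode line are actually visible in this diff; treat everything else above as a sketch and take the authoritative snippet from the model card or the GitHub repo.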