---
license: apache-2.0
datasets:
- neulab/PangeaInstruct
language:
- am
- ar
- bg
- bn
- cs
- de
- el
- en
- es
- fa
- fr
- ga
- hi
- id
- ig
- it
- iw
- ja
- jv
- ko
- nl
- mn
- ms
- no
- pl
- pt
- ro
- ru
- si
- su
- sw
- ta
- te
- th
- tr
- uk
- ur
- vi
- zh
base_model:
- Qwen/Qwen2-7B-Instruct
---
# Pangea-7B Model Card

[Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages](https://neulab.github.io/Pangea/)

🇪🇹 🇸🇦 🇧🇬 🇧🇩 🇨🇿 🇩🇪 🇬🇷 🇬🇧 🇺🇸 🇪🇸 🇮🇷 🇫🇷 🇮🇪 🇮🇳 🇮🇩 🇳🇬 🇮🇹 🇮🇱 🇯🇵 🇮🇩 🇰🇷 🇳🇱 🇲🇳 🇲🇾 🇳🇴 🇵🇱 🇵🇹 🇧🇷 🇷🇴 🇷🇺 🇱🇰 🇮🇩 🇰🇪 🇹🇿 🇱🇰 🇹🇭 🇹🇷 🇺🇦 🇵🇰 🇻🇳 🇨🇳 🇹🇼

[🏠 Homepage](https://neulab.github.io/Pangea/) | [🤖 Pangea-7B](https://huggingface.co/neulab/Pangea-7B) | [📊 PangeaIns](https://huggingface.co/datasets/neulab/PangeaInstruct) | [🧪 PangeaBench](https://huggingface.co/collections/neulab/pangea-6713c3b0d78a453906eb2ed8) | [💻 Github](https://github.com/neulab/Pangea/tree/main) | [📄 Arxiv](https://arxiv.org/abs/2410.16153) | [📕 PDF](https://arxiv.org/pdf/2410.16153) | [🖥️ Demo](https://huggingface.co/spaces/neulab/Pangea)

<img src="https://cdn-uploads.huggingface.co/production/uploads/6230d750d93e84e233882dbc/ZjVTKnIsyshWpo-PWg9gM.png" alt="Pangea" style="width:300px;">

## Model details

- **Model:** Pangea is a fully open-source Multilingual Multimodal Multicultural LLM.
- **Date:** Pangea-7B was trained in 2024.
- **Training Dataset:** [6M PangeaIns](https://huggingface.co/datasets/neulab/PangeaInstruct).
- **Architecture:** Pangea-7B follows the architecture of [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT), with a [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) backbone (see the config sketch after this list).

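A quick way to check the layout described above (a minimal sketch, not part of the original card) is to inspect the hosted config; the "expected" values in the comments are assumptions based on the bullet points above rather than verified output:

```python
# Minimal sketch: inspect the hosted config to check the architecture.
# Assumes network access to the Hugging Face Hub; the "expected" values below
# follow the model description above, not verified output.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("neulab/Pangea-7B-hf")
print(config.model_type)                # expected: "llava_next"
print(config.text_config.model_type)    # expected: "qwen2" (the Qwen2-7B-Instruct backbone)
print(config.vision_config.model_type)  # the vision tower paired with the LLaVA-NeXT connector
```
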
### Uses
This HF-format version is intended for use with the Hugging Face `generate` function.
If you want to use it with the LLaVA-NeXT codebase, please refer to our [original checkpoint](https://huggingface.co/neulab/Pangea-7B).

```python
# Assuming that you have text_input and image_path
from transformers import LlavaNextForConditionalGeneration, AutoProcessor
import torch
from PIL import Image

image_input = Image.open(image_path)

# Load the model in fp16 and move it to GPU 0
model = LlavaNextForConditionalGeneration.from_pretrained(
    "neulab/Pangea-7B-hf",
    torch_dtype=torch.float16
).to(0)
processor = AutoProcessor.from_pretrained("neulab/Pangea-7B-hf")
model.resize_token_embeddings(len(processor.tokenizer))

# Wrap the user query in the ChatML-style prompt template expected by the model
text_input = f"<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n<image>\n{text_input}<|im_end|>\n<|im_start|>assistant\n"
model_inputs = processor(images=image_input, text=text_input, return_tensors='pt').to("cuda", torch.float16)
output = model.generate(**model_inputs, max_new_tokens=1024, min_new_tokens=32, temperature=1.0, top_p=0.9, do_sample=True)
output = output[0]
result = processor.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=False)

print(result)
```
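Note that `result` above contains the echoed prompt as well as the model's answer. As a small follow-up sketch (reusing the variables from the example above), you can decode only the newly generated tokens:

```python
# Optional follow-up: drop the echoed prompt and keep only the generated reply.
prompt_length = model_inputs["input_ids"].shape[1]
reply = processor.decode(output[prompt_length:], skip_special_tokens=True)
print(reply)
```
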
## Citing the Model

**BibTeX Citation:**

```
@article{yue2024pangeafullyopenmultilingual,
  title={Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages},
  author={Xiang Yue and Yueqi Song and Akari Asai and Seungone Kim and Jean de Dieu Nyandwi and Simran Khanuja and Anjali Kantharuban and Lintang Sutawika and Sathyanarayanan Ramamoorthy and Graham Neubig},
  year={2024},
  journal={arXiv preprint arXiv:2410.16153},
  url={https://arxiv.org/abs/2410.16153}
}
```