---
pipeline_tag: visual-question-answering
---

## MiniCPM-V 2.6 int4
This is the int4 quantized version of [MiniCPM-V 2.6](https://huggingface.co/openbmb/MiniCPM-V-2_6).
Running the int4 version uses less GPU memory (about 7 GB; the example script below includes an optional peak-memory check).

## Usage
Inference uses Hugging Face Transformers on NVIDIA GPUs. The requirements below were tested with Python 3.10:
```
Pillow==10.1.0
torch==2.1.2
torchvision==0.16.2
transformers==4.40.0
sentencepiece==0.1.99
accelerate==0.30.1
bitsandbytes==0.43.1
```
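
If it helps, the same pins can be installed in one step (this one-liner is just a convenience; any environment manager works):

```
pip install Pillow==10.1.0 torch==2.1.2 torchvision==0.16.2 transformers==4.40.0 sentencepiece==0.1.99 accelerate==0.30.1 bitsandbytes==0.43.1
```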

```python
# test.py
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

# Load the int4 model and tokenizer; trust_remote_code is required
# because MiniCPM-V ships custom modeling code.
model = AutoModel.from_pretrained('openbmb/MiniCPM-V-2_6-int4', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-2_6-int4', trust_remote_code=True)
model.eval()

image = Image.open('xx.jpg').convert('RGB')
question = 'What is in the image?'
# A message is a role plus a content list that can mix PIL images and text.
msgs = [{'role': 'user', 'content': [image, question]}]

res = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer
)
print(res)
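
# Optional sanity check (not part of the original script): report peak GPU
# memory to verify the ~7 GB figure quoted above. Assumes a CUDA device.
if torch.cuda.is_available():
    print(f"\npeak GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")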

# For streaming output, set sampling=True and stream=True;
# model.chat then returns a generator of text chunks.
res = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer,
    sampling=True,
    temperature=0.7,
    stream=True
)

generated_text = ""
for new_text in res:
    generated_text += new_text
    print(new_text, flush=True, end='')
```
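
Because `msgs` is a plain list of role/content turns, the conversation can be continued by appending the model's reply and a new user turn. A minimal multi-turn sketch continuing `test.py`, assuming the same chat format (the follow-up question is only illustrative):

```python
# Append the streamed answer collected above as an assistant turn,
# then ask a follow-up question in a new user turn.
msgs.append({'role': 'assistant', 'content': [generated_text]})
msgs.append({'role': 'user', 'content': ['Describe the image in more detail.']})

follow_up = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer
)
print(follow_up)
```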