yuanzhoulvpi committed on
Commit
baf6b33
1 Parent(s): 35770cc

Create README.md

Files changed (1)
  1. README.md +92 -0
README.md ADDED
@@ -0,0 +1,92 @@
+ ---
+ license: apache-2.0
+ language:
+ - zh
+ - en
+ pipeline_tag: image-text-to-text
+ ---
+
+ # Training a custom LLaVA model from scratch
+ 1. A LLaVA model built on top of openai/clip-vit-large-patch14-336 and Qwen1.5-4B-Chat.
+ 2. Training data: liuhaotian/LLaVA-CC3M-Pretrain-595K.
+ 3. Training method: LoRA fine-tuning with DeepSpeed ZeRO-2.
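As a rough guide to how much of the prompt the image occupies, the sketch below counts the patches that a ViT with 14-pixel patches produces from a 336x336 input, which is what the name `clip-vit-large-patch14-336` indicates. It assumes the language model receives one embedding per patch with the CLS token dropped (the default in the `transformers` LLaVA implementation, not confirmed for this checkpoint). The helper name `num_image_patches` is hypothetical.

```python
def num_image_patches(image_size: int = 336, patch_size: int = 14) -> int:
    """Number of non-overlapping square patches a ViT cuts an image into."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size  # 336 // 14 = 24
    return patches_per_side * patches_per_side   # 24 * 24 = 576


print(num_image_patches())  # 576
```

Under that assumption, each image contributes 576 positions to the sequence before any text is added.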
+
+ # Related GitHub repository
+ 1. [https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/train_llava](https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/train_llava)
+
+ # Related Bilibili tutorial video
+
+ 1. To be added
+
+ # Inference code
+
+ ```python
+ from transformers import LlavaForConditionalGeneration, AutoProcessor
+ import torch
+ from PIL import Image
+ ```
+
+ ```python
+ raw_model_name_or_path = "yuanzhoulvpi/llava_qwen15-4b-chat_openai-clip-vit-large-patch14-336"
+ # Load the weights in bfloat16 onto the first GPU
+ model = LlavaForConditionalGeneration.from_pretrained(
+     raw_model_name_or_path, device_map="cuda:0", torch_dtype=torch.bfloat16
+ )
+ processor = AutoProcessor.from_pretrained(raw_model_name_or_path)
+ model.eval()  # switch to inference mode (disables dropout)
+ print('ok')
+ ```
+
+ ```python
+ testdata = (
+     '<image>\nRelay a brief, clear account of the picture shown.',  # question
+     'large kitchen island with an overhang and dining space next to it',  # ground-truth caption
+     'data/liuhaotian/LLaVA-CC3M-Pretrain-595K/images_dl/GCC_train_001899387.jpg',  # image path
+ )
+ ```
+
+ ```python
+ def build_model_input(model, processor, testdata: tuple):
+     # Wrap the question in the tokenizer's chat template
+     messages = [
+         {"role": "system", "content": "You are a helpful assistant."},
+         {"role": "user", "content": testdata[0]},
+     ]
+     prompt = processor.tokenizer.apply_chat_template(
+         messages, tokenize=False, add_generation_prompt=True
+     )
+     # print(prompt)
+     # print("*"*20)
+     image = Image.open(testdata[2])
+     inputs = processor(text=prompt, images=image, return_tensors="pt")
+
+     # Move every input tensor to the model's device
+     for tk in inputs.keys():
+         inputs[tk] = inputs[tk].to(model.device)
+     generate_ids = model.generate(**inputs, max_new_tokens=20)
+
+     # Drop the prompt tokens so only the newly generated tokens remain
+     generate_ids = [
+         oid[len(iids):] for oid, iids in zip(generate_ids, inputs.input_ids)
+     ]
+
+     gen_text = processor.batch_decode(
+         generate_ids, skip_special_tokens=False, clean_up_tokenization_spaces=False
+     )[0]
+     return gen_text
+ ```
+
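To make the `apply_chat_template` step in the function above more concrete, here is a minimal pure-Python sketch of the prompt string it is expected to produce. It assumes the processor's tokenizer uses the ChatML-style template that Qwen1.5 chat models ship with; the template actually stored in this checkpoint may differ, so uncomment `print(prompt)` in the function to check. The helper `render_chatml` is hypothetical and is not part of `transformers`.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages in ChatML form (assumed Qwen1.5 chat format)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "<image>\nRelay a brief, clear account of the picture shown."},
]
prompt = render_chatml(messages)
print(prompt)
```

The `<image>` placeholder stays in the user turn as plain text; the processor later replaces it with the image features.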
+ ```python
+ build_model_input(model, processor, testdata)
+
+ # 'the kitchen is a bright yellow with a glass top island and a large window that looks out to the'
+ ```
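The returned text contains only the model's continuation because `build_model_input` slices the prompt tokens off each generated sequence before decoding. The sketch below reproduces that slicing on plain Python lists; the token ids are made up for illustration and the helper name `strip_prompt_tokens` is hypothetical.

```python
def strip_prompt_tokens(output_ids, input_ids):
    """Remove the leading prompt tokens from each generated sequence."""
    return [out[len(inp):] for out, inp in zip(output_ids, input_ids)]


# Made-up ids: the first three tokens of each output repeat the prompt
input_ids = [[101, 7, 8]]
output_ids = [[101, 7, 8, 42, 43]]
print(strip_prompt_tokens(output_ids, input_ids))  # [[42, 43]]
```

The sample output above also stops mid-sentence, which is consistent with the `max_new_tokens=20` limit passed to `generate`; raising that value should allow a longer caption.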