khang119966 committed on
Commit 01d21d9 · verified · 1 Parent(s): fd0ddc6

Update README.md

Files changed (1)
  1. README.md +16 -39
README.md CHANGED
@@ -10,16 +10,19 @@ base_model:
10
  pipeline_tag: image-text-to-text
11
  ---
12
 
13
-
14
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/-G297bBqMzYvTbD6_Bkd9.png)
15
-
16
 
17
  # Vintern-1B-v3.5 ❄️ (Viet-InternVL2-1B-v3.5) - The Ultimate Multimodal Solution 🌏
18
  We introduce **Vintern-1B-v3.5**, the latest version in the Vintern series, offering significant improvements over v2 across all evaluation benchmarks. This model has been fine-tuned from **InternVL-1B-2.5**, which is already strong on Vietnamese tasks because the InternVL 2.5 [1] team used [Viet-ShareGPT-4o-Text-VQA](https://huggingface.co/datasets/5CD-AI/Viet-ShareGPT-4o-Text-VQA) data during its fine-tuning process.
19
 
20
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/a1V1DA1o4Gf_MJblWTz-L.png)
21
-
22
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/36jb5bgyYCoVKx3NE8Iuv.png)
 
 
 
23
 
24
 
25
  To further enhance its performance in Vietnamese while maintaining robust capabilities on existing English datasets, **Vintern-1B-v3.5** has been fine-tuned using a vast amount of Vietnamese-specific data. This results in a model that is exceptionally powerful in text recognition, OCR, and understanding Vietnam-specific documents.
@@ -41,7 +44,9 @@ The model can be customized for specific tasks with minimal effort.
41
 
42
  ## Benchmarks 📈
43
 
44
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/DrUCZuXuMz47uVU4zqnJ4.png)
 
 
45
 
46
  | Benchmark | InternVL2_5 1B | Vintern-1B-v2 | Vintern-1B-v3.5 |
47
  |:-------------:|:--------------:|:-------------:|:---------------:|
@@ -57,54 +62,26 @@ The model can be customized for specific tasks with minimal effort.
57
  ## Examples
58
 
59
  <div align="center">
60
- <img src="ex_images/1.png" width="500"/>
61
  </div>
62
 
63
- ```
64
-
65
- ```
66
-
67
  <div align="center">
68
- <img src="ex_images/4.jpg" width="500"/>
69
  </div>
70
 
71
- ```
72
 
73
- ```
74
 
75
  <div align="center">
76
- <img src="ex_images/2.jpg" width="500"/>
77
  </div>
78
 
79
- ```
80
-
81
- ```
82
-
83
- <div align="center">
84
- <img src="ex_images/3.png" width="400"/>
85
- </div>
86
-
87
- ```
88
-
89
- ```
90
-
91
  <div align="center">
92
- <img src="ex_images/5.jpg" width="400"/>
93
  </div>
94
 
95
- ```
96
-
97
- ```
98
 
99
- <div align="center">
100
- <img src="ex_images/6.png" width="400"/>
101
- </div>
102
 
103
 
104
- ```
105
-
106
- ```
107
-
108
  ## Quickstart
109
 
110
  Here is a code snippet showing how to load the tokenizer and model and how to generate content.
 
10
  pipeline_tag: image-text-to-text
11
  ---
12
 
13
+ <div align="center">
14
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/-G297bBqMzYvTbD6_Bkd9.png" width="500"/>
15
+ </div>
16
 
17
  # Vintern-1B-v3.5 ❄️ (Viet-InternVL2-1B-v3.5) - The Ultimate Multimodal Solution 🌏
18
  We introduce **Vintern-1B-v3.5**, the latest version in the Vintern series, offering significant improvements over v2 across all evaluation benchmarks. This model has been fine-tuned from **InternVL-1B-2.5**, which is already strong on Vietnamese tasks because the InternVL 2.5 [1] team used [Viet-ShareGPT-4o-Text-VQA](https://huggingface.co/datasets/5CD-AI/Viet-ShareGPT-4o-Text-VQA) data during its fine-tuning process.
19
 
20
+ <div align="center">
21
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/a1V1DA1o4Gf_MJblWTz-L.png" width="500"/>
22
+ </div>
23
+ <div align="center">
24
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/36jb5bgyYCoVKx3NE8Iuv.png" width="500"/>
25
+ </div>
26
 
27
 
28
  To further enhance its performance in Vietnamese while maintaining robust capabilities on existing English datasets, **Vintern-1B-v3.5** has been fine-tuned using a vast amount of Vietnamese-specific data. This results in a model that is exceptionally powerful in text recognition, OCR, and understanding Vietnam-specific documents.
 
44
 
45
  ## Benchmarks 📈
46
 
47
+ <div align="center">
48
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/DrUCZuXuMz47uVU4zqnJ4.png" width="500"/>
49
+ </div>
50
 
51
  | Benchmark | InternVL2_5 1B | Vintern-1B-v2 | Vintern-1B-v3.5 |
52
  |:-------------:|:--------------:|:-------------:|:---------------:|
 
62
  ## Examples
63
 
64
  <div align="center">
65
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/1yos0APs6laTCAGhUbN9n.png" width="300"/>
66
  </div>
67
 
 
 
 
 
68
  <div align="center">
69
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/L5n35_3sz_Wp9fo0C7snq.png" width="300"/>
70
  </div>
71
 
 
72
 
 
73
 
74
  <div align="center">
75
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/E6aqBwFqK38XE1LL9lF2W.png" width="500"/>
76
  </div>
77
 
 
 
 
 
 
 
 
 
 
 
 
 
78
  <div align="center">
79
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/Lkt8YLYlDP_VByFjFQX_t.png" width="500"/>
80
  </div>
81
 
 
 
 
82
 
 
 
 
83
 
84
 
 
 
 
 
85
  ## Quickstart
86
 
87
  Here is a code snippet showing how to load the tokenizer and model and how to generate content.