khang119966
committed on
Update README.md
README.md
CHANGED
@@ -10,16 +10,19 @@ base_model:
 pipeline_tag: image-text-to-text
 ---
 
-
-
-
+<div align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/-G297bBqMzYvTbD6_Bkd9.png" width="500"/>
+</div>
 
 # Vintern-1B-v3.5 (Viet-InternVL2-1B-v3.5) - The Ultimate Multimodal Solution
 We introduce **Vintern-1B-v3.5**, the latest version in the Vintern series, offering significant improvements over v2 across all evaluation benchmarks. This model has been fine-tuned from **InternVL-1B-2.5**, which already performs well on Vietnamese tasks because the InternVL 2.5 [1] team used the [Viet-ShareGPT-4o-Text-VQA](https://huggingface.co/datasets/5CD-AI/Viet-ShareGPT-4o-Text-VQA) dataset during its fine-tuning.
 
-
-
-
+<div align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/a1V1DA1o4Gf_MJblWTz-L.png" width="500"/>
+</div>
+<div align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/36jb5bgyYCoVKx3NE8Iuv.png" width="500"/>
+</div>
 
 
 To further enhance its performance in Vietnamese while maintaining robust capabilities on existing English datasets, **Vintern-1B-v3.5** has been fine-tuned on a large amount of Vietnamese-specific data. The result is a model that is particularly strong at text recognition, OCR, and understanding Vietnam-specific documents.
@@ -41,7 +44,9 @@ The model can be customized for specific tasks with minimal effort.
 
 ## Benchmarks
 
-
+<div align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/DrUCZuXuMz47uVU4zqnJ4.png" width="500"/>
+</div>
 
 | Benchmark | InternVL2_5 1B | Vintern-1B-v2 | Vintern-1B-v3.5 |
 |:-------------:|:--------------:|:-------------:|:---------------:|
@@ -57,54 +62,26 @@ The model can be customized for specific tasks with minimal effort.
 ## Examples
 
 <div align="center">
-<img src="
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/1yos0APs6laTCAGhUbN9n.png" width="300"/>
 </div>
 
-```
-
-```
-
 <div align="center">
-<img src="
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/L5n35_3sz_Wp9fo0C7snq.png" width="300"/>
 </div>
 
-```
 
-```
 
 <div align="center">
-<img src="
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/E6aqBwFqK38XE1LL9lF2W.png" width="500"/>
 </div>
 
-```
-
-```
-
-<div align="center">
-<img src="ex_images/3.png" width="400"/>
-</div>
-
-```
-
-```
-
 <div align="center">
-<img src="
+<img src="https://cdn-uploads.huggingface.co/production/uploads/6336b5c831efcb5647f00170/Lkt8YLYlDP_VByFjFQX_t.png" width="500"/>
 </div>
 
-```
-
-```
 
-<div align="center">
-<img src="ex_images/6.png" width="400"/>
-</div>
 
 
-```
-
-```
-
 ## Quickstart
 
 The following code snippet shows how to load the tokenizer and model and how to generate content.