Commit: 8e2af17
Author: zR
Parent(s): b478626

fix

Files changed:
- README.md (+3 -3)
- README_zh.md (+25 -5)
README.md CHANGED
@@ -46,7 +46,7 @@ the [MVBench](https://github.com/OpenGVLab/Ask-Anything), [VideoChatGPT-Bench](h
 and Zero-shot VideoQA datasets (MSVD-QA, MSRVTT-QA, ActivityNet-QA). Where VCG-* refers to the VideoChatGPTBench, ZS-*
 refers to Zero-Shot VideoQA datasets and MV-* refers to main categories in the MVBench.
 
-![Quantitative Evaluation](https://
+![Quantitative Evaluation](https://raw.githubusercontent.com/THUDM/CogVLM2/main/resources/cogvlm2_video_bench.jpeg)
 
 Performance on VideoChatGPT-Bench and Zero-shot VideoQA dataset:
 
@@ -99,9 +99,9 @@ our [github](https://github.com/THUDM/CogVLM2/tree/main/video_demo).
 ## License
 
 This model is released under the
-CogVLM2 [LICENSE](
+CogVLM2 [LICENSE](./LICENSE).
 For models built with Meta Llama 3, please also adhere to
-the [LLAMA3_LICENSE](
+the [LLAMA3_LICENSE](./LLAMA3_LICENSE).
 
 ## Training details
 
README_zh.md CHANGED
@@ -1,10 +1,29 @@
 # CogVLM2-Video-Llama3-Chat
 
-CogVLM2-Video
+CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks and can understand videos of up to one minute.
+We provide two example videos that showcase CogVLM2-Video's video understanding and temporal grounding capabilities.
+
+<table>
+  <tr>
+    <td>
+      <video width="100%" controls>
+        <source src="https://github.com/THUDM/CogVLM2/raw/main/resources/videos/lion.mp4" type="video/mp4">
+      </video>
+    </td>
+    <td>
+      <video width="100%" controls>
+        <source src="https://github.com/THUDM/CogVLM2/raw/main/resources/videos/basketball.mp4" type="video/mp4">
+      </video>
+    </td>
+  </tr>
+</table>
+
+
+The figure below shows the performance of CogVLM2-Video
 on [MVBench](https://github.com/OpenGVLab/Ask-Anything), [VideoChatGPT-Bench](https://github.com/mbzuai-oryx/Video-ChatGPT)
 and the Zero-shot VideoQA datasets (MSVD-QA, MSRVTT-QA, ActivityNet-QA).
 
-![Quantitative Evaluation](https://
+![Quantitative Evaluation](https://raw.githubusercontent.com/THUDM/CogVLM2/main/resources/cogvlm2_video_bench.jpeg)
 
 Here VCG refers to VideoChatGPTBench, ZS refers to the zero-shot VideoQA datasets, and MV-* refers to the main categories in MVBench.
 
@@ -57,9 +76,10 @@ prompt = f"The input consists of a sequence of key frames from a video. Answer t
 
 ## Model License
 
-This model is released under the
-[
-
+This model is released under the
+CogVLM2 [LICENSE](./LICENSE).
+For models built with Meta Llama 3, please also adhere to
+the [LLAMA3_LICENSE](./LLAMA3_LICENSE).
 ## Citation
 
 We will release a technical report soon. Stay tuned.
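
The context line of the second README_zh.md hunk quotes a truncated Python f-string (`prompt = f"The input consists of a sequence of key frames from a video. Answer t`) from the README's usage section. As a hedged illustration only, the sketch below shows one way such a prompt could be assembled from uniformly sampled key frames; the continuation of the truncated string, the `sample_key_frames` helper, and `NUM_FRAMES` are assumptions made for this example, not the repo's actual code (the full demo lives at https://github.com/THUDM/CogVLM2/tree/main/video_demo).

```python
# Illustrative sketch only -- not code from this repository.
# The string continuation, sample_key_frames, and NUM_FRAMES are assumptions.
import numpy as np
from decord import VideoReader  # decord is a common choice for video loading

NUM_FRAMES = 24  # assumed frame count; the real demo may use a different value

def sample_key_frames(path: str, num_frames: int = NUM_FRAMES) -> np.ndarray:
    """Uniformly sample num_frames key frames from the video at path."""
    vr = VideoReader(path)
    indices = np.linspace(0, len(vr) - 1, num_frames).astype(int)
    return vr.get_batch(indices).asnumpy()  # shape: (num_frames, H, W, 3)

question = "What is the lion doing in the video?"
frames = sample_key_frames("lion.mp4")
# Hypothetical completion of the truncated f-string quoted in the hunk header:
prompt = (
    "The input consists of a sequence of key frames from a video. "
    f"Answer the question based on these frames: {question}"
)
```

Uniform temporal sampling keeps the frame count fixed regardless of clip length, which is one common way to bound the model's visual input budget for videos up to about a minute.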