zR committed on
Commit 8e2af17
1 Parent(s): b478626
Files changed (2)
  1. README.md +3 -3
  2. README_zh.md +25 -5
README.md CHANGED
@@ -46,7 +46,7 @@ the [MVBench](https://github.com/OpenGVLab/Ask-Anything), [VideoChatGPT-Bench](h
 and Zero-shot VideoQA datasets (MSVD-QA, MSRVTT-QA, ActivityNet-QA). Where VCG-* refers to the VideoChatGPTBench, ZS-*
 refers to Zero-Shot VideoQA datasets and MV-* refers to main categories in the MVBench.

-![Quantitative Evaluation](https://github.com/THUDM/CogVLM2/tree/main/resources/cogvlm2_video_bench.jpeg)
+![Quantitative Evaluation](https://raw.githubusercontent.com/THUDM/CogVLM2/main/resources/cogvlm2_video_bench.jpeg)

 Performance on VideoChatGPT-Bench and Zero-shot VideoQA dataset:

@@ -99,9 +99,9 @@ our [github](https://github.com/THUDM/CogVLM2/tree/main/video_demo).
 ## License

 This model is released under the
-CogVLM2 [LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LICENSE&status=0).
+CogVLM2 [LICENSE](./LICENSE).
 For models built with Meta Llama 3, please also adhere to
-the [LLAMA3_LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LLAMA3_LICENSE&status=0).
+the [LLAMA3_LICENSE](./LLAMA3_LICENSE).

 ## Training details

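Both README diffs replace the GitHub `tree/` page URL for the benchmark figure with the corresponding `raw.githubusercontent.com` URL: the former returns an HTML page, so Markdown image syntax cannot render it, while the latter serves the file bytes directly. A minimal way to check what a given URL actually serves, sketched with only the Python standard library (the helper below is illustrative and not part of either repository):

```python
# Sketch: confirm that the corrected raw.githubusercontent.com URL serves
# image bytes, whereas the old github.com/.../tree/... URL returns text/html.
# Illustrative only; not part of the CogVLM2 repository or model code.
from urllib.request import urlopen

RAW_URL = ("https://raw.githubusercontent.com/THUDM/CogVLM2/"
           "main/resources/cogvlm2_video_bench.jpeg")

def content_type(url: str) -> str:
    """Return the Content-Type reported by the server for `url`."""
    with urlopen(url) as resp:
        return resp.headers.get_content_type()

if __name__ == "__main__":
    # Expected: an image/* type (e.g. image/jpeg) rather than text/html.
    print(content_type(RAW_URL))
```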
README_zh.md CHANGED
@@ -1,10 +1,29 @@
 # CogVLM2-Video-Llama3-Chat

-CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks. The figure below shows the performance of CogVLM2-Video
+CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks and can understand videos up to one minute long.
+We provide two demo videos that showcase CogVLM2-Video's video understanding and temporal grounding capabilities, respectively.
+
+<table>
+  <tr>
+    <td>
+      <video width="100%" controls>
+        <source src="https://github.com/THUDM/CogVLM2/raw/main/resources/videos/lion.mp4" type="video/mp4">
+      </video>
+    </td>
+    <td>
+      <video width="100%" controls>
+        <source src="https://github.com/THUDM/CogVLM2/raw/main/resources/videos/basketball.mp4" type="video/mp4">
+      </video>
+    </td>
+  </tr>
+</table>
+
+
+The figure below shows the performance of CogVLM2-Video
 on [MVBench](https://github.com/OpenGVLab/Ask-Anything), [VideoChatGPT-Bench](https://github.com/mbzuai-oryx/Video-ChatGPT),
 and the Zero-shot VideoQA datasets (MSVD-QA, MSRVTT-QA, ActivityNet-QA).

-![Quantitative Evaluation](https://github.com/THUDM/CogVLM2/tree/main/resources/cogvlm2_video_bench.jpeg)
+![Quantitative Evaluation](https://raw.githubusercontent.com/THUDM/CogVLM2/main/resources/cogvlm2_video_bench.jpeg)

 Here VCG refers to VideoChatGPTBench, ZS refers to the zero-shot VideoQA datasets, and MV-* refers to the main categories in MVBench.

@@ -57,9 +76,10 @@ prompt = f"The input consists of a sequence of key frames from a video. Answer t

 ## Model License

-This model is released under the CogVLM2 [LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LICENSE&status=0). For models built with Meta Llama 3, please also adhere to the
-[LLAMA3_LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LLAMA3_LICENSE&status=0)
-
+This model is released under the
+CogVLM2 [LICENSE](./LICENSE).
+For models built with Meta Llama 3, please also adhere to
+the [LLAMA3_LICENSE](./LLAMA3_LICENSE).
 ## Citation

 We will release a technical report soon. Stay tuned.