Commit: 8e2af17
Author: zR
Parent(s): b478626

fix

Files changed:
- README.md (+3 -3)
- README_zh.md (+25 -5)
README.md CHANGED
@@ -46,7 +46,7 @@ the [MVBench](https://github.com/OpenGVLab/Ask-Anything), [VideoChatGPT-Bench](h
 and Zero-shot VideoQA datasets (MSVD-QA, MSRVTT-QA, ActivityNet-QA). Where VCG-* refers to the VideoChatGPTBench, ZS-*
 refers to Zero-Shot VideoQA datasets and MV-* refers to main categories in the MVBench.
 
-![Quantitative Evaluation](https://
+![Quantitative Evaluation](https://raw.githubusercontent.com/THUDM/CogVLM2/main/resources/cogvlm2_video_bench.jpeg)
 
 Performance on VideoChatGPT-Bench and Zero-shot VideoQA dataset:
 
@@ -99,9 +99,9 @@ our [github](https://github.com/THUDM/CogVLM2/tree/main/video_demo).
 ## License
 
 This model is released under the
-CogVLM2 [LICENSE](
+CogVLM2 [LICENSE](./LICENSE).
 For models built with Meta Llama 3, please also adhere to
-the [LLAMA3_LICENSE](
+the [LLAMA3_LICENSE](./LLAMA3_LICENSE).
 
 ## Training details
 
README_zh.md CHANGED
@@ -1,10 +1,29 @@
 # CogVLM2-Video-Llama3-Chat
 
-CogVLM2-Video
+CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks and can understand videos of up to one minute.
+We provide two example videos that showcase CogVLM2-Video's video understanding and temporal grounding capabilities.
+
+<table>
+  <tr>
+    <td>
+      <video width="100%" controls>
+        <source src="https://github.com/THUDM/CogVLM2/raw/main/resources/videos/lion.mp4" type="video/mp4">
+      </video>
+    </td>
+    <td>
+      <video width="100%" controls>
+        <source src="https://github.com/THUDM/CogVLM2/raw/main/resources/videos/basketball.mp4" type="video/mp4">
+      </video>
+    </td>
+  </tr>
+</table>
+
+
+The figure below shows the performance of CogVLM2-Video
 on [MVBench](https://github.com/OpenGVLab/Ask-Anything), [VideoChatGPT-Bench](https://github.com/mbzuai-oryx/Video-ChatGPT)
 and the Zero-shot VideoQA datasets (MSVD-QA, MSRVTT-QA, ActivityNet-QA).
 
-![Quantitative Evaluation](https://
+![Quantitative Evaluation](https://raw.githubusercontent.com/THUDM/CogVLM2/main/resources/cogvlm2_video_bench.jpeg)
 
 Here VCG refers to VideoChatGPTBench, ZS refers to the zero-shot VideoQA datasets, and MV-* refers to the main categories in MVBench.
 
@@ -57,9 +76,10 @@ prompt = f"The input consists of a sequence of key frames from a video. Answer t
 
 ## Model License
 
-This model is released under the
-[
-
+This model is released under the
+CogVLM2 [LICENSE](./LICENSE).
+For models built with Meta Llama 3, please also adhere to
+the [LLAMA3_LICENSE](./LLAMA3_LICENSE).
 ## Citation
 
 We will release a technical report soon. Stay tuned.
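
The context line of the second README_zh.md hunk quotes a truncated Python f-string (`prompt = f"The input consists of a sequence of key frames from a video. Answer t`) from the README's usage section. As a hedged illustration only, the sketch below shows one way such a prompt could be assembled from uniformly sampled key frames; the continuation of the truncated string, the `sample_key_frames` helper, and `NUM_FRAMES` are assumptions made for this example, not the repo's actual code (the full demo lives at https://github.com/THUDM/CogVLM2/tree/main/video_demo).

```python
# Illustrative sketch only -- not code from this repository.
# The string continuation, sample_key_frames, and NUM_FRAMES are assumptions.
import numpy as np
from decord import VideoReader  # decord is a common choice for video loading

NUM_FRAMES = 24  # assumed frame count; the real demo may use a different value

def sample_key_frames(path: str, num_frames: int = NUM_FRAMES) -> np.ndarray:
    """Uniformly sample num_frames key frames from the video at path."""
    vr = VideoReader(path)
    indices = np.linspace(0, len(vr) - 1, num_frames).astype(int)
    return vr.get_batch(indices).asnumpy()  # shape: (num_frames, H, W, 3)

question = "What is the lion doing in the video?"
frames = sample_key_frames("lion.mp4")
# Hypothetical completion of the truncated f-string quoted in the hunk header:
prompt = (
    "The input consists of a sequence of key frames from a video. "
    f"Answer the question based on these frames: {question}"
)
```

Uniform temporal sampling keeps the frame count fixed regardless of clip length, which is one common way to bound the model's visual input budget for videos up to about a minute.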