Tags: Text Generation · Transformers · Safetensors · English · llava · multimodal · conversational · Eval Results · Inference Endpoints
ZhangYuanhan committed
Commit 09f76c2
1 Parent(s): c53413e

Update README.md

Files changed (1):
  1. README.md +4 -1
README.md CHANGED
@@ -1,6 +1,7 @@
 ---
 datasets:
 - lmms-lab/LLaVA-NeXT-Video-SFT-Data
+- lmms-lab/LLaVA-OneVision-Data
 language:
 - en
 library_name: transformers
@@ -112,6 +113,8 @@ model-index:
   value: 70.5
   name: accuracy
   verified: true
+base_model:
+- lmms-lab/llava-onevision-qwen2-7b-si
 ---
 
 
@@ -128,7 +131,7 @@ model-index:
 
 ## Model Summary
 
-The LLaVA-OneVision models are 7/72B parameter models trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
+The LLaVA-Video models are 7/72B parameter models trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
 
 - **Repository:** [LLaVA-VL/LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT?tab=readme-ov-file)
 - **Point of Contact:** [Yuanhan Zhang](https://zhangyuanhan-ai.github.io/)
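For reference, a sketch of how the affected model-card front matter should read after this commit. It includes only the fields visible in the diff above; the model-index block and any other metadata not touched by the commit are elided, not omitted from the actual README.

```yaml
---
datasets:
- lmms-lab/LLaVA-NeXT-Video-SFT-Data
- lmms-lab/LLaVA-OneVision-Data          # added in this commit
language:
- en
library_name: transformers
# ... model-index and other fields unchanged by this commit are elided ...
base_model:
- lmms-lab/llava-onevision-qwen2-7b-si   # added in this commit
---
```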