merve (HF staff) committed
Commit 2886c9a
1 Parent(s): b6fb50e

fix task tag

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -9,7 +9,7 @@ language:
  metrics:
  - accuracy
  library_name: transformers
- pipeline_tag: visual-question-answering
+ pipeline_tag: video-text-to-text
  tags:
  - multimodal large language model
  - large video-language model
@@ -103,4 +103,4 @@ If you find VideoLLaMA useful for your research and applications, please cite us
  year = {2023},
  url = {https://arxiv.org/abs/2306.02858}
  }
- ```
+ ```
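
The first hunk above is a one-line metadata edit. As a rough illustration only, the same change could be scripted along the lines below. This is a minimal sketch, not how the commit was actually made: it assumes the README begins with a YAML front-matter block delimited by `---` lines, and `set_pipeline_tag` is a hypothetical helper name, not part of any library.

```python
import re


def set_pipeline_tag(readme_text: str, new_tag: str) -> str:
    """Return readme_text with the front-matter pipeline_tag set to new_tag.

    Hypothetical helper: assumes the text starts with a '---' delimited
    YAML front-matter block, as model card READMEs on the Hub usually do.
    """
    # Capture everything between the opening and the first closing '---' line.
    match = re.match(r"\A---\n(.*?)\n---\n", readme_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no YAML front matter found")

    front = match.group(1)
    new_line = f"pipeline_tag: {new_tag}"

    # Replace an existing top-level pipeline_tag line; a callable replacement
    # avoids backslash escapes in new_tag being interpreted by re.
    updated, count = re.subn(
        r"^pipeline_tag:.*$", lambda _m: new_line, front, flags=re.MULTILINE
    )
    if count == 0:
        # No tag present yet: append one at the end of the front matter.
        updated = front + "\n" + new_line

    # Splice the edited front matter back in; the body is left untouched.
    return readme_text[: match.start(1)] + updated + readme_text[match.end(1):]


if __name__ == "__main__":
    sample = (
        "---\n"
        "library_name: transformers\n"
        "pipeline_tag: visual-question-answering\n"
        "---\n"
        "# Model card\n"
    )
    print(set_pipeline_tag(sample, "video-text-to-text"))
```

The sketch edits the text with a regular expression instead of parsing the YAML, so key order and formatting elsewhere in the front matter stay exactly as they were, which keeps the resulting diff to a single line.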