What's the difference between llava-hf and lmms-lab? Why are the LLaVA-NeXT-Video checkpoints different?
As the title says, why is LLaVA-NeXT-Video-7B under llava-hf pretrained from vicuna-7b-v1.5, while the same-size model under lmms-lab is released with Qwen? Why release different models?
Hope to get your response soon! Thank you!
Best wishes
These are the previous video models; one of them is https://huggingface.co/lmms-lab/LLaVA-NeXT-Video-7B/blob/main/config.json. They were based on Vicuna. The new video models have not yet been converted to HF format.
The general aim of our org is to convert LLaVA models into an HF-compatible format so that they can be loaded directly with `from_pretrained`
and support various generation techniques. I can look into the new model series and convert them soon, thanks!
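For reference, here's a minimal sketch of what loading a converted checkpoint looks like once it's in HF format. The repo id below is an assumption for illustration; substitute whichever converted llava-hf checkpoint you actually use.

```python
def load_llava_next_video(repo_id: str = "llava-hf/LLaVA-NeXT-Video-7B-hf"):
    """Load a converted LLaVA-NeXT-Video checkpoint from the Hub.

    NOTE: the default repo id is an assumption for illustration;
    point it at the converted checkpoint you want to use.
    """
    # Imported lazily so the function can be defined/tested without
    # transformers installed or any weights downloaded.
    from transformers import (
        LlavaNextVideoForConditionalGeneration,
        LlavaNextVideoProcessor,
    )

    processor = LlavaNextVideoProcessor.from_pretrained(repo_id)
    model = LlavaNextVideoForConditionalGeneration.from_pretrained(repo_id)
    return processor, model
```

Once loaded this way, the model plugs into the standard `generate` API, which is what enables the "various generation techniques" mentioned above (beam search, sampling, etc.).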
Thank you for your reply!
@RaushanTurganbay Hi, any plans to support the most up-to-date Llava-Video?
@liyucheng yes, I will add those when I have bandwidth, hopefully some time this month