video LM - a poonyZ Collection

updated 3 days ago

StreamChat: Chatting with Streaming Video

Paper • 2412.08646 • Published 15 days ago • 17

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation

Paper • 2412.04432 • Published 21 days ago • 14

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Paper • 2412.00927 • Published 25 days ago • 26

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 14 days ago • 91

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 13 days ago • 131

VidTok: A Versatile and Open-Source Video Tokenizer

Paper • 2412.13061 • Published 9 days ago • 7