Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate Paper • 2410.07167 • Published Oct 9 • 37
Visual Context Window Extension: A New Perspective for Long Video Understanding Paper • 2409.20018 • Published Sep 30 • 9
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22 • 88
Mitigating Object Hallucination via Concentric Causal Attention Paper • 2410.15926 • Published Oct 21 • 14
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16 • 30