M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper β’ 2411.06176 β’ Published 22 days ago β’ 44
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models β’ 11 items β’ Updated Sep 25 β’ 628
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper β’ 2410.17243 β’ Published Oct 22 β’ 88
Mitigating Object Hallucination via Concentric Causal Attention Paper β’ 2410.15926 β’ Published Oct 21 β’ 14
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper β’ 2410.12787 β’ Published Oct 16 β’ 30
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper β’ 2410.12787 β’ Published Oct 16 β’ 30 β’ 2
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages Paper β’ 2407.19672 β’ Published Jul 29 β’ 55
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper β’ 2407.03320 β’ Published Jul 3 β’ 92
VideoLLaMA 2 Collection Optimized VideoLLaMA with improved spatial-temporal modeling and better audio understanding capability β’ 13 items β’ Updated 18 days ago β’ 21