MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs • Paper • 2411.15296 • Published 5 days ago • 15
Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration • Paper • 2411.17686 • Published 1 day ago • 14
SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation • Paper • 2411.14525 • Published 6 days ago • 18
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline • Paper • 2411.12814 • Published 8 days ago • 20
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI • Paper • 2411.14522 • Published 6 days ago • 29
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models • Paper • 2411.14982 • Published 6 days ago • 13
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions • Paper • 2411.14405 • Published 6 days ago • 51
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization • Paper • 2411.10442 • Published 12 days ago • 61
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models • Paper • 2411.14257 • Published 7 days ago • 9
Stable Flow: Vital Layers for Training-Free Image Editing • Paper • 2411.14430 • Published 6 days ago • 13
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models • Paper • 2411.14432 • Published 6 days ago • 19
Multimodal Autoregressive Pre-training of Large Vision Encoders • Paper • 2411.14402 • Published 6 days ago • 37
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training • Paper • 2411.13476 • Published 8 days ago • 13
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization • Paper • 2411.11909 • Published 11 days ago • 20
Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering • Paper • 2411.09213 • Published 14 days ago • 6