Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration • Paper • 2411.17686 • Published 6 days ago • 18
PiTe: Pixel-Temporal Alignment for Large Video-Language Model • Paper • 2409.07239 • Published Sep 11 • 11
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference • Paper • 2403.14520 • Published Mar 21 • 33