LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models • Paper • 2311.17043 • Published Nov 28, 2023