Perception Tokens Enhance Visual Reasoning in Multimodal Language Models • Paper • 2412.03548 • Published 24 days ago • 16
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment • Paper • 2411.17188 • Published Nov 26 • 21
MLP-KAN: Unifying Deep Representation and Function Learning • Paper • 2410.03027 • Published Oct 3 • 29