MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models Paper • 2410.09733 • Published 26 days ago • 8
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality Paper • 2410.05210 • Published Oct 7 • 10