- GLaMM: Pixel Grounding Large Multimodal Model
  Paper • 2311.03356 • Published • 33
- SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models
  Paper • 2311.07575 • Published • 13
- CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
  Paper • 2311.03354 • Published • 4
- Language-Informed Visual Concept Learning
  Paper • 2312.03587 • Published • 5
Collections including paper arxiv:2311.03354
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 143
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 27
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 20
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 64
- GLaMM: Pixel Grounding Large Multimodal Model
  Paper • 2311.03356 • Published • 33
- CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
  Paper • 2311.03354 • Published • 4
- CogVLM: Visual Expert for Pretrained Language Models
  Paper • 2311.03079 • Published • 23
- UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework
  Paper • 2311.10125 • Published • 4
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 28
- GLaMM: Pixel Grounding Large Multimodal Model
  Paper • 2311.03356 • Published • 33
- CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
  Paper • 2311.03354 • Published • 4