LLM Multimodal - a ubuntu16 Collection

updated about 16 hours ago

On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published 5 days ago • 23

Open-Sora Plan: Open-Source Large Video Generation Model

Paper • 2412.00131 • Published 6 days ago • 25

X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models

Paper • 2412.01824 • Published 2 days ago • 52