LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token • Paper • 2501.03895 • Published 3 days ago • 37
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution • Paper • 2412.15213 • Published 22 days ago • 25
Information Maximizing Curriculum: A Curriculum-Based Approach for Imitating Diverse Skills • Paper • 2303.15349 • Published Mar 27, 2023
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning • Paper • 2412.12953 • Published 24 days ago • 11
MoDE Collection Collection of pretrained MoDE Diffusion Policies. Variants include finetuned versions for all CALVIN benchmarks and LIBERO 90. • 9 items • Updated 22 days ago