omni - a poonyZ Collection

omni

updated 1 day ago

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published 7 days ago • 32

From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities

Paper • 2410.02155 • Published Oct 3, 2024 • 2

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

Paper • 2501.04561 • Published 2 days ago • 15