Music Generation - a kaizuberbuehler Collection

kaizuberbuehler 's Collections

Image Generation

Vision Language Models

Foundation Models

Synthetic Data and Self-Improvement

Agents

Video Generation

LM Prompt Engineering

LM Capabilities and Scaling

Music Generation

LM Architectures

Code Generation

Speech Synthesis

EXL2 Quantized Models

Music Generation

updated Oct 25

Long-form music generation with latent diffusion

Paper • 2404.10301 • Published Apr 16 • 24
MuPT: A Generative Symbolic Music Pretrained Transformer

Paper • 2404.06393 • Published Apr 9 • 14
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

Paper • 2404.09956 • Published Apr 15 • 11
Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation

Paper • 2406.10970 • Published Jun 16 • 1
FLUX that Plays Music

Paper • 2409.00587 • Published Sep 1 • 31
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.10819 • Published Sep 17 • 17
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Paper • 2409.09214 • Published Sep 13 • 47
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Paper • 2409.18042 • Published Sep 26 • 36
MuCodec: Ultra Low-Bitrate Music Codec

Paper • 2409.13216 • Published Sep 20 • 22
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

Paper • 2407.03648 • Published Jul 4 • 17