Prompt Cache: Modular Attention Reuse for Low-Latency Inference Paper • 2311.04934 • Published Nov 7, 2023
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models Paper • 2405.16537 • Published May 26, 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Paper • 2405.17428 • Published May 27, 2024