Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2311.00945

a collection of text to speech papers.

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Paper • 2402.08093 • Published Feb 12 • 54
E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14
Matcha-TTS: A fast TTS architecture with conditional flow matching

Paper • 2309.03199 • Published Sep 6, 2023 • 11
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

Paper • 1712.05884 • Published Dec 16, 2017 • 2

E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14

E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14

Random Field Augmentations for Self-Supervised Representation Learning

Paper • 2311.03629 • Published Nov 7, 2023 • 6
Levels of AGI: Operationalizing Progress on the Path to AGI

Paper • 2311.02462 • Published Nov 4, 2023 • 33
Idempotent Generative Network

Paper • 2311.01462 • Published Nov 2, 2023 • 24
E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14

Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 40
Data Filtering Networks

Paper • 2309.17425 • Published Sep 29, 2023 • 6
FlashDecoding++: Faster Large Language Model Inference on GPUs

Paper • 2311.01282 • Published Nov 2, 2023 • 35
E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14

E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14
SoundCam: A Dataset for Finding Humans Using Room Acoustics

Paper • 2311.03517 • Published Nov 6, 2023 • 10

E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 82

Diffusion Model

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

Paper • 2309.03895 • Published Sep 7, 2023 • 13
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

Paper • 2309.16650 • Published Sep 28, 2023 • 10
CCEdit: Creative and Controllable Video Editing via Diffusion Models

Paper • 2309.16496 • Published Sep 28, 2023 • 9
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling

Paper • 2310.15169 • Published Oct 23, 2023 • 9

Large-Scale Automatic Audiobook Creation

Paper • 2309.03926 • Published Sep 7, 2023 • 53
FoleyGen: Visually-Guided Audio Generation

Paper • 2309.10537 • Published Sep 19, 2023 • 8
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24
UniAudio: An Audio Foundation Model Toward Universal Audio Generation

Paper • 2310.00704 • Published Oct 1, 2023 • 19

Previous
1
2
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs