Heeseung Kim

KHS

gmltmd789

AI & ML interests

Speech Synthesis

KHS's activity

authored 13 papers 4 days ago

PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior

Paper • 2106.06406 • Published Jun 11, 2021

UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data

Paper • 2306.16083 • Published Jun 28, 2023

Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare Token Embeddings

Paper • 2109.03127 • Published Sep 7, 2021

Unified Speech-Text Pretraining for Spoken Dialog Modeling

Paper • 2402.05706 • Published Feb 8 • 6

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2 • 20

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Paper • 2411.15466 • Published 10 days ago • 33

Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance

Paper • 2111.11755 • Published Nov 23, 2021

Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data

Paper • 2205.15370 • Published May 30, 2022

Edit-A-Video: Single Video Editing with Object-Aware Consistency

Paper • 2303.07945 • Published Mar 14, 2023

Stein Latent Optimization for Generative Adversarial Networks

Paper • 2106.05319 • Published Jun 9, 2021

VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech

Paper • 2408.14739 • Published Aug 27

VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance

Paper • 2409.15759 • Published Sep 24 • 1

NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers

Paper • 2409.15760 • Published Sep 24 • 1

upvoted a paper 9 days ago

Style-Friendly SNR Sampler for Style-Driven Generation

Paper • 2411.14793 • Published 11 days ago • 35