Update README.md
Browse files
README.md
CHANGED
@@ -10,7 +10,7 @@ base_model: charactr/vocos-mel-24khz
|
|
10 |
|
11 |
Vocos is a fast neural vocoder designed to synthesize audio waveforms from acoustic features. Trained using a Generative Adversarial Network (GAN) objective, Vocos can generate waveforms in a single forward pass. Unlike other typical GAN-based vocoders, Vocos does not model audio samples in the time domain. Instead, it generates spectral coefficients, facilitating rapid audio reconstruction through inverse Fourier transform.
|
12 |
|
13 |
-
This is a ONNX version of the original mel spectrogram model. The model predicts spectrograms and the ISTFT is performed outside ONNX as ISTFT is still not implemented as an operator in ONNX.
|
14 |
|
15 |
## Usage
|
16 |
|
|
|
10 |
|
11 |
Vocos is a fast neural vocoder designed to synthesize audio waveforms from acoustic features. Trained using a Generative Adversarial Network (GAN) objective, Vocos can generate waveforms in a single forward pass. Unlike other typical GAN-based vocoders, Vocos does not model audio samples in the time domain. Instead, it generates spectral coefficients, facilitating rapid audio reconstruction through inverse Fourier transform.
|
12 |
|
13 |
+
This is a ONNX version of the original 24khz mel spectrogram [model](https://huggingface.co/charactr/vocos-mel-24khz). The model predicts spectrograms and the ISTFT is performed outside ONNX as ISTFT is still not implemented as an operator in ONNX.
|
14 |
|
15 |
## Usage
|
16 |
|