Add PESQ 3.15 model

Files changed (3)

README.md +76 -0
enhance_model.ckpt +0 -0
hyperparams.yaml +40 -0

README.md ADDED

	@@ -0,0 +1,76 @@

+---
+language: "en"
+tags:
+- Speech Enhancement
+- PyTorch
+license: "apache-2.0"
+datasets:
+- Voicebank
+- DEMAND
+metrics:
+- PESQ
+- STOI
+---
+# MetricGAN-trained model for Enhancement
+This repository provides all the tools needed to perform speech enhancement with
+SpeechBrain. For a better experience, we encourage you to learn more about
+[SpeechBrain](https://speechbrain.github.io). The model's performance on the test set is:
+| Release | Test PESQ | Test STOI |
+|:-----------:|:-----:| :-----:|
+| 21-04-27 | 3.15 | 93.0 |
+## Install SpeechBrain
+First of all, please install SpeechBrain with the following command:
+```
+pip install speechbrain
+```
+Please note that we encourage you to read our tutorials and learn more about
+[SpeechBrain](https://speechbrain.github.io).
+## Pretrained Usage
+To use the MetricGAN+-trained model for enhancement, use the following code:
+```python
+from speechbrain.pretrained import SpectralMaskEnhancement
+enhance_model = SpectralMaskEnhancement.from_hparams(
+    source="speechbrain/metricgan-plus-voicebank",
+    savedir="pretrained_models/metricgan-plus-voicebank",
+)
+enhance_model.enhance_file("/path/to/file.wav")
+```
+## Referencing MetricGAN+
+If you find MetricGAN+ useful, please cite:
+```
+@article{fu2021metricgan+,
+  title={MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement},
+  author={Fu, Szu-Wei and Yu, Cheng and Hsieh, Tsun-An and Plantinga, Peter and Ravanelli, Mirco and Lu, Xugang and Tsao, Yu},
+  journal={arXiv preprint arXiv:2104.03538},
+  year={2021}
+}
+```
+## Referencing SpeechBrain
+If you find SpeechBrain useful, please cite:
+```
+@misc{SB2021,
+author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua },
+title = {SpeechBrain},
+year = {2021},
+publisher = {GitHub},
+journal = {GitHub repository},
+howpublished = {\url{https://github.com/speechbrain/speechbrain}},
+}
+```

enhance_model.ckpt ADDED

Binary file (7.59 MB).

hyperparams.yaml ADDED

	@@ -0,0 +1,40 @@

+# STFT parameters
+sample_rate: 16000
+win_length: 32  # in milliseconds
+hop_length: 16  # in milliseconds
+n_fft: 512
+window_fn: !name:torch.hamming_window
+compute_stft: !new:speechbrain.processing.features.STFT
+    sample_rate: !ref <sample_rate>
+    n_fft: !ref <n_fft>
+    win_length: !ref <win_length>
+    hop_length: !ref <hop_length>
+    window_fn: !ref <window_fn>
+compute_istft: !new:speechbrain.processing.features.ISTFT
+    sample_rate: !ref <sample_rate>
+    n_fft: !ref <n_fft>
+    win_length: !ref <win_length>
+    hop_length: !ref <hop_length>
+    window_fn: !ref <window_fn>
+spectral_magnitude: !name:speechbrain.processing.features.spectral_magnitude
+    power: 0.5
+resynth: !name:speechbrain.processing.signal_processing.resynthesize
+    stft: !ref <compute_stft>
+    istft: !ref <compute_istft>
+enhance_model: !new:speechbrain.lobes.models.MetricGAN.EnhancementGenerator
+    input_size: !ref <n_fft> // 2 + 1
+    hidden_size: 200
+    num_layers: 2
+    dropout: 0
+modules:
+    enhance_model: !ref <enhance_model>
+pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
+    loadables:
+        enhance_model: !ref <enhance_model>
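
A note for readers checking the STFT settings in `hyperparams.yaml`: SpeechBrain's `STFT` takes `win_length` and `hop_length` in milliseconds, so the sample counts and the generator's `input_size` follow from simple arithmetic. The sketch below reproduces that arithmetic in plain Python. The helper name `derive_stft_sizes` is hypothetical and not part of SpeechBrain; it only illustrates how the values in this config relate to each other.

```python
def derive_stft_sizes(sample_rate, win_length_ms, hop_length_ms, n_fft):
    """Convert millisecond STFT settings to sample counts (illustrative helper)."""
    samples_per_ms = sample_rate / 1000.0
    win_samples = int(round(samples_per_ms * win_length_ms))
    hop_samples = int(round(samples_per_ms * hop_length_ms))
    # A real-input FFT of size n_fft yields n_fft // 2 + 1 frequency bins,
    # the same expression the config uses for the generator's input_size.
    freq_bins = n_fft // 2 + 1
    return win_samples, hop_samples, freq_bins


# Values taken from hyperparams.yaml in this commit.
sizes = derive_stft_sizes(
    sample_rate=16000, win_length_ms=32, hop_length_ms=16, n_fft=512
)
print(sizes)  # (512, 256, 257)
```

With these settings the window spans exactly `n_fft` samples, consecutive frames overlap by half a window, and the generator receives 257 magnitude bins per frame.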