cemsubakan committed on
Commit
c85db8d
1 Parent(s): 5562002

Create README.md

  1. README.md +76 -0
README.md ADDED
@@ -0,0 +1,76 @@
+ ---
+ language: "en"
+ thumbnail:
+ tags:
+ - Source Separation
+ - Speech Separation
+ - Audio Source Separation
+ - WHAM!
+ - SepFormer
+ - Transformer
+ license: "apache-2.0"
+ datasets:
+ - WHAMR!
+ metrics:
+ - SI-SNRi
+ - SDRi
+
+ ---
+
+ # SepFormer trained on WHAMR! (16 kHz)
+ This repository provides all the necessary tools to perform audio source separation with a [SepFormer](https://arxiv.org/abs/2010.13154v2) model, implemented with SpeechBrain and pretrained on the [WHAMR!](http://wham.whisper.ai/) dataset at a 16 kHz sampling frequency. WHAMR! is a version of the WSJ0-Mix dataset with added environmental noise and reverberation. For a better experience, we encourage you to learn more about [SpeechBrain](https://speechbrain.github.io). The model achieves 13.5 dB SI-SNRi on the WHAMR! test set.
+
+ | Release | Test-Set SI-SNRi | Test-Set SDRi |
+ |:-------------:|:--------------:|:--------------:|
+ | 30-03-21 | 13.5 dB | 13.0 dB |
+
+
+ ## Install SpeechBrain
+
+ First, install SpeechBrain with the following command:
+
+ ```
+ pip install speechbrain
+ ```
+
+ We encourage you to read the tutorials and learn more about [SpeechBrain](https://speechbrain.github.io).
+
+ ### Perform source separation on your own audio file
+
+ ```python
+ from speechbrain.pretrained import SepformerSeparation as separator
+ import torchaudio
+
+ # Download the pretrained model from the Hugging Face Hub and cache it locally
+ model = separator.from_hparams(source="speechbrain/sepformer-whamr-16k", savedir='pretrained_models/sepformer-whamr')
+
+ # For a custom file, change the path
+ est_sources = model.separate_file(path='speechbrain/sepformer-wsj02mix/test_mixture.wav')
+
+ # est_sources has shape [batch, time, sources]; save each estimated source at 16 kHz
+ torchaudio.save("source1hat.wav", est_sources[:, :, 0].detach().cpu(), 16000)
+ torchaudio.save("source2hat.wav", est_sources[:, :, 1].detach().cpu(), 16000)
+ ```
+
+ #### Referencing SpeechBrain
+
+ ```
+ @misc{SB2021,
+ author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Chou, Ju-Chieh and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
+ title = {SpeechBrain},
+ year = {2021},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/speechbrain/speechbrain}},
+ }
+ ```
+
+ #### Referencing SepFormer
+ ```
+ @inproceedings{subakan2021attention,
+ title={Attention is All You Need in Speech Separation},
+ author={Cem Subakan and Mirco Ravanelli and Samuele Cornell and Mirko Bronzi and Jianyuan Zhong},
+ year={2021},
+ booktitle={ICASSP 2021}
+ }
+ ```
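
For readers unfamiliar with the SI-SNRi metric reported in the README above, the following is a minimal pure-Python sketch of how scale-invariant SNR and its improvement over the unprocessed mixture are commonly defined. It is an illustration only, not the evaluation code used to obtain the reported 13.5 dB. The function names `si_snr` and `si_snr_improvement` and the synthetic sine-wave signals are assumptions made for this example.

```python
import math


def _zero_mean(x):
    """Return a copy of x with its mean removed."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    # Project the estimate onto the reference to isolate the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Whatever is left after removing the target component counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain in SI-SNR of the estimate over the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Hypothetical toy signals: two orthogonal sinusoids stand in for two sources.
n_samples = 1600
reference = [math.sin(2 * math.pi * 5 * n / n_samples) for n in range(n_samples)]
interference = [math.sin(2 * math.pi * 13 * n / n_samples) for n in range(n_samples)]
mixture = [r + i for r, i in zip(reference, interference)]
# A pretend separator output that attenuates the interference by 10x in amplitude.
estimate = [r + 0.1 * i for r, i in zip(reference, interference)]

improvement = si_snr_improvement(estimate, mixture, reference)
print(f"SI-SNRi: {improvement:.2f} dB")  # about 20 dB for this toy case
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged, which is why the metric is called scale-invariant.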