Commit abdabef by patrickvonplaten
1 Parent(s): 17b4319
Update README.md
README.md CHANGED
@@ -14,7 +14,9 @@ tags:
 
 [Microsoft's UniSpeech](https://www.microsoft.com/en-us/research/publication/unispeech-unified-speech-representation-learning-with-labeled-and-unlabeled-data/)
 
-The multi-lingual large model pretrained on 16kHz sampled speech audio and phonetic labels. When using the model make sure that your speech input is also sampled at 16kHz and your text in converted into a sequence of phonemes.
+The multi-lingual large model pretrained on 16kHz sampled speech audio and phonetic labels. When using the model make sure that your speech input is also sampled at 16kHz and your text in converted into a sequence of phonemes.
+
+**Note**: This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model **speech recognition**, a tokenizer should be created and the model should be fine-tuned on labeled text data. Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for more in-detail explanation of how to fine-tune the model.
 
 [Paper: UniSpeech: Unified Speech Representation Learning
 with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597)
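
The README text in this diff asks that speech input be sampled at 16kHz. A minimal, dependency-free sketch of bringing a mono signal to that rate is shown here. `resample_linear` and `TARGET_SR` are hypothetical names introduced for illustration, not part of the model repository or any library, and plain linear interpolation stands in for the band-limited resampler an audio library would normally provide.

```python
TARGET_SR = 16_000  # sampling rate the README says the model expects


def resample_linear(samples, orig_sr, target_sr=TARGET_SR):
    """Resample a mono signal by linear interpolation (illustrative only).

    Assumption: `samples` is a sequence of floats for a single channel.
    No anti-aliasing filter is applied, so this is a sketch, not a
    production-quality resampler.
    """
    if orig_sr <= 0 or target_sr <= 0:
        raise ValueError("sampling rates must be positive")
    if orig_sr == target_sr or len(samples) < 2:
        return list(samples)

    # Number of output samples that covers the same duration.
    n_out = max(1, round(len(samples) * target_sr / orig_sr))
    step = orig_sr / target_sr  # input samples advanced per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)   # left neighbour in the input
        hi = min(lo + 1, last)     # right neighbour, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


if __name__ == "__main__":
    # One second of silence recorded at 44.1 kHz becomes 16000 samples.
    one_second = [0.0] * 44_100
    print(len(resample_linear(one_second, 44_100)))
```

For real audio, a dedicated resampler from an audio library is the safer choice, since downsampling without low-pass filtering can introduce aliasing.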