jbetker/tortoise-tts-v2

jbetker committed on May 3, 2022

Commit a4cda68 • 1 parent: f499d66

getting ready for 2.1 release

Files changed (2)

README.md +20 -2
tortoise/api.py +1 -1

README.md CHANGED

@@ -7,6 +7,15 @@ Tortoise is a text-to-speech program built with the following priorities:
 This repo contains all the code needed to run Tortoise TTS in inference mode.
+### New features
+#### v2.1; 2022/5/2
+- Added ability to produce totally random voices.
+- Added ability to download voice conditioning latent via a script, and then use a user-provided conditioning latent.
+- Added ability to use your own pretrained models.
+- Refactored directory structures.
+- Performance improvements & bug fixes.
 ## What's in a name?
 I'm naming my speech-related repos after Mojave desert flora and fauna. Tortoise is a bit tongue in cheek: this model
@@ -38,7 +47,7 @@ pip install -r requirements.txt
 This script allows you to speak a single phrase with one or more voices.
 ```shell
-python do_tts.py --text "I'm going to speak this" --voice dotrice --preset fast
+python do_tts.py --text "I'm going to speak this" --voice random --preset fast
 ```
 ### read.py
@@ -46,7 +55,7 @@ python do_tts.py --text "I'm going to speak this" --voice dotrice --preset fast
 This script provides tools for reading large amounts of text.
 ```shell
-python read.py --textfile <your text to be read> --voice dotrice
+python read.py --textfile <your text to be read> --voice random
 ```
 This will break up the textfile into sentences, and then convert them to speech one at a time. It will output a series
@@ -72,6 +81,15 @@ Tortoise was specifically trained to be a multi-speaker model. It accomplishes t
 These reference clips are recordings of a speaker that you provide to guide speech generation. These clips are used to determine many properties of the output, such as the pitch and tone of the voice, speaking speed, and even speaking defects like a lisp or stuttering. The reference clip is also used to determine non-voice related aspects of the audio output like volume, background noise, recording quality and reverb.
+### Random voice
+I've included a feature which randomly generates a voice. These voices don't actually exist and will be random every time you run
+it. The results are quite fascinating and I recommend you play around with it!
+You can use the random voice by passing in 'random' as the voice name. Tortoise will take care of the rest.
+For the those in the ML space: this is created by projecting a random vector onto the voice conditioning latent space.
 ### Provided voices
 This repo comes with several pre-packaged voices. You will be familiar with many of them. :)
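
Reviewer note on the "Random voice" addition: the README says the voice is created by projecting a random vector onto the voice conditioning latent space. The sketch below only illustrates that idea and is not Tortoise's implementation. The dimensions, the fixed linear projection, and the names `make_projection` and `random_voice_latent` are all assumptions made for illustration; the real model may well use a learned, non-linear mapping.

```python
import random

# Illustrative sizes only; these are NOT Tortoise's real dimensions.
NOISE_DIM = 4
LATENT_DIM = 8


def make_projection(seed=0):
    """Build a fixed LATENT_DIM x NOISE_DIM matrix (hypothetical stand-in
    for whatever mapping the model actually uses)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
            for _ in range(LATENT_DIM)]


def random_voice_latent(projection, rng):
    """Sample a Gaussian noise vector and project it into the latent space."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
    return [sum(w * z for w, z in zip(row, noise)) for row in projection]


projection = make_projection()
rng = random.Random()  # unseeded: a different voice latent on each call

latent_a = random_voice_latent(projection, rng)
latent_b = random_voice_latent(projection, rng)
```

Because the noise vector is resampled on every call, `latent_a` and `latent_b` differ, which matches the README's statement that the voice is random every time you run it. Seeding `rng` would make a sampled voice reproducible in this sketch.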

tortoise/api.py CHANGED

@@ -165,7 +165,7 @@ class TextToSpeech:
     Main entry point into Tortoise.
     """
-    def __init__(self, autoregressive_batch_size=16, models_dir='.models', enable_redaction=True):
+    def __init__(self, autoregressive_batch_size=16, models_dir='.models', enable_redaction=False):
         """
         Constructor
         :param autoregressive_batch_size: Specifies how many samples to generate per batch. Lower this if you are seeing
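
Reviewer note on the api.py hunk: the only change is the default value of `enable_redaction`, which flips from `True` to `False`. The sketch below shows what that means for callers. It uses a hypothetical stand-in class, `TextToSpeechStandIn`, that copies just the constructor signature from the diff and performs none of the real model loading, so it says nothing about what redaction itself does.

```python
class TextToSpeechStandIn:
    """Hypothetical stand-in that mirrors only the new constructor signature."""

    def __init__(self, autoregressive_batch_size=16, models_dir='.models',
                 enable_redaction=False):
        # Store the arguments; the real class would load models here.
        self.autoregressive_batch_size = autoregressive_batch_size
        self.models_dir = models_dir
        self.enable_redaction = enable_redaction


# Callers that rely on the default now get redaction turned off.
default_tts = TextToSpeechStandIn()

# Callers that want the previous default must request it explicitly.
explicit_tts = TextToSpeechStandIn(enable_redaction=True)
```

Code that constructed `TextToSpeech()` with no arguments before this commit would therefore see a behavior change, while code that already passed `enable_redaction` explicitly would not.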