sdelangen committed
Commit 320f41b
1 Parent(s): d0f351e

Update README.md

Files changed (1): README.md +3 -0
README.md CHANGED
@@ -80,6 +80,9 @@ With streaming, the results with different chunk sizes on test-clean are the following:
  This ASR system is a Conformer model trained with the RNN-T loss (with an auxiliary CTC loss to stabilize training). The model operates with a unigram tokenizer.
  Architecture details are described in the [training hyperparameters file](https://github.com/speechbrain/speechbrain/blob/develop/recipes/LibriSpeech/ASR/transducer/hparams/conformer_transducer.yaml).

+ Streaming support makes use of Dynamic Chunk Training. Chunked attention is used for the multi-head attention module, and an implementation of [Dynamic Chunk Convolutions](https://www.amazon.science/publications/dynamic-chunk-convolution-for-unified-streaming-and-non-streaming-conformer-asr) was used.
+ The model was trained with support for different chunk sizes (and even full context), so it is suitable for various chunk sizes and for offline transcription.
+
  The system is trained with recordings sampled at 16kHz (single channel).
  The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling `transcribe_file` if needed.
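The chunked attention mentioned in the added lines restricts each frame to attending within its own chunk plus a limited number of left-context chunks. The following is a minimal illustrative sketch of such a mask, not SpeechBrain's actual implementation; the function name and parameters are hypothetical:

```python
# Illustrative sketch of a chunked attention mask as used in Dynamic Chunk
# Training: frame i may attend to frame j only when j lies in i's chunk or
# in one of the allowed left-context chunks. Hypothetical helper, not the
# SpeechBrain API.

def chunked_attention_mask(n_frames, chunk_size, left_chunks):
    """Return a boolean matrix: mask[i][j] is True if frame i can attend to frame j."""
    mask = [[False] * n_frames for _ in range(n_frames)]
    for i in range(n_frames):
        chunk_i = i // chunk_size
        start = max(0, (chunk_i - left_chunks) * chunk_size)  # left-context limit
        end = min(n_frames, (chunk_i + 1) * chunk_size)       # end of own chunk
        for j in range(start, end):
            mask[i][j] = True
    return mask

# Example: 6 frames, chunks of 2, one left-context chunk.
# Frame 4 (chunk 2) sees chunks 1 and 2 (frames 2..5) but not frames 0..1.
mask = chunked_attention_mask(n_frames=6, chunk_size=2, left_chunks=1)
```

Because the mask never looks past the end of the current chunk, the model can emit output as soon as each chunk of audio arrives, which is what makes streaming inference possible.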
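The normalization that `transcribe_file` applies (mono channel selection plus resampling to the model's 16 kHz rate) can be sketched as below. This is an illustrative approximation using channel averaging and linear interpolation, not SpeechBrain's actual resampler:

```python
# Hedged sketch of the audio normalization described above: collapse to a
# single channel, then resample to 16 kHz. Illustration only; SpeechBrain
# performs this internally when transcribe_file is called.

def to_mono(channels):
    """Average multi-channel audio (list of per-channel sample lists) to mono."""
    return [sum(samples) / len(samples) for samples in zip(*channels)]

def resample_linear(samples, src_rate, dst_rate):
    """Naive linear-interpolation resampler (illustration only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    ratio = src_rate / dst_rate
    out_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(out_len):
        pos = i * ratio
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out

# Example: stereo audio recorded at 8 kHz, normalized for the 16 kHz model.
stereo = [[0.0, 0.2, 0.4, 0.6], [0.0, 0.0, 0.0, 0.0]]
mono = to_mono(stereo)                       # -> [0.0, 0.1, 0.2, 0.3]
audio_16k = resample_linear(mono, 8000, 16000)
```

In practice you would simply pass the file to `transcribe_file` and let the toolkit handle this step.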