zinc75 committed
Commit d404be6
1 Parent(s): 4f7e2dd

Update README.md

Files changed (1):
  1. README.md +5 -4

README.md CHANGED
@@ -36,13 +36,14 @@ model-index:
 
 Fine-tuned [facebook/wav2vec2-base-fr-voxpopuli-v2](https://huggingface.co/facebook/wav2vec2-base-fr-voxpopuli-v2) for **French speech-to-phoneme** using the train and validation splits of [Common Voice v13](https://huggingface.co/datasets/mozilla-foundation/common_voice_13_0).
 
-## Samplerate of audio
+## Audio samplerate for usage
 
 When using this model, make sure that your speech input is **sampled at 16kHz**.
 
-## Training procedure details
+## Training procedure
 
-- The model has been trained for 14 epochs on 4x2080 Ti GPUs using a ddp strategy and gradient-accumulation procedure (256 audios per update, corresponding roughly to 25 minutes of speech per update -> 2k updates per epoch)
+The model has been fine-tuned on Common Voice v13 (FR) for 14 epochs on 4x 2080 Ti GPUs using a DDP strategy and gradient accumulation (256 audio clips per update, corresponding roughly to 25 minutes of speech per update, i.e. about 2k updates per epoch).
+
 - Learning rate schedule: double tri-state schedule
   - Warmup from 1e-5 for 7% of total updates
   - Constant at 1e-4 for 28% of total updates
@@ -51,5 +52,5 @@ When using this model, make sure that your speech input is **sampled at 16kHz**.
   - Constant at 3e-5 for 12% of total updates
   - Linear decrease to 1e-7 for remaining 14% of updates
 
-- The set of hyperparameters used for training are those detailed in Annex B and Table 6 of [wav2vec2 paper](https://arxiv.org/pdf/2006.11477.pdf).
+- The hyperparameters used for training are the same as those detailed in Annex B and Table 6 of the [wav2vec2 paper](https://arxiv.org/pdf/2006.11477.pdf).
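Since the README stresses that input speech must be sampled at 16 kHz, audio recorded at another rate has to be resampled before inference. A minimal sketch of that preprocessing step, using SciPy's polyphase resampler (the helper name, the 440 Hz test tone, and the 44.1 kHz source rate are illustrative assumptions, not part of the model card):

```python
import numpy as np
from scipy.signal import resample_poly

TARGET_SR = 16_000  # the model expects 16 kHz input


def to_16khz(audio: np.ndarray, orig_sr: int) -> np.ndarray:
    """Resample a 1-D waveform to 16 kHz (no-op if already at 16 kHz)."""
    if orig_sr == TARGET_SR:
        return audio
    # Reduce the up/down factors by their gcd for an efficient polyphase filter
    g = np.gcd(orig_sr, TARGET_SR)
    return resample_poly(audio, TARGET_SR // g, orig_sr // g)


# Illustrative input: one second of a 440 Hz tone recorded at 44.1 kHz
sr = 44_100
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440 * t)

resampled = to_16khz(wave, sr)
print(len(resampled))  # one second of audio at 16 kHz -> 16000 samples
```

The resulting array can then be passed to the model's feature extractor with `sampling_rate=16_000`.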