pere committed on
Commit
edc4ff2
1 Parent(s): f6ad795

Adding recover. Fix README

Files changed (2)
  1. README.md +4 -4
  2. run.recover.sh +40 -0
README.md CHANGED
@@ -26,7 +26,7 @@ model-index:
  metrics:
  - name: Wer
  type: wer
- value: 47.08
+ value: 45.73
  ---
 
  # Whisper Tiny Norwegian Bokmål
@@ -34,8 +34,8 @@ model-index:
  This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) trained on several datasets.
 
  It is currently in the middle of a large training. Currently it achieves the following results on the evaluation set:
- - Loss: 1.464
- - Wer: 47.08
+ - Loss: 1.4616
+ - Wer: 45.73
 
  ## Model description
 
@@ -55,7 +55,7 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 1000
- - training_steps: 100.000 (currently @4.000)
+ - training_steps: 100.000 (currently @5.000)
  - mixed_precision_training: fp16
 
  ### Live Training results
run.recover.sh ADDED
@@ -0,0 +1,40 @@
+ python run_speech_recognition_seq2seq_streaming.py \
+ --model_name_or_path="openai/whisper-tiny" \
+ --resume_from_checkpoint="checkpoint-5000" \
+ --dataset_name="NbAiLab/NCC_S" \
+ --language="Norwegian" \
+ --train_split_name="train" \
+ --eval_split_name="validation" \
+ --model_index_name="Whisper Tiny Norwegian Bokmål" \
+ --max_steps="100000" \
+ --output_dir="./" \
+ --per_device_train_batch_size="128" \
+ --per_device_eval_batch_size="32" \
+ --gradient_accumulation_step="1" \
+ --logging_steps="50" \
+ --learning_rate="3e-6" \
+ --lr_scheduler_type="constant_with_warmup" \
+ --warmup_steps="1000" \
+ --evaluation_strategy="steps" \
+ --eval_steps="1000" \
+ --save_strategy="steps" \
+ --save_steps="1000" \
+ --generation_max_length="225" \
+ --length_column_name="duration" \
+ --max_duration_in_seconds="30" \
+ --text_column_name="text" \
+ --freeze_feature_encoder="False" \
+ --report_to="tensorboard" \
+ --metric_for_best_model="wer" \
+ --greater_is_better="False" \
+ --load_best_model_at_end \
+ --gradient_checkpointing \
+ --fp16 \
+ --overwrite_output_dir="false" \
+ --do_train \
+ --do_eval \
+ --predict_with_generate \
+ --do_normalize_eval \
+ --use_auth_token \
+ --push_to_hub \
+ --hub_model_id="NbAiLab/whisper-tiny-nob"
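
Note on the recover script: it hard-codes `checkpoint-5000` as the resume point, so it has to be edited before each later recovery. The following is a minimal sketch, not part of this commit, of how the newest checkpoint could be looked up instead. It assumes checkpoints are saved as `checkpoint-<step>` directories directly under the output directory (`./` here); the helper name `latest_checkpoint` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): print the checkpoint-<step>
# directory with the highest step number found in the given directory.
# Returns a non-zero status if no such directory exists.
latest_checkpoint() {
  local dir="${1:-.}" best="" best_step=-1 path step
  for path in "$dir"/checkpoint-*; do
    [ -d "$path" ] || continue        # skip plain files and an unmatched glob
    step="${path##*checkpoint-}"      # text after the last "checkpoint-"
    case "$step" in
      ''|*[!0-9]*) continue ;;        # ignore names whose suffix is not a number
    esac
    if [ "$step" -gt "$best_step" ]; then
      best_step="$step"
      best="$path"
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}

# Possible use in a recover script (assumption: the training flags stay as committed):
#   ckpt="$(latest_checkpoint ./)" || { echo "no checkpoint found" >&2; exit 1; }
#   python run_speech_recognition_seq2seq_streaming.py --resume_from_checkpoint="$ckpt" ...
```

Comparing the step numerically rather than sorting the names as text matters here, because `checkpoint-10000` sorts before `checkpoint-5000` alphabetically.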