vrdn23 committed on
Commit 8cd52b1
1 parent: 286d1cc

Update README.md

Files changed (1): README.md (+48, -44)
README.md CHANGED
@@ -47,50 +47,54 @@ pipe(text.split())
  ## Training
  The `mini-bart-g2p` model was trained on a combination of both the [Librispeech Alignments dataset](https://zenodo.org/records/2619474#.YuCdaC8r1ZF) and the [CMUDict dataset](https://github.com/cmusphinx/cmudict).
  The model was trained using the [translation training script](https://github.com/huggingface/transformers/blob/main/examples/pytorch/translation/run_translation.py) provided by HuggingFace Transformers repo.
- The following parametrs were specified in the training script to produce the model.
- ```
- python run_translation.py \
- --model_name_or_path <MODEL DIR> \
- --source_lang wrd \
- --target_lang phon \
- --num_train_epochs 500 \
- --train_file <TRAIN SPLIT> \
- --validation_file <VAL SPLIT> \
- --test_file <TEST SPLIT> \
- --num_beams 5 \
- --generation_num_beams 5 \
- --max_source_length 128 \
- --max_target_length 128 \
- --overwrite_cache \
- --overwrite_output_dir \
- --do_train \
- --do_eval \
- --do_predict \
- --evaluation_strategy epoch \
- --eval_delay 3 \
- --save_strategy epoch \
- --per_device_train_batch_size 16 \
- --per_device_eval_batch_size 16 \
- --learning_rate 5e-4 \
- --label_smoothing_factor 0.1 \
- --weight_decay 0.00001 \
- --adam_beta1 0.9 \
- --adam_beta2 0.98 \
- --load_best_model_at_end True \
- --predict_with_generate True \
- --generation_max_length 20 \
- --output_dir <OUTPUT DIR> \
- --seed 4664427 \
- --lr_scheduler_type cosine_with_restarts \
- --warmup_steps 120000 \
- --optim adafactor \
- --group_by_length \
- --metric_for_best_model bleu \
- --greater_is_better True \
- --save_total_limit 10 \
- --log_level info \
- --logging_steps 500
- ```
+ The following parameters were specified in the training script to produce the model.
+ <details>
+ <summary>Training script parameters</summary>
+
+ ```bash
+ python run_translation.py \
+ --model_name_or_path <MODEL DIR> \
+ --source_lang wrd \
+ --target_lang phon \
+ --num_train_epochs 500 \
+ --train_file <TRAIN SPLIT> \
+ --validation_file <VAL SPLIT> \
+ --test_file <TEST SPLIT> \
+ --num_beams 5 \
+ --generation_num_beams 5 \
+ --max_source_length 128 \
+ --max_target_length 128 \
+ --overwrite_cache \
+ --overwrite_output_dir \
+ --do_train \
+ --do_eval \
+ --do_predict \
+ --evaluation_strategy epoch \
+ --eval_delay 3 \
+ --save_strategy epoch \
+ --per_device_train_batch_size 16 \
+ --per_device_eval_batch_size 16 \
+ --learning_rate 5e-4 \
+ --label_smoothing_factor 0.1 \
+ --weight_decay 0.00001 \
+ --adam_beta1 0.9 \
+ --adam_beta2 0.98 \
+ --load_best_model_at_end True \
+ --predict_with_generate True \
+ --generation_max_length 20 \
+ --output_dir <OUTPUT DIR> \
+ --seed 4664427 \
+ --lr_scheduler_type cosine_with_restarts \
+ --warmup_steps 120000 \
+ --optim adafactor \
+ --group_by_length \
+ --metric_for_best_model bleu \
+ --greater_is_better True \
+ --save_total_limit 10 \
+ --log_level info \
+ --logging_steps 500
+ ```
+ </details>


  ## Limitations
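
For reference, `run_translation.py` accepts custom JSON Lines files for `--train_file`, `--validation_file`, and `--test_file`, where each row carries a `translation` object keyed by the codes passed as `--source_lang` and `--target_lang`. A minimal sketch of what a `wrd` → `phon` split could look like; the filename and the CMUDict-style phoneme strings are illustrative, not taken from the actual training data:

```bash
# Hypothetical excerpt of a <TRAIN SPLIT> file in the JSON Lines layout
# run_translation.py expects: one {"translation": {...}} object per line,
# keyed by the --source_lang (wrd) and --target_lang (phon) codes.
cat > train.json <<'EOF'
{"translation": {"wrd": "hello", "phon": "HH AH0 L OW1"}}
{"translation": {"wrd": "world", "phon": "W ER1 L D"}}
EOF
```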