Launch training
An example FSDP configuration file may look like:
```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch_policy: BACKWARD_PRE
  fsdp_cpu_ram_efficient_loading: true
  fsdp_forward_prefetch: false
  fsdp_offload_params: true
  fsdp_sharding_strategy: 1
  fsdp_state_dict_type: SHARDED_STATE_DICT
  fsdp_sync_module_states: true
  fsdp_transformer_layer_cls_to_wrap: BertLayer
  fsdp_use_orig_params: true
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
To launch training, run the `accelerate launch` command. By default, it uses the configuration file you previously created with `accelerate config`; if you saved the configuration somewhere else, pass its path with `--config_file`.
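As a rough illustration of the launch step, the sketch below assembles an `accelerate launch` invocation as an argument list. The helper `build_launch_command`, the script name `train.py`, and the file name `fsdp_config.yaml` are hypothetical placeholders, not names taken from this guide; only the `accelerate launch` command and its `--config_file` option are real.

```python
import shlex
from typing import List, Optional


def build_launch_command(
    script: str,
    config_file: Optional[str] = None,
    script_args: Optional[List[str]] = None,
) -> List[str]:
    """Assemble an `accelerate launch` invocation as an argument list.

    Hypothetical helper for illustration; it only builds the command
    and does not start training.
    """
    cmd = ["accelerate", "launch"]
    if config_file is not None:
        # Point the launcher at an explicit config file instead of the
        # default one written by `accelerate config`.
        cmd += ["--config_file", config_file]
    # The training script comes next, followed by its own arguments.
    cmd.append(script)
    cmd += list(script_args or [])
    return cmd


if __name__ == "__main__":
    # `train.py` and `fsdp_config.yaml` are placeholder names (assumptions).
    cmd = build_launch_command("train.py", config_file="fsdp_config.yaml")
    print(shlex.join(cmd))
    # To actually launch (requires accelerate to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when script arguments contain spaces. Leaving out `config_file` relies on the default configuration that `accelerate config` created.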