If you choose to use the "auto" option, [Trainer] sets train_micro_batch_size_per_gpu to the value of args.per_device_train_batch_size and train_batch_size to args.world_size * args.per_device_train_batch_size * args.gradient_accumulation_steps.
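
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the two formulas, not the actual Trainer code: the helper name resolve_auto_batch_sizes and the sample values (2 processes, per-device batch size 8, 4 accumulation steps) are assumptions made for the example.

```python
def resolve_auto_batch_sizes(per_device_train_batch_size: int,
                             gradient_accumulation_steps: int,
                             world_size: int) -> dict:
    """Hypothetical helper mirroring how the "auto" values are derived."""
    # Micro batch size per GPU is taken directly from the per-device setting.
    micro_batch = per_device_train_batch_size
    # Global batch size multiplies in the process count and accumulation steps.
    global_batch = world_size * per_device_train_batch_size * gradient_accumulation_steps
    return {
        "train_micro_batch_size_per_gpu": micro_batch,
        "train_batch_size": global_batch,
    }


# Assumed sample values: 2 processes, per-device batch of 8, 4 accumulation steps.
resolved = resolve_auto_batch_sizes(
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    world_size=2,
)
print(resolved)
# {'train_micro_batch_size_per_gpu': 8, 'train_batch_size': 64}
```

With these sample values, each process handles 8 samples per forward pass, and one optimizer step covers 2 * 8 * 4 = 64 samples in total.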