```json
{
    "train_micro_batch_size_per_gpu": "auto",
    "train_batch_size": "auto"
}
```
Gradient accumulation
Gradient accumulation can be auto-configured or explicitly set.
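To make the relationship between the batch-size keys and gradient accumulation concrete, the sketch below assumes DeepSpeed's documented rule that `train_batch_size` equals `train_micro_batch_size_per_gpu` times `gradient_accumulation_steps` times the number of GPUs. The helper name `effective_batch_size` and the numeric values are hypothetical, chosen only for illustration.

```python
import json


def effective_batch_size(micro_batch_per_gpu: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Return the number of samples consumed per optimizer step across all GPUs.

    Hypothetical helper: it only encodes the arithmetic relation, it does not
    call DeepSpeed.
    """
    return micro_batch_per_gpu * grad_accum_steps * num_gpus


# Illustrative values only (assumed, not taken from the original text).
micro_batch_per_gpu = 4
grad_accum_steps = 8
num_gpus = 2

# An explicitly set configuration, as opposed to using "auto" for each key.
config = {
    "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
    "gradient_accumulation_steps": grad_accum_steps,
    "train_batch_size": effective_batch_size(micro_batch_per_gpu, grad_accum_steps, num_gpus),
}

print(json.dumps(config, indent=4))
```

If the values are set explicitly, the three keys need to stay consistent with each other; using `"auto"` avoids keeping them in sync by hand.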