```json
{
    "zero_allow_untested_optimizer": true
}
```
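As a rough illustration of what "top level" means here, the sketch below builds a small config as a Python dict and prints it as JSON. Only the `zero_allow_untested_optimizer` key comes from the snippet above; the `zero_optimization` section and the helper name `build_ds_config` are assumptions added to show the flag sitting beside a nested section rather than inside it.

```python
import json


def build_ds_config():
    """Return a hypothetical, minimal DeepSpeed-style config dict."""
    return {
        # Assumed nested section, included only to show nesting.
        "zero_optimization": {"stage": 2},
        # The flag from the snippet above, placed at the top level.
        "zero_allow_untested_optimizer": True,
    }


config = build_ds_config()
# Python's True is serialized as JSON's lowercase true.
print(json.dumps(config, indent=4))
```

Keeping the flag out of the `zero_optimization` block matters because the text describes it as a top-level setting.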
From DeepSpeed==0.8.3 on, if you want to use offload, you'll also need to add the following to the top-level configuration, because offload works best with DeepSpeed's CPU Adam optimizer.