For example, if | |
a problem starts happening at batch number 150. So you can dump traces for batches 149 and 150 and compare where | |
numbers started to diverge. | |
You can also specify the batch number after which to stop the training, with: | |
python | |
debug_overflow = DebugUnderflowOverflow(model, trace_batch_nums=[1, 3], abort_after_batch_num=3). |