
And the specific module whose forward was called is T5LayerNorm.

Let's look at the last few frames of that report:
|
    Detected inf/nan during batch_number=0
    Last 21 forward frames:
    abs min  abs max  metadata
    []
                      encoder.block.2.layer.1.DenseReluDense.wi_0 Linear
    2.17e-07 4.50e+00 weight
    1.79e-06 4.65e+00 input[0]
    2.68e-06 3.70e+01 output
                      encoder.block.2.layer.1.DenseReluDense.wi_1 Linear
    8.08e-07 2.66e+01 weight
    1.79e-06 4.65e+00 input[0]
    1.27e-04 2.37e+02 output
                      encoder.block.2.layer.1.DenseReluDense.wo Linear
    1.01e-06 6.44e+00 weight
    0.00e+00 9.74e+03 input[0]
    3.18e-04 6.27e+04 output
                      encoder.block.2.layer.1.DenseReluDense T5DenseGatedGeluDense
    1.79e-06 4.65e+00 input[0]
    3.18e-04 6.27e+04 output
                      encoder.block.2.layer.1.dropout Dropout
    3.18e-04 6.27e+04 input[0]
    0.00e+00 inf      output
|
The last frame reports on the Dropout.forward function: the first entry is for its only input and the second is for its only output.
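
One plausible reading of that last frame can be checked with a few lines of arithmetic. The sketch below assumes the run used half precision (fp16) and a dropout probability of 0.1; neither is stated in the report, so treat both as assumptions. The helper names `dropout_scale` and `to_fp16_range` are hypothetical and exist only for this illustration. During training, dropout rescales the surviving activations by 1 / (1 - p), and the input's abs max of 6.27e+04 is already close to the largest finite fp16 value.

```python
import math

# Largest finite value representable in IEEE 754 half precision (fp16).
FP16_MAX = 65504.0


def dropout_scale(x, p):
    """Rescale a surviving activation by 1 / (1 - p), as dropout does in training."""
    return x / (1.0 - p)


def to_fp16_range(x):
    """Crude stand-in for an fp16 cast: values past the finite range become inf.

    Real fp16 rounding is slightly more lenient right at the boundary; this
    simplification is enough for the magnitudes in the report.
    """
    if abs(x) > FP16_MAX:
        return math.copysign(math.inf, x)
    return x


abs_max_in = 6.27e4  # abs max of input[0] in the Dropout frame of the report
p = 0.1              # ASSUMPTION: dropout probability, not shown in the report

scaled = dropout_scale(abs_max_in, p)
result = to_fp16_range(scaled)

print(f"input abs max : {abs_max_in:.2e}")
print(f"after scaling : {scaled:.2e}")
print(f"in fp16 range : {result}")
```

Under these assumptions the scaled value exceeds the fp16 range, which would be consistent with the inf reported for the Dropout output while the input is still finite.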