Calculating PPL with fixed-length models
If we weren't limited by a model's context size, we would evaluate the model's perplexity by autoregressively
factorizing a sequence and conditioning on the entire preceding subsequence at each step, as shown below.
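
To make the idealized, unlimited-context computation concrete, here is a minimal sketch in plain Python. It assumes perplexity is defined as the exponentiated average negative log-likelihood of the sequence. The names `perplexity` and `repeat_biased_prob` are hypothetical, and the toy probability function merely stands in for a real language model.

```python
import math
from typing import Callable, Sequence


def perplexity(
    tokens: Sequence[str],
    cond_prob: Callable[[Sequence[str], str], float],
) -> float:
    """Exponentiated average negative log-likelihood of `tokens`.

    Each token is scored conditioned on the ENTIRE preceding
    subsequence, i.e. with no context-size limit.
    """
    nll = 0.0
    for i, token in enumerate(tokens):
        prefix = tokens[:i]  # everything before position i
        nll -= math.log(cond_prob(prefix, token))
    return math.exp(nll / len(tokens))


VOCAB = ["a", "b", "c", "d"]  # hypothetical 4-token vocabulary


def repeat_biased_prob(prefix: Sequence[str], token: str) -> float:
    """Toy stand-in for a model (an assumption, not a real LM).

    With an empty prefix, every token is equally likely. Otherwise the
    previous token is repeated with probability 0.5 and the remaining
    mass is split evenly over the other tokens.
    """
    if not prefix:
        return 1.0 / len(VOCAB)
    if token == prefix[-1]:
        return 0.5
    return 0.5 / (len(VOCAB) - 1)


ppl = perplexity(["a", "a", "b"], repeat_biased_prob)
print(f"perplexity: {ppl:.4f}")
```

Because `prefix` grows with every step, this approach stops being applicable once the sequence is longer than the model's maximum context size.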