`sliding_window` is larger than `max_position_embeddings`

#21
by J22 - opened

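In the 4k version, `sliding_window` is 2047, which means that the sliding window is enabled.

But in the 128k version, `sliding_window` is larger than `max_position_embeddings`, so the sliding window is effectively disabled. A sliding window would be more useful for this model, given its much longer context.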
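To make the comparison concrete, here is a minimal sketch of the check I have in mind. It works on plain dictionaries rather than the real config classes. The helper name `sliding_window_is_effective` is hypothetical, and apart from the 2047 mentioned above, the numbers are illustrative assumptions (4096 for the 4k context length, and a 128k-style config whose window is twice its maximum length), not values copied from the repositories.

```python
from typing import Optional


def sliding_window_is_effective(config: dict) -> bool:
    """Return True if the window is shorter than the maximum context length.

    A window that is missing, or at least as long as the longest possible
    sequence, never masks out any token, so it has no effect.
    """
    window: Optional[int] = config.get("sliding_window")
    if window is None:
        return False
    return window < config["max_position_embeddings"]


# 2047 is the value quoted in this thread; 4096 is an assumed context length.
config_4k = {"sliding_window": 2047, "max_position_embeddings": 4096}

# Illustrative (assumed) values where the window exceeds the context length.
config_128k = {"sliding_window": 262144, "max_position_embeddings": 131072}

print(sliding_window_is_effective(config_4k))    # True
print(sliding_window_is_effective(config_128k))  # False
```

With these assumed values the 4k-style config has an active window and the 128k-style one does not, which is the situation described in the title.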

Microsoft org

`sliding_window` is not supported by the LongRoPE implementation, according to the authors.

gugarosa changed discussion status to closed
