`sliding_window` is larger than `max_position_embeddings`
#21 by J22 - opened
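In the 4k version, `sliding_window` is 2047, which means that sliding window attention is enabled. In the 128k version, however, `sliding_window` is larger than `max_position_embeddings`, so sliding window attention is effectively disabled. Sliding window would be more useful for this model.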
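To make the comparison concrete, here is a minimal sketch of the check being described. The helper name `sliding_window_is_effective` is hypothetical, and only the value 2047 comes from this post; the context length of 4096 for the 4k version and both numbers for the 128k version are illustrative assumptions, not values read from the actual config files.

```python
from typing import Optional


def sliding_window_is_effective(
    sliding_window: Optional[int], max_position_embeddings: int
) -> bool:
    """Return True if the window is smaller than the maximum context length.

    A window at least as large as the context can never exclude a token,
    so attention behaves as if no sliding window were configured.
    """
    if sliding_window is None:
        return False
    return sliding_window < max_position_embeddings


# 4k version: 2047 is from the post; 4096 is an assumed context length.
print(sliding_window_is_effective(2047, 4096))  # True

# 128k version: both values are illustrative assumptions.
print(sliding_window_is_effective(262144, 131072))  # False
```

With these assumed numbers, the window only limits attention in the first case, which matches the behaviour described above.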
`sliding_window` is not supported by the LongRoPE implementation, according to the authors.
gugarosa changed discussion status to closed