Maximum number of input tokens?
#104 · opened by Kirolos
What is the maximum number of input tokens? Is it the same as the original LLaMA (4096), or has it increased?
From the paper https://arxiv.org/pdf/2310.06825.pdf (Table 1), the window_size is 4096 and the context length is 8192.
Here are the full parameters (Table 1 of the paper):
dim: 4096
n_layers: 32
head_dim: 128
hidden_dim: 14336
n_heads: 32
n_kv_heads: 8
window_size: 4096
context_len: 8192
vocab_size: 32000
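If you want to check these limits programmatically rather than from the paper, they are exposed on the model config. A minimal sketch, assuming the transformers library and the Hugging Face model id mistralai/Mistral-7B-v0.1 (the model id is my assumption, not stated in this thread):

```python
from transformers import AutoConfig

# Fetch the model configuration from the Hub (model id is an assumption).
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

# Per-layer sliding-window attention width (4096 in Table 1).
print("sliding_window:", config.sliding_window)

# Maximum sequence length the position embeddings are configured for.
print("max_position_embeddings:", config.max_position_embeddings)

# The window is per layer, not a cap on input length: layer k can reach
# roughly k * window_size tokens back, so the theoretical attention span
# is num_hidden_layers * sliding_window (~131K tokens for 32 layers x 4096,
# as discussed in the paper's sliding-window attention section).
print("theoretical span:", config.num_hidden_layers * config.sliding_window)
```

In other words, window_size (4096) only limits how far a single layer attends; the usable input length is governed by the context length / position-embedding limit, which is larger.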