Local attention
Longformer uses local attention: often, the local context (e.g., what are the two tokens to the
left and right?) is enough to take action for a given token.
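
To make the idea of a local window concrete, here is a minimal pure-Python sketch of single-head attention in which each token only looks at the tokens within `window` positions to its left and right. It is not Longformer's actual implementation: the function names `local_attention_mask` and `local_attention`, the plain nested-list inputs, and the scaled dot-product scoring are assumptions chosen for illustration.

```python
import math


def local_attention_mask(seq_len, window):
    """Return a seq_len x seq_len grid of booleans.

    Entry [i][j] is True when token i may attend to token j, i.e. when
    j lies at most `window` positions to the left or right of i.
    """
    return [[abs(i - j) <= window for j in range(seq_len)]
            for i in range(seq_len)]


def local_attention(queries, keys, values, window):
    """Single-head attention restricted to a local window (illustrative only).

    queries, keys, values: lists of equal length holding float vectors.
    window: number of neighbours visible on each side of a token.
    """
    seq_len = len(queries)
    dim = len(queries[0])
    outputs = []
    for i in range(seq_len):
        # Only the positions inside the window are scored at all.
        lo = max(0, i - window)
        hi = min(seq_len, i + window + 1)
        scores = [
            sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(dim)
            for j in range(lo, hi)
        ]
        # Numerically stable softmax over the local scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors inside the window.
        outputs.append([
            sum(w * values[j][d] for w, j in zip(weights, range(lo, hi)))
            for d in range(len(values[0]))
        ])
    return outputs
```

With `window=2`, each token sees the two tokens to its left and the two to its right (fewer at the sequence edges), so the work per token depends on the window size rather than on the full sequence length.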