Question about the output embedding vectors of ModernBERT

#12
by Youm9602 - opened

Are the output CLS and token embedding vectors L2-normalized on a per-token basis?
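
One way to check this empirically is to compute the L2 norm of each output vector and see whether it is close to 1. Below is a minimal sketch of that check in plain Python. The helper names (`l2_norms`, `is_l2_normalized`, `l2_normalize`) and the tolerance are assumptions made for illustration, and the `hidden` list is stand-in data, not real ModernBERT output. It says nothing about what the model actually returns.

```python
import math

def l2_norms(vectors):
    """Return the L2 norm of each vector (one norm per token)."""
    return [math.sqrt(sum(x * x for x in v)) for v in vectors]

def is_l2_normalized(vectors, tol=1e-3):
    """True if every vector has unit L2 norm within the tolerance `tol`."""
    return all(abs(n - 1.0) <= tol for n in l2_norms(vectors))

def l2_normalize(vectors, eps=1e-12):
    """Scale each vector to unit L2 norm (per-token normalization)."""
    out = []
    for v, n in zip(vectors, l2_norms(vectors)):
        scale = max(n, eps)  # guard against division by zero
        out.append([x / scale for x in v])
    return out

# Stand-in data: two "token" vectors (hypothetical, NOT real model output).
hidden = [[3.0, 4.0], [0.6, 0.8]]

print(l2_norms(hidden))                        # per-token norms
print(is_l2_normalized(hidden))                # check the raw vectors
print(is_l2_normalized(l2_normalize(hidden)))  # check after normalizing
```

With real model output, the same check would be applied to each row of the final hidden-state matrix, with the first row corresponding to the CLS token.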
