Use of `layer_norm` in examples
In this example from the model card, I'm having trouble working out why `F.layer_norm` is being used:
```python
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

matryoshka_dim = 512
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)
sentences = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?']
embeddings = model.encode(sentences, convert_to_tensor=True)
embeddings = F.layer_norm(embeddings, normalized_shape=(embeddings.shape[1],))
embeddings = embeddings[:, :matryoshka_dim]
embeddings = F.normalize(embeddings, p=2, dim=1)
```
It seems unusual. Is this a mistake, or is there something I'm not understanding?
Or in other words, what's wrong with this:
```python
embeddings = model.encode(sentences, convert_to_tensor=True)
embeddings = F.normalize(embeddings[:, :matryoshka_dim])  # limit dims and normalize
```
You're right that it's non-standard! We used it to train our model to be binary-aware, inspired by this tweet/paper. We messed around with this during a hack week and found it worked fairly well and was simpler than using an STE (straight-through estimator).
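To make "binary-aware" concrete, here's a minimal sketch of what a downstream binary use could look like, assuming binarization just means thresholding the layer-normed embeddings at zero; the threshold and the Hamming-distance scoring below are illustrative, not something prescribed by the model card:

```python
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)
sentences = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?']

embeddings = model.encode(sentences, convert_to_tensor=True)
# layer_norm centers each embedding at zero mean, so thresholding at 0
# yields roughly balanced binary codes
embeddings = F.layer_norm(embeddings, normalized_shape=(embeddings.shape[1],))
binary = (embeddings > 0).to(torch.uint8)  # illustrative 1-bit-per-dimension code
# Hamming distance between the two codes (lower = more similar)
hamming = (binary[0] ^ binary[1]).sum().item()
print(hamming)
```

The zero-centering is presumably what makes a plain sign threshold behave well enough to skip the STE.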
Ah, so this is specific to the binary case. If I just want to use (and truncate) the embeddings for similarity search, I assume I don't need the `layer_norm` step.
I compared the distributions of values with and without the `layer_norm` step and they're close to identical (since the values coming out of the model already have a mean close to 0 and a std near 1).
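For reference, a rough sketch of one way to run that comparison (the exact check isn't shown above): look at the mean/std of the raw outputs and compare cosine similarities of the truncated embeddings with and without `layer_norm`.

```python
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

matryoshka_dim = 512
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)
sentences = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?']

raw = model.encode(sentences, convert_to_tensor=True)
normed = F.layer_norm(raw, normalized_shape=(raw.shape[1],))

# Per-embedding mean/std of the raw outputs: already near 0 / 1,
# which is why layer_norm barely changes them
print(raw.mean(dim=1), raw.std(dim=1))

# Cosine similarities of the truncated embeddings, with and without layer_norm
a = F.normalize(raw[:, :matryoshka_dim], p=2, dim=1)
b = F.normalize(normed[:, :matryoshka_dim], p=2, dim=1)
print(a @ a.T)  # without layer_norm
print(b @ b.T)  # with layer_norm
```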