Two questions: is max_seq_length = 75? If so, why 75?

#5 · opened by hfsamhenry

Because of the max_seq_length = 75 coming from https://huggingface.co/NbAiLab/nb-sbert-base/blob/main/sentence_bert_config.json, I am getting different results when running the model with Sentence-Transformers versus Hugging Face Transformers.

The Sentence-Transformers method is in fact limited to 75 tokens (including the two special tokens).
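
For reference, a minimal sketch of the Sentence-Transformers path (the example sentence is just a placeholder):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("NbAiLab/nb-sbert-base")
print(model.max_seq_length)  # 75, read from sentence_bert_config.json

# Anything beyond 75 tokens (including [CLS] and [SEP]) is silently truncated
embedding = model.encode("Dette er en eksempelsetning.")
```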

The Hugging Face Transformers method does not have a max length set, so sequences of up to 512 tokens (the limit of the original BERT model) will work.
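
And a sketch of the plain Hugging Face Transformers path with the usual mean pooling (assuming mean pooling matches the model's pooling config; this is not copied from the gist):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("NbAiLab/nb-sbert-base")
model = AutoModel.from_pretrained("NbAiLab/nb-sbert-base")

# No 75-token cap here: truncation only kicks in at the 512-token
# position-embedding limit of the underlying BERT model.
encoded = tokenizer("Dette er en eksempelsetning.", padding=True,
                    truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    output = model(**encoded)

# Mean pooling over token embeddings, weighted by the attention mask
mask = encoded["attention_mask"].unsqueeze(-1).float()
embedding = (output.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
```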

Is the 75 a mistake? Or is it intentional because the model wasn't fine-tuned on sentences longer than 75 - 2 (CLS & SEP) = 73 tokens?

See this GitHub gist for details: https://gist.github.com/sam-h-long/a5874c55d2f4452651fe504fa607321f


Any thoughts on this?

Nasjonalbiblioteket AI Lab org

Hi. Sorry about the delayed response.

It seems the max length comes from this line in the script we used as the basis for our training.
I'm not sure why it is set so low, but the model might still work well if you override it.
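
Overriding it should be as simple as something like this (untested sketch, with a placeholder input text):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("NbAiLab/nb-sbert-base")
model.max_seq_length = 512  # lift the limit to BERT's position-embedding cap

embedding = model.encode("En lengre tekst som gjerne overstiger 75 tokens ...")
```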
