What is the maximum input_size of the BGE models?

#12

by lehuangg - opened Jan 16

Jan 16

Hello, do the BGE models have a limit on input_size? What is the longest text the large model can embed?

Shitao

Beijing Academy of Artificial Intelligence org Jan 16

Hello, the model's maximum input is currently 512 tokens.

Jan 16

OK, thank you for your answer.

lehuangg changed discussion status to closed Jan 16

lehuangg changed discussion status to open Jan 16

Jan 16

So if I input a string of length 512 and the tokenizer turns it into 600 tokens, will the extra tokens be discarded? And what input text length is appropriate?

Shitao

Beijing Academy of Artificial Intelligence org Jan 16

Yes, anything beyond 512 tokens is discarded.
As long as the input does not exceed 512 tokens, it is fine.
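
For illustration, the truncation described in this reply can be sketched in plain Python. This is only a minimal sketch under stated assumptions: `toy_tokenize` is a whitespace stand-in and not the real BGE tokenizer (which splits text into subword tokens, so character count and token count generally differ), and `truncate_to_limit` and `MAX_TOKENS` are hypothetical names, not part of any library.

```python
MAX_TOKENS = 512  # limit stated in this thread


def toy_tokenize(text):
    """Stand-in tokenizer (assumption): splits on whitespace only."""
    return text.split()


def truncate_to_limit(tokens, max_tokens=MAX_TOKENS):
    """Keep the first max_tokens tokens and report how many were dropped."""
    kept = tokens[:max_tokens]
    dropped = len(tokens) - len(kept)
    return kept, dropped


# Simulate the case from the question: an input that becomes 600 tokens.
tokens = toy_tokenize(" ".join(f"w{i}" for i in range(600)))
kept, dropped = truncate_to_limit(tokens)
print(len(kept), dropped)
```

With the real tokenizer, counting tokens before embedding would show whether a given text fits within the limit or would lose its tail.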

Jan 16

OK, thanks again.
