Multiple arrays of vectors returned from feature extraction inference endpoint
Hello, I've created an inference endpoint for feature extraction with the BAAI/bge-small-en-v1.5 model. When I send a text string for embedding, I expect a response containing a single array of 384 values (one 384-dimensional vector). Instead I receive multiple arrays of 384 values each (the number of arrays varies between 5 and 11).
When I tested with the hosted inference API on the model's Hugging Face page, the response was only a single array of 384 values. https://huggingface.co/BAAI/bge-small-en-v1.5?text=Hello+world
Why am I receiving multiple vector arrays as a response, and can I limit it to one? If not, can I just take the first of these as my embedding?
Thanks!
Hi, we use the hidden state of the first token ([CLS]) as the sentence embedding. You can refer to this code to get the sentence embedding: https://github.com/FlagOpen/FlagEmbedding/tree/master#using-huggingface-transformers
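For anyone landing here later, a minimal sketch of what that means on the client side, assuming the endpoint returns one 384-value array per input token (which would explain why the count varies between 5 and 11 with the length of the text). The helper name `cls_embedding` and the toy `response` are made up for illustration, and the unwrapping of an extra batch dimension is an assumption about the response shape, so check what your endpoint actually returns. The L2 normalization mirrors the normalize step in the linked FlagEmbedding example.

```python
import math

def cls_embedding(token_vectors):
    """Return the L2-normalized vector of the first token ([CLS]).

    token_vectors: list of per-token vectors (num_tokens x dim),
    optionally wrapped in one extra batch dimension (assumption).
    """
    # Unwrap a leading batch dimension if present: [[tok0, tok1, ...]]
    if isinstance(token_vectors[0][0], list):
        token_vectors = token_vectors[0]

    first = token_vectors[0]  # first token = [CLS]
    norm = math.sqrt(sum(x * x for x in first))
    # Avoid dividing by zero for an all-zero vector
    return [x / norm for x in first] if norm > 0 else list(first)

# Toy stand-in for an endpoint response: 3 tokens, 2 dimensions each
# (a real bge-small-en-v1.5 response would have 384 values per token)
response = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
embedding = cls_embedding(response)
```

So taking the first array should match the CLS pooling described above; averaging all the arrays would be a different pooling strategy from the one this model's example uses.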