Different behavior between SentenceTransformer and TEI/Infinity when using gte-large-en-v1.5
#17, opened by smityz
System Info
Reproduction
# using TEI
# https://huggingface.co/docs/text-embeddings-inference/index
model=Alibaba-NLP/gte-large-en-v1.5
text-embeddings-router --model-id $model --port 8080
curl -X POST "http://localhost:8080/embeddings" \
-H "Content-Type: application/json" \
-d '{"input":["Dimension table for main account?"]}'
[
-0.0006371783,
-0.03931647,
-0.010235489,
-0.019322978,
-0.014273809,
0.022573953
]
# using infinity_emb
# https://github.com/michaelfeil/infinity
infinity_emb v2 --model-id Alibaba-NLP/gte-large-en-v1.5
curl -X POST http://localhost:7997/embeddings -H 'Content-Type: application/json' \
-d '{"input": ["Dimension table for main account?"]}' \
| jq '.data[0].embedding | .[:6]'
[
-0.000593528151512146,
-0.039367105811834335,
-0.010303903371095657,
-0.01923666149377823,
-0.014310694299638271,
0.02248678356409073
]
# using SentenceTransformer
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)
embeddings = model.encode(['Dimension table for main account?'])
print(list(embeddings[0][:6]))
[-0.015188057, -0.9458093, -0.24485634, -0.4617836, -0.3435278, 0.53972]
When using SentenceTransformer, it will download a new model named Alibaba-NLP/new-impl, but TEI/infinity_emb may use the original model.
/home/smilencer/miniconda3/envs/ml/lib/python3.12/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
configuration.py: 7.13kB [00:00, 25.2MB/s]
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- configuration.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
modeling.py: 59.0kB [00:00, 350kB/s]
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- modeling.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
Is there any way to make TEI/infinity_emb use Alibaba-NLP/new-impl?
I tried modifying the repo files as described in https://huggingface.co/Alibaba-NLP/new-impl/discussions/2, but it did not work.
Expected behavior
The embedding results should be the same across TEI, infinity_emb, and SentenceTransformer.
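One thing that may be worth checking first: the six SentenceTransformer values in the reproduction look like a scaled copy of the TEI/infinity_emb values, which would be consistent with the servers returning L2-normalized vectors while `model.encode` returns unnormalized ones. That is an assumption based only on the six printed components, not a confirmed diagnosis. A minimal sketch that compares direction rather than raw values, using only the numbers pasted above (the variable names and the `cosine` helper are made up for illustration):

```python
import math

# First six components copied from the outputs in the reproduction.
tei = [-0.0006371783, -0.03931647, -0.010235489,
       -0.019322978, -0.014273809, 0.022573953]
infinity = [-0.000593528151512146, -0.039367105811834335, -0.010303903371095657,
            -0.01923666149377823, -0.014310694299638271, 0.02248678356409073]
st = [-0.015188057, -0.9458093, -0.24485634, -0.4617836, -0.3435278, 0.53972]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (scale-invariant)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# If the vectors differ only by scale, every ratio is about the same constant.
ratios = [s / t for s, t in zip(st, tei)]
print("SentenceTransformer / TEI ratios:", ratios)

# Cosine similarity ignores scale, so it isolates a difference in direction.
print("cosine(st, tei):     ", cosine(st, tei))
print("cosine(st, infinity):", cosine(st, infinity))
```

On these six components the ratios all come out close to 24 and the cosine similarities are close to 1, so the remaining gap may be normalization plus small numerical differences. If that holds for the full vectors, passing `normalize_embeddings=True` to `model.encode(...)` should bring the SentenceTransformer output in line with the other two.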