Compiling the model to GPU.0 or CPU gives different results

by Fede90 - opened

When I compile the model for "GPU.0" instead of "CPU", the same code gives me two different cosine similarity values. How is that possible, and how can I fix it?

embedding vector encoded with OVSentenceTransformer from

util.pytorch_cos_sim(embedding_encoded_vector_1, embedding_encoded_vector_2)

EmbeddedLLM org

How large is the difference? Can you compute the mean absolute percentage error too?
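
In case it helps, here is a minimal sketch of how the two numbers could be computed once you have the same sentence encoded on both devices. It uses plain Python lists so it runs without any extra packages. The function names and the sample values are made up for illustration (they are not from your model), and the `eps` guard against division by zero is my own assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_absolute_percentage_error(reference, candidate, eps=1e-12):
    """MAPE in percent, element by element, relative to `reference`.

    `eps` (an assumption of this sketch) avoids dividing by zero when a
    reference element is exactly 0.
    """
    terms = [
        abs(r - c) / max(abs(r), eps)
        for r, c in zip(reference, candidate)
    ]
    return 100.0 * sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical embeddings of the SAME sentence from the two devices.
    cpu_embedding = [0.10, -0.20, 0.40, 0.80]
    gpu_embedding = [0.11, -0.18, 0.40, 0.76]

    print("cosine(CPU, GPU):", cosine_similarity(cpu_embedding, gpu_embedding))
    print("MAPE (%):", mean_absolute_percentage_error(cpu_embedding, gpu_embedding))
```

If you convert your encoded vectors to lists (for example with `.tolist()`), comparing the CPU and GPU embedding of one sentence directly like this should show whether the embeddings themselves already diverge, before the similarity between two different sentences is taken.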

The difference was very large: a cosine similarity of 0.63 instead of 0.86.
I then quantized the model to int8, and in that case the difference between the two devices is practically negligible.

Fede90 changed discussion status to closed
