Compiling the model to GPU.0 or CPU gives different results

#1
by Fede90 - opened

When I compute the cosine similarity with the model compiled for "GPU.0" versus "CPU", the same code gives two different results. How is that possible, and how can I fix it?

Code:
The embedding vectors are encoded with OVSentenceTransformer from optimum.intel.openvino and then compared using util from sentence_transformers:

util.pytorch_cos_sim(embedding_encoded_vector_1, embedding_encoded_vector_2)

EmbeddedLLM org

How large is the difference? Can you compute the mean absolute percentage error too?
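In case it helps, here is a minimal sketch of how both numbers could be computed in plain Python. The function names and the embedding values are made up for illustration; they are not outputs of the actual model. It assumes the CPU embedding is treated as the reference and that each embedding has been converted to a flat list of floats first (for example with `.tolist()`).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_absolute_percentage_error(reference, candidate, eps=1e-12):
    """Element-wise MAPE in percent, relative to the reference vector.

    eps guards against division by zero for reference entries equal to 0.
    """
    terms = [
        abs(r - c) / max(abs(r), eps)
        for r, c in zip(reference, candidate)
    ]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical values standing in for the same sentence encoded on each device.
cpu_embedding = [0.10, -0.20, 0.40, 0.80]
gpu_embedding = [0.11, -0.18, 0.40, 0.76]

print("cosine similarity CPU vs GPU:", cosine_similarity(cpu_embedding, gpu_embedding))
print("MAPE (%):", mean_absolute_percentage_error(cpu_embedding, gpu_embedding))
```

Comparing the two embeddings of the same sentence directly, rather than only the final similarity score, should show whether the encoder outputs themselves diverge between devices.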

The difference was very large: a cosine similarity of 0.63 instead of 0.86.
After I quantized the model to int8, the difference between the two devices became negligible.

Fede90 changed discussion status to closed
