Gemma Embeddings v1.0

GemmaEmbed is a dense-vector embedding model trained specifically for retrieval. As of December 12, 2024, GemmaEmbed ranks #1 overall on the MTEB leaderboard, with an average score of 72.72.
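To illustrate what retrieval with dense vectors involves, here is a minimal sketch that ranks documents against a query by cosine similarity. The vectors are hand-written toy values, not outputs of GemmaEmbed, and the function names `cosine_similarity` and `rank_documents` are hypothetical. This card does not state which similarity function the model was trained with, so cosine similarity is an assumption here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional embeddings (assumed values for illustration only;
# a real embedding model produces vectors with far more dimensions).
query = [0.9, 0.1, 0.0]
docs = {
    "doc_a": [0.8, 0.2, 0.1],
    "doc_b": [0.0, 0.1, 0.9],
    "doc_c": [0.5, 0.5, 0.0],
}

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.4f}")
```

In a real pipeline the query and document vectors would come from the embedding model, and the top-ranked documents would be returned as the retrieval result.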

Important Notes

  • This is not an official Google product.
  • This is a research project.

Results summary

Comparison with bge-en-icl and NV-Embed-v2 on each MTEB task category (number of tasks in parentheses):

| Model | Total (56) | Classification (12) | Pair Classification (3) | STS (10) | Clustering (11) | Reranking (4) | Retrieval (15) | Summarization (1) |
|---|---|---|---|---|---|---|---|---|
| bge-en-icl | 0.7167 | 0.8895 | 0.8814 | 0.8425 | 0.5789 | 0.5986 | 0.6216 | 0.3077 |
| NV-Embed-v2 | 0.7231 | 0.9037 | 0.8867 | 0.8431 | 0.5846 | 0.6065 | 0.6265 | 0.3070 |
| Gemma-Embeddings-v1.0 | 0.7272 | 0.9000 | 0.8809 | 0.8423 | 0.5826 | 0.6214 | 0.6371 | 0.4052 |
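The Total column appears to be the mean over all 56 tasks, which is the same as the per-category averages weighted by their task counts. The sketch below recomputes it from the category scores reported above under that assumption. The names `TASK_COUNTS`, `SCORES`, and `weighted_total` are hypothetical, and because the category scores are already rounded to four decimals, the recomputed totals can differ from the reported ones in the last digit.

```python
# Number of tasks per MTEB category, as given in the results header.
TASK_COUNTS = {
    "Classification": 12,
    "Pair Classification": 3,
    "STS": 10,
    "Clustering": 11,
    "Reranking": 4,
    "Retrieval": 15,
    "Summarization": 1,
}

# Per-category average scores copied from the results summary.
SCORES = {
    "bge-en-icl": [0.8895, 0.8814, 0.8425, 0.5789, 0.5986, 0.6216, 0.3077],
    "NV-Embed-v2": [0.9037, 0.8867, 0.8431, 0.5846, 0.6065, 0.6265, 0.3070],
    "Gemma-Embeddings-v1.0": [0.9000, 0.8809, 0.8423, 0.5826, 0.6214, 0.6371, 0.4052],
}


def weighted_total(category_scores):
    """Task-count-weighted mean of per-category averages.

    Assumption: the reported Total is the plain mean over all tasks.
    """
    counts = list(TASK_COUNTS.values())
    weighted_sum = sum(s * n for s, n in zip(category_scores, counts))
    return weighted_sum / sum(counts)


totals = {model: weighted_total(scores) for model, scores in SCORES.items()}
for model, total in totals.items():
    print(f"{model}: {total:.4f}")
```

Under this assumption the recomputed values agree with the reported Total column to within rounding. The weighting also shows why the large Summarization gain contributes little to the overall score: that category holds only 1 of the 56 tasks, while Retrieval holds 15.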

Model & Data

Our base encoder model is Gemma 2 9B (google/gemma-2-9b).

We use the BGE-EN-ICL training data.

Research Team

  • Nicholas Monath
  • Michael Boratko
  • Seungyeon Kim
  • Andrew McCallum
  • Rob Fergus
  • Manzil Zaheer