# Gemma Embeddings v1.0
GemmaEmbed is a dense-vector embedding model trained specifically for retrieval. As of December 12, 2024, GemmaEmbed holds the #1 overall position on the MTEB leaderboard, with a score of 72.72.
## Important Notes
- This is not an official Google product.
- This is a research project.
## Results summary
Results comparing Gemma-Embeddings-v1.0 with BGE-EN-ICL and NV-Embed-v2 on each task category in MTEB:
| Model | Total (56) | Classification (12) | Pair Classification (3) | STS (10) | Clustering (11) | Reranking (4) | Retrieval (15) | Summarization (1) |
|---|---|---|---|---|---|---|---|---|
| bge-en-icl | 0.7167 | 0.8895 | 0.8814 | 0.8425 | 0.5789 | 0.5986 | 0.6216 | 0.3077 |
| NV-Embed-v2 | 0.7231 | 0.9037 | 0.8867 | 0.8431 | 0.5846 | 0.6065 | 0.6265 | 0.3070 |
| Gemma-Embeddings-v1.0 | 0.7272 | 0.9000 | 0.8809 | 0.8423 | 0.5826 | 0.6214 | 0.6371 | 0.4052 |
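
The category scores above aggregate the public MTEB tasks. As a rough sketch, an individual task score can be recomputed with the open-source `mteb` package. The snippet below assumes the checkpoint can be wrapped as a SentenceTransformer with default pooling, which may not match the official evaluation setup (prompts, pooling, sequence length):

```python
# Sketch: scoring one MTEB task with the open-source `mteb` package.
# Assumption: the checkpoint loads as a SentenceTransformer (any object
# exposing `encode(sentences) -> np.ndarray` also works).
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("google/Gemma-Embeddings-v1.0")

# Evaluate a single classification task; results are written as JSON.
evaluation = MTEB(tasks=["AmazonCounterfactualClassification"])
results = evaluation.run(model, output_folder="results")
```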
## Model & Data
Our base encoder model is Gemma 2 9B.
We use the BGE-EN-ICL training data.
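
For illustration, the sketch below shows one plausible way to compute embeddings and retrieval scores with Hugging Face `transformers`. The pooling strategy (last non-padding token), the absence of task-specific prompts, and the cosine-similarity scoring are assumptions made for this example, not the officially documented usage:

```python
# Minimal retrieval sketch. Assumptions (not the documented recipe):
# the checkpoint loads via AutoModel, embeddings come from the hidden
# state of the last non-padding token, and scores are cosine
# similarities between L2-normalized vectors.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "google/Gemma-Embeddings-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "right"  # so the last real token is easy to index
model = AutoModel.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

@torch.no_grad()
def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state          # [batch, seq, dim]
    last = batch["attention_mask"].sum(dim=1) - 1      # index of last real token
    pooled = hidden[torch.arange(hidden.size(0)), last]
    return F.normalize(pooled.float(), p=2, dim=1)

query = embed(["which model tops the MTEB leaderboard?"])
docs = embed([
    "GemmaEmbed is a dense-vector embedding model trained for retrieval.",
    "Gemma is a family of lightweight open models from Google.",
])
print(query @ docs.T)  # cosine similarities; higher = more relevant
```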
## Research Team
- Nicholas Monath
- Michael Boratko
- Seungyeon Kim
- Andrew McCallum
- Rob Fergus
- Manzil Zaheer
## Evaluation results

Selected self-reported MTEB test-set scores:

| Task | accuracy | f1 | f1_weighted | ap | ap_weighted | main_score |
|---|---|---|---|---|---|---|
| AmazonCounterfactualClassification (en) | 94.627 | 91.931 | 94.770 | 77.826 | 77.826 | 94.627 |
| AmazonPolarityClassification (default) | 97.038 | 97.038 | 97.038 | 95.872 | | |