Update README.md
README.md CHANGED
@@ -186,7 +186,7 @@ The following table shows the evaluation results for different approaches and models
 **model**|**NDCG@1**|**NDCG@10**|**NDCG@100**|**comment**
 :-----:|:-----:|:-----:|:-----:|:-----:
 bi-encoder_msmarco_bert-base_german (new) | 0.5300 <br /> 🏆 | 0.7196 <br /> 🏆 | 0.7360 <br /> 🏆 | "OUR model"
-[deepset/gbert-base-germandpr-
+[deepset/gbert-base-germandpr-X_encoder](https://huggingface.co/deepset/gbert-base-germandpr-ctx_encoder) | 0.4828 | 0.6970 | 0.7147 | "has two encoder models (one for queries and one for corpus), is SOTA approach"
 [distiluse-base-multilingual-cased-v1](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v1) | 0.4561 | 0.6347 | 0.6613 | "trained on 15 languages"
 [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.4511 | 0.6328 | 0.6592 | "trained on huge corpus, support for 50+ languages"
 [distiluse-base-multilingual-cased-v2](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2) | 0.4350 | 0.6103 | 0.6411 | "trained on 50+ languages"
@@ -195,10 +195,15 @@ bi-encoder_msmarco_bert-base_german (new) | 0.5300 <br /> 🏆 | 0.7196 <br />
 [BM25](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html#bm25) | 0.3196 | 0.5377 | 0.5740 | "lexical approach"

 **It is crucial to understand that the comparisons are also made with models based on other transformer approaches.**
-For example, in particular [deepset/gbert-base-germandpr-X](https://huggingface.co/deepset/gbert-base-germandpr-ctx_encoder) is theoretically a more up-to-date approach that is nevertheless beaten.
 A direct comparison based on the same approach can be made with [svalabs/bi-electra-ms-marco-german-uncased](https://huggingface.co/svalabs/bi-electra-ms-marco-german-uncased).
 In this case, the model presented here outperforms its predecessor by up to 14 percentage points.

+Comparing with [deepset/gbert-base-germandpr-X_encoder](https://huggingface.co/deepset/gbert-base-germandpr-ctx_encoder) is theoretically a little unfair, since deepset's approach uses two models at the same time.
+Queries and passages are encoded separately, which leads to better contextualization.
+Still, our newly trained model outperforms this approach by around two percentage points.
+In addition, using two models at the same time increases demands on memory and CPU power, which leads to higher costs.
+This makes the approach presented here even more valuable.
+
 Note:
 - Texts used for evaluation are sometimes very long. All models, except for the BM25 approach, truncate the incoming texts at some point. This can decrease performance.
 - Evaluation of deepset's gbert-base-germandpr model might give an incorrect impression. The model was originally trained on the data we used for evaluation (not 1:1, but almost).
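For anyone who wants to reproduce NDCG@k numbers like those in the table above, the following is a minimal sketch using sentence-transformers' `InformationRetrievalEvaluator`. The model id, the toy queries/corpus, and the relevance judgments are placeholders for illustration, not the actual evaluation setup behind the README scores.

```python
# Hedged sketch: NDCG@k retrieval evaluation with sentence-transformers.
# The model id and the tiny query/corpus/relevance data are placeholders;
# the scores in the table were computed on a real German retrieval dataset.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Assumed id/path of the bi-encoder from the table; adjust to the actual repository name.
model = SentenceTransformer("bi-encoder_msmarco_bert-base_german")

# BERT-base bi-encoders truncate inputs at max_seq_length tokens,
# which is why very long passages can hurt retrieval quality (see the note above).
print("max_seq_length:", model.max_seq_length)

queries = {"q1": "Wie hoch ist die Zugspitze?"}
corpus = {
    "d1": "Die Zugspitze ist mit 2962 Metern der höchste Berg Deutschlands.",
    "d2": "Der Rhein ist ein Fluss in Mitteleuropa.",
    "d3": "Berlin ist die Hauptstadt Deutschlands.",
}
relevant_docs = {"q1": {"d1"}}  # ground-truth relevance judgments per query

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    ndcg_at_k=[1, 10, 100],  # the cut-offs reported in the table
    name="german-retrieval-sanity-check",
)
evaluator(model, output_path=".")  # per-metric results are written to a CSV file
```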
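To make the "one model vs. two models" comparison above concrete: with the bi-encoder presented here, a single model encodes both queries and passages, whereas a DPR-style setup like deepset's gbert-base-germandpr keeps a separate question encoder and context encoder in memory at the same time. Below is a hedged sketch of the single-model retrieval path; the model id and example texts are assumptions for illustration.

```python
# Sketch: semantic retrieval with a single bi-encoder (one model for queries AND passages).
# A DPR-style setup would load two encoders at once (a question encoder alongside
# deepset/gbert-base-germandpr-ctx_encoder), roughly doubling the model footprint;
# that is the memory/CPU overhead mentioned above.
from sentence_transformers import SentenceTransformer, util

# Assumed id/path of the bi-encoder presented in this README.
model = SentenceTransformer("bi-encoder_msmarco_bert-base_german")

passages = [
    "Die Zugspitze ist mit 2962 Metern der höchste Berg Deutschlands.",
    "Der Rhein ist ein Fluss in Mitteleuropa.",
]
query = "Wie hoch ist der höchste Berg Deutschlands?"

# The same encoder handles both sides of the retrieval task.
passage_emb = model.encode(passages, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Rank passages by cosine similarity to the query.
scores = util.cos_sim(query_emb, passage_emb)[0]
best = int(scores.argmax())
print(f"best passage ({float(scores[best]):.3f}): {passages[best]}")
```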