Update README.md
README.md CHANGED
@@ -186,7 +186,7 @@ The following table shows the evaluation results for different approaches and models
 **model**|**NDCG@1**|**NDCG@10**|**NDCG@100**|**comment**
 :-----:|:-----:|:-----:|:-----:|:-----:
 bi-encoder_msmarco_bert-base_german (new) | 0.5300 <br /> 🏆 | 0.7196 <br /> 🏆 | 0.7360 <br /> 🏆 | "OUR model"
-[deepset/gbert-base-germandpr-
+[deepset/gbert-base-germandpr-X_encoder](https://huggingface.co/deepset/gbert-base-germandpr-ctx_encoder) | 0.4828 | 0.6970 | 0.7147 | "has two encoder models (one for queries and one for corpus), is SOTA approach"
 [distiluse-base-multilingual-cased-v1](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v1) | 0.4561 | 0.6347 | 0.6613 | "trained on 15 languages"
 [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 0.4511 | 0.6328 | 0.6592 | "trained on huge corpus, support for 50+ languages"
 [distiluse-base-multilingual-cased-v2](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2) | 0.4350 | 0.6103 | 0.6411 | "trained on 50+ languages"
@@ -195,10 +195,15 @@ bi-encoder_msmarco_bert-base_german (new) | 0.5300 <br /> 🏆 | 0.7196 <br />
 [BM25](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html#bm25) | 0.3196 | 0.5377 | 0.5740 | "lexical approach"

 **It is crucial to understand that the comparisons are also made with models based on other transformer approaches.**
-For example, in particular [deepset/gbert-base-germandpr-X](https://huggingface.co/deepset/gbert-base-germandpr-ctx_encoder) is theoretically a more up-to-date approach that is nevertheless beaten.
 A direct comparison based on the same approach can be made with [svalabs/bi-electra-ms-marco-german-uncased](https://huggingface.co/svalabs/bi-electra-ms-marco-german-uncased).
 In this case, the model presented here outperforms its predecessor by up to 14 percentage points.

+Comparing with [deepset/gbert-base-germandpr-X_encoder](https://huggingface.co/deepset/gbert-base-germandpr-ctx_encoder) is theoretically a little unfair, since deepset's approach uses two models at the same time.
+Queries and passages are encoded separately, which leads to better contextualization.
+Still, our newly trained model outperforms this approach by around two percentage points.
+In addition, using two models at the same time increases demands on memory and CPU power, which leads to higher costs.
+This makes the approach presented here even more valuable.
+
 Note:
 - Texts used for evaluation are sometimes very long. All models, except for the BM25 approach, truncate the incoming texts at some point. This can decrease performance.
 - Evaluation of deepset's gbert-base-germandpr model might give an incorrect impression. The model was originally trained on the data we used for evaluation (not 1:1, but almost).
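For anyone who wants to reproduce NDCG@k numbers like those in the table above, the following is a minimal sketch using sentence-transformers' `InformationRetrievalEvaluator`. The model id, the toy queries/corpus, and the relevance judgments are placeholders for illustration, not the actual evaluation setup behind the README scores.

```python
# Hedged sketch: NDCG@k retrieval evaluation with sentence-transformers.
# The model id and the tiny query/corpus/relevance data are placeholders;
# the scores in the table were computed on a real German retrieval dataset.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Assumed id/path of the bi-encoder from the table; adjust to the actual repository name.
model = SentenceTransformer("bi-encoder_msmarco_bert-base_german")

# BERT-base bi-encoders truncate inputs at max_seq_length tokens,
# which is why very long passages can hurt retrieval quality (see the note above).
print("max_seq_length:", model.max_seq_length)

queries = {"q1": "Wie hoch ist die Zugspitze?"}
corpus = {
    "d1": "Die Zugspitze ist mit 2962 Metern der höchste Berg Deutschlands.",
    "d2": "Der Rhein ist ein Fluss in Mitteleuropa.",
    "d3": "Berlin ist die Hauptstadt Deutschlands.",
}
relevant_docs = {"q1": {"d1"}}  # ground-truth relevance judgments per query

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    ndcg_at_k=[1, 10, 100],  # the cut-offs reported in the table
    name="german-retrieval-sanity-check",
)
evaluator(model, output_path=".")  # per-metric results are written to a CSV file
```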
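To make the "one model vs. two models" comparison above concrete: with the bi-encoder presented here, a single model encodes both queries and passages, whereas a DPR-style setup like deepset's gbert-base-germandpr keeps a separate question encoder and context encoder in memory at the same time. Below is a hedged sketch of the single-model retrieval path; the model id and example texts are assumptions for illustration.

```python
# Sketch: semantic retrieval with a single bi-encoder (one model for queries AND passages).
# A DPR-style setup would load two encoders at once (a question encoder alongside
# deepset/gbert-base-germandpr-ctx_encoder), roughly doubling the model footprint;
# that is the memory/CPU overhead mentioned above.
from sentence_transformers import SentenceTransformer, util

# Assumed id/path of the bi-encoder presented in this README.
model = SentenceTransformer("bi-encoder_msmarco_bert-base_german")

passages = [
    "Die Zugspitze ist mit 2962 Metern der höchste Berg Deutschlands.",
    "Der Rhein ist ein Fluss in Mitteleuropa.",
]
query = "Wie hoch ist der höchste Berg Deutschlands?"

# The same encoder handles both sides of the retrieval task.
passage_emb = model.encode(passages, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Rank passages by cosine similarity to the query.
scores = util.cos_sim(query_emb, passage_emb)[0]
best = int(scores.argmax())
print(f"best passage ({float(scores[best]):.3f}): {passages[best]}")
```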