Update README.md
README.md (CHANGED)
@@ -2616,7 +2616,7 @@ The models are built upon the `transformer++` encoder [backbone](https://hugging
 The `gte-v1.5` series achieves state-of-the-art scores on the MTEB benchmark within the same model size category and provides competitive results on the LoCo long-context retrieval tests (refer to [Evaluation](#evaluation)).
 
 We also present the [`gte-Qwen1.5-7B-instruct`](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct),
-a SOTA instruction-tuned bilingual embedding model that ranked 2nd in MTEB and 1st in C-MTEB.
+a SOTA instruction-tuned multi-lingual embedding model that ranked 2nd in MTEB and 1st in C-MTEB.
 
 <!-- Provide a longer summary of what this model is. -->
 
@@ -2630,7 +2630,7 @@ a SOTA instruction-tuned bilingual embedding model that ranked 2nd in MTEB and 1st in C-MTEB.
 
 | Models | Language | Model Size (Million Parameters) | Max Seq. Length | Dimension | MTEB-en | LoCo |
 |:-----: | :-----: |:-----: |:-----: |:-----: | :-----: | :-----: |
-|[`gte-Qwen1.5-7B-instruct`](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct)|
+|[`gte-Qwen1.5-7B-instruct`](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct)| Multilingual | 7720 | 32768 | 4096 | 67.34 | 87.57 |
 |[`gte-large-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5) | English | 434 | 8192 | 1024 | 65.39 | 86.71 |
 |[`gte-base-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) | English | 137 | 8192 | 768 | 64.11 | 87.44 |
 
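
The hunk above fills in the `gte-Qwen1.5-7B-instruct` row of the model table; the surrounding README demonstrates usage ending in `print(cos_sim(embeddings[0], embeddings[1]))`, as visible in the next hunk's header. Below is a minimal sketch of that pattern with one of the smaller models from the table. The model name and output dimension come from the table; the use of `sentence_transformers` and `trust_remote_code=True` is an assumption about the loading path, not taken verbatim from this diff.

```python
# Minimal sketch (assumption): encode two sentences with a gte-v1.5 checkpoint
# from the table above and compare them, mirroring the README's cos_sim example.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# trust_remote_code=True is assumed because the gte-v1.5 models ship a custom encoder backbone.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)

sentences = ["what is the capital of China?", "how to implement quick sort in python?"]
embeddings = model.encode(sentences)  # shape (2, 768) for gte-base-en-v1.5, per the table

print(cos_sim(embeddings[0], embeddings[1]))
```
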
@@ -2691,8 +2691,8 @@ print(cos_sim(embeddings[0], embeddings[1]))
 ### Training Data
 
 - Masked language modeling (MLM): `c4-en`
-- Weak-supervised contrastive (WSC) pre-training: GTE pre-training data
-- Supervised contrastive fine-tuning: GTE fine-tuning data
+- Weak-supervised contrastive (WSC) pre-training: [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
+- Supervised contrastive fine-tuning: [GTE](https://arxiv.org/pdf/2308.03281.pdf) fine-tuning data
 
 ### Training Procedure
 
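
The training-data hunk above distinguishes an MLM stage from two contrastive stages (weakly supervised pre-training and supervised fine-tuning). As a rough illustration of what "contrastive" means here, the sketch below shows a generic in-batch-negatives InfoNCE loss of the kind commonly used for text-embedding training. It is not the authors' training code, and the temperature value is an arbitrary placeholder.

```python
# Generic in-batch-negatives contrastive (InfoNCE) loss -- illustrative only,
# not the GTE training implementation.
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor, doc_emb: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """query_emb, doc_emb: (batch, dim); row i of doc_emb is the positive for row i of query_emb."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature                      # (batch, batch) cosine-similarity logits
    labels = torch.arange(q.size(0), device=q.device)   # diagonal entries are the positive pairs
    return F.cross_entropy(logits, labels)

# toy usage with random embeddings
loss = info_nce_loss(torch.randn(8, 768), torch.randn(8, 768))
```
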
@@ -2737,14 +2737,15 @@ The gte evaluation setting: `mteb==1.2.0, fp16 auto mix precision, max_length=8192`
 
 
 
 ## Citation
 
+If you find our paper or models helpful, please consider citing them as follows:
+
+```
+@article{li2023towards,
+  title={Towards general text embeddings with multi-stage contrastive learning},
+  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
+  journal={arXiv preprint arXiv:2308.03281},
+  year={2023}
+}
+```
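
The header of the hunk above records the evaluation setting (`mteb==1.2.0`, fp16 auto mixed precision, `max_length=8192`). Below is a minimal sketch of scoring a checkpoint with that library; the chosen task and output folder are placeholders, and the exact task list behind the reported MTEB-en numbers is not stated in this diff.

```python
# Sketch (assumption): run one MTEB task against a gte checkpoint with mteb==1.2.0.
# Task choice and output path are placeholders, not the full benchmark suite.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)

evaluation = MTEB(tasks=["STSBenchmark"])
evaluation.run(model, output_folder="results/gte-base-en-v1.5")
```
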