thenlper committed
Commit 47afa7f
1 Parent(s): a172bbf

Update README.md

Files changed (1):
  1. README.md +14 -13
README.md CHANGED
@@ -2616,7 +2616,7 @@ The models are built upon the `transformer++` encoder [backbone](https://hugging
 The `gte-v1.5` series achieves state-of-the-art scores on the MTEB benchmark within the same model size category and provides competitive results on the LoCo long-context retrieval tests (refer to [Evaluation](#evaluation)).
 
 We also present the [`gte-Qwen1.5-7B-instruct`](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct),
-a SOTA instruction-tuned bilingual embedding model that ranked 2nd in MTEB and 1st in C-MTEB.
+a SOTA instruction-tuned multilingual embedding model that ranked 2nd in MTEB and 1st in C-MTEB.
 
 <!-- Provide a longer summary of what this model is. -->
 
@@ -2630,7 +2630,7 @@ a SOTA instruction-tuned bilingual embedding model that ranked 2nd in MTEB and 1
 
 | Models | Language | Model Size (Million Parameters) | Max Seq. Length | Dimension | MTEB-en | LoCo |
 |:-----: | :-----: |:-----: |:-----: |:-----: | :-----: | :-----: |
-|[`gte-Qwen1.5-7B-instruct`](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct)| Chinese, English | 7720 | 32768 | 4096 | 67.34 | 87.57 |
+|[`gte-Qwen1.5-7B-instruct`](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct)| Multilingual | 7720 | 32768 | 4096 | 67.34 | 87.57 |
 |[`gte-large-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5) | English | 434 | 8192 | 1024 | 65.39 | 86.71 |
 |[`gte-base-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) | English | 137 | 8192 | 768 | 64.11 | 87.44 |
 
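For reference, a minimal sketch of using one of the `-en-v1.5` models from the table above to embed text and score the results with cosine similarity (assumes a recent `sentence-transformers` release with `trust_remote_code` support; the example sentences are illustrative):

```python
# Minimal usage sketch (illustrative): embed two sentences with gte-large-en-v1.5
# and compare them with cosine similarity. trust_remote_code=True is required
# because the v1.5 models ship a custom encoder implementation.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

sentences = [
    "what is the capital of China?",
    "Beijing is the capital of China.",
]
embeddings = model.encode(sentences)          # shape (2, 1024) for the large model
print(cos_sim(embeddings[0], embeddings[1]))  # higher score = more similar
```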
@@ -2691,8 +2691,8 @@ print(cos_sim(embeddings[0], embeddings[1]))
 ### Training Data
 
 - Masked language modeling (MLM): `c4-en`
-- Weak-supervised contrastive (WSC) pre-training: GTE pre-training data
-- Supervised contrastive fine-tuning: GTE fine-tuning data
+- Weak-supervised contrastive (WSC) pre-training: [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
+- Supervised contrastive fine-tuning: [GTE](https://arxiv.org/pdf/2308.03281.pdf) fine-tuning data
 
 ### Training Procedure
 
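To make the two contrastive stages above concrete, here is a generic in-batch InfoNCE sketch; it is illustrative only (not the GTE training code), and the temperature value is an arbitrary choice:

```python
# Generic in-batch contrastive (InfoNCE) objective of the kind used for
# weakly-supervised pre-training and supervised fine-tuning of text embedders.
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor, doc_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """query_emb, doc_emb: (batch, dim); row i of doc_emb is the positive for
    query i, and the other rows in the batch serve as in-batch negatives."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature                      # scaled cosine similarities
    labels = torch.arange(q.size(0), device=q.device)   # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Example with random tensors standing in for encoder outputs:
loss = info_nce_loss(torch.randn(8, 768), torch.randn(8, 768))
```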
@@ -2737,14 +2737,15 @@ The gte evaluation setting: `mteb==1.2.0, fp16 auto mix precision, max_length=81
 
 
 
-## Citation [TODO]
+## Citation
 
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+If you find our paper or models helpful, please consider citing them as follows:
 
-**BibTeX:**
-
-[More Information Needed]
-
-**APA:**
-
-[More Information Needed]
+```
+@article{li2023towards,
+  title={Towards general text embeddings with multi-stage contrastive learning},
+  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
+  journal={arXiv preprint arXiv:2308.03281},
+  year={2023}
+}
+```