izhx committed
Commit a8e4f3e
1 Parent(s): 269b9ac

Update README.md

Files changed (1): README.md (+22, -11)
README.md CHANGED
@@ -2622,7 +2622,8 @@ a SOTA instruction-tuned multi-lingual embedding model that ranked 2nd in MTEB a
 
 - **Developed by:** Institute for Intelligent Computing, Alibaba Group
 - **Model type:** Text Embeddings
-- **Paper:** Coming soon.
+- **Paper:** [mGTE: Generalized Long-Context Text Representation and Reranking
+Models for Multilingual Text Retrieval](https://arxiv.org/pdf/2407.19669)
 
 <!-- - **Demo [optional]:** [More Information Needed] -->
 
@@ -2717,7 +2718,7 @@ console.log(similarities); // [34.504930869007296, 64.03973265120138, 19.5200426
 ### Training Data
 
 - Masked language modeling (MLM): `c4-en`
-- Weak-supervised contrastive (WSC) pre-training: [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
+- Weak-supervised contrastive pre-training (CPT): [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
 - Supervised contrastive fine-tuning: [GTE](https://arxiv.org/pdf/2308.03281.pdf) fine-tuning data
 
 ### Training Procedure
@@ -2728,8 +2729,8 @@ And then, we resample the data, reducing the proportion of short texts, and cont
 
 The entire training process is as follows:
 - MLM-2048: lr 5e-4, mlm_probability 0.3, batch_size 4096, num_steps 70000, rope_base 10000
-- MLM-8192: lr 5e-5, mlm_probability 0.3, batch_size 1024, num_steps 20000, rope_base 500000
-- WSC: max_len 512, lr 2e-4, batch_size 32768, num_steps 100000
+- [MLM-8192](https://huggingface.co/Alibaba-NLP/gte-en-mlm-base): lr 5e-5, mlm_probability 0.3, batch_size 1024, num_steps 20000, rope_base 500000
+- CPT: max_len 512, lr 2e-4, batch_size 32768, num_steps 100000
 - Fine-tuning: TODO
 
 
@@ -2766,12 +2767,22 @@ The gte evaluation setting: `mteb==1.2.0, fp16 auto mix precision, max_length=81
 If you find our paper or models helpful, please consider citing them as follows:
 
 ```
-@article{li2023towards,
-title={Towards general text embeddings with multi-stage contrastive learning},
-author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
-journal={arXiv preprint arXiv:2308.03281},
-year={2023}
+@misc{zhang2024mgte,
+title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
+author={Xin Zhang and Yanzhao Zhang and Dingkun Long and Wen Xie and Ziqi Dai and Jialong Tang and Huan Lin and Baosong Yang and Pengjun Xie and Fei Huang and Meishan Zhang and Wenjie Li and Min Zhang},
+year={2024},
+eprint={2407.19669},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2407.19669},
+}
+@misc{li2023gte,
+title={Towards General Text Embeddings with Multi-stage Contrastive Learning},
+author={Zehan Li and Xin Zhang and Yanzhao Zhang and Dingkun Long and Pengjun Xie and Meishan Zhang},
+year={2023},
+eprint={2308.03281},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2308.03281},
 }
 ```
-
-
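
For context on the training-procedure bullets in the hunk above, here is a minimal sketch of how the MLM-8192 stage hyperparameters (lr 5e-5, mlm_probability 0.3, effective batch_size 1024, 20000 steps, max_length 8192) could map onto a standard Hugging Face masked-language-modeling run. This is illustrative only: the checkpoint name is the one linked in the diff, but the `Trainer` wiring, dataset slice, and per-device batch split are assumptions, not the authors' actual pre-training code.

```python
# Sketch of the MLM-8192 stage using the listed hyperparameters (assumed setup, not official code).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Checkpoint linked from the MLM-8192 bullet; the GTE repos ship custom modeling code.
model_name = "Alibaba-NLP/gte-en-mlm-base"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(model_name, trust_remote_code=True)

# "c4-en" is the MLM corpus named under Training Data; stream it to avoid a full download.
dataset = load_dataset("allenai/c4", "en", split="train", streaming=True)

def tokenize(batch):
    # max_length 8192 matches the MLM-8192 stage; rope_base 500000 is a model-config
    # detail assumed to already be baked into the released checkpoint.
    return tokenizer(batch["text"], truncation=True, max_length=8192)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text", "timestamp", "url"])

# mlm_probability 0.3 comes directly from the training-procedure list.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.3)

# lr 5e-5, effective batch_size 1024, num_steps 20000; the per-device batch size and
# gradient-accumulation split below are arbitrary assumptions for a single machine.
args = TrainingArguments(
    output_dir="mlm-8192-sketch",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=128,  # 8 * 128 = 1024 sequences per optimizer step
    max_steps=20_000,
    bf16=True,
    logging_steps=100,
)

trainer = Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator)
# trainer.train()  # left commented out: this stage realistically needs multi-GPU infrastructure
```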