Linq-AI-Research/Linq-Embed-Mistral

Junseong committed on May 29

Commit e563c48 • 1 Parent(s): ff8be30

MOD: README.md

Files changed (1)

README.md +2 -1

@@ -11,7 +11,8 @@ license: cc-by-nc-4.0
 Linq-Embed-Mistral has been developed by building upon the foundations of the [E5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct) and [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) models. We focus on improving text retrieval using advanced data refinement methods, including sophisticated data crafting, data filtering, and negative mining techniques. These methods are applied to both existing benchmark datasets and highly tailored synthetic datasets generated via LLMs. To enhance the quality of the synthetic data, we employ extensive prompt engineering and guidance from teacher models, ensuring these methods are specifically tailored to each task. Our efforts primarily aim to create high-quality triplet datasets (query, positive example, negative example), significantly improving text retrieval performance.
-Linq-Embed-Mistral excels in retrieval tasks, ranking `1st` among all models listed on the MTEB leaderboard with a performance score of `xxx`. The model performs well in the MTEB benchmarks, achieving an average score of `yyy` across 56 datasets. This performance ranks it 2nd among publicly accessible models on the MTEB leaderboard and 3rd overall among all evaluated models.
+Linq-Embed-Mistral performs well in the MTEB benchmarks. The model excels in retrieval tasks, ranking <ins>**`1st`**</ins> among all models listed on the MTEB leaderboard with a performance score of <ins>**`xxx`**</ins>. This outstanding performance underscores its superior capability in enhancing search precision and reliability. The model achieves an average score of <ins>**`yyy`**</ins> across 56 datasets in the MTEB benchmarks, making it the highest-ranking publicly accessible model and third overall.
 This project is for research purposes only. Third-party datasets may be subject to additional terms and conditions under their associated licenses. Please refer to specific papers for more details: