Asking for an embedding model that supports both English and Chinese

by phamvantoan - opened


I am working on a RAG (Retrieval-Augmented Generation) application, and my local documents include both English and Chinese.

When I tested bge-large-zh-v1.5 and bge-large-en-v1.5, each worked well for its respective language (Zh / En).

I also tried bge-reranker-large as an embedding model, but it doesn't work well for either English or Chinese. However, it works like a charm as a reranker.

So, do you have any plans to release an embedding model that works well for both English and Chinese? If so, what is the ETA?

Beijing Academy of Artificial Intelligence org

Hi, thanks for your interest in our work!
The reranker model directly computes a relevance score for a query-passage pair, and it cannot be used to map text into an embedding.
We plan to release a new multilingual model in January.
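
To illustrate the difference between the two model types, here is a minimal sketch of a retrieve-then-rerank pipeline. It is not our actual implementation: `retrieve_then_rerank`, `toy_embed`, and `toy_score_pair` are hypothetical names, and the toy functions are simple stand-ins so the sketch runs without downloading any model. The point is only the shape of the two interfaces: an embedding model takes one text and returns a vector, while a reranker takes a (query, passage) pair and returns a single score.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_then_rerank(
    query: str,
    passages: List[str],
    embed: Callable[[str], Sequence[float]],
    score_pair: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    # Stage 1 (embedding model): every text is encoded on its own, so passage
    # vectors could be precomputed and stored in a vector index.
    q_vec = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q_vec, embed(p)), reverse=True)
    candidates = ranked[:top_k]

    # Stage 2 (reranker): query and passage are scored together. The output is
    # one number per pair; no per-text vector is produced at this stage.
    scored = [(p, score_pair(query, p)) for p in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_embed(text: str, dim: int = 16) -> List[float]:
    """Hash each whitespace token into one of `dim` buckets and count hits."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        vec[sum(ord(c) for c in tok) % dim] += 1.0
    return vec


def toy_score_pair(query: str, passage: str) -> float:
    """Score a pair by the number of distinct words the two texts share."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


if __name__ == "__main__":
    docs = [
        "embedding models map text to vectors",
        "rerankers score query passage pairs",
        "the weather is sunny today",
    ]
    for passage, score in retrieve_then_rerank(
        "how do rerankers score a query", docs, toy_embed, toy_score_pair, top_k=2
    ):
        print(f"{score:.1f}  {passage}")
```

In a real setup you would replace `toy_embed` with the `encode` method of an embedding model such as bge-large-en-v1.5 or bge-large-zh-v1.5, and `toy_score_pair` with a wrapper around the reranker's pair-scoring call. Check the exact method names against the FlagEmbedding documentation before relying on them.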

Thank you for your quick response!

Hope to see the new model in January!
