This model was first pretrained on the BEIR corpus and then fine-tuned on the MS MARCO dataset, following the approach described in the paper COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning. The associated GitHub repository is available at https://github.com/OpenMatch/COCO-DR.
The model uses BERT-large as its backbone, with 335M parameters. See the paper https://arxiv.org/abs/2210.15212 for details.
## Usage
Pre-trained models can be loaded through the HuggingFace transformers library:
```python
from transformers import AutoModel, AutoTokenizer

# Load the COCO-DR encoder and its tokenizer from the Hugging Face Hub
model = AutoModel.from_pretrained("OpenMatch/cocodr-large-msmarco")
tokenizer = AutoTokenizer.from_pretrained("OpenMatch/cocodr-large-msmarco")
```
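To use the model for retrieval, encode queries and passages into dense embeddings and rank passages by similarity. The following is a minimal sketch assuming [CLS]-token pooling and dot-product scoring, a common setup for dense retrievers of this family; the example query, passages, and sequence lengths are illustrative placeholders, so consult the COCO-DR repository for the exact encoding and scoring configuration.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("OpenMatch/cocodr-large-msmarco")
tokenizer = AutoTokenizer.from_pretrained("OpenMatch/cocodr-large-msmarco")
model.eval()

# Hypothetical query and passages for illustration
query = "what is dense retrieval"
passages = [
    "Dense retrieval encodes queries and documents as vectors and matches them by similarity.",
    "The Eiffel Tower is located in Paris, France.",
]

with torch.no_grad():
    q_inputs = tokenizer(query, return_tensors="pt", truncation=True, max_length=64)
    p_inputs = tokenizer(passages, return_tensors="pt", truncation=True,
                         max_length=128, padding=True)
    # Use the [CLS] token representation as the sequence embedding
    q_emb = p = model(**q_inputs).last_hidden_state[:, 0]  # shape: (1, 1024)
    p_emb = model(**p_inputs).last_hidden_state[:, 0]      # shape: (2, 1024)

# Rank passages by dot-product similarity with the query
scores = q_emb @ p_emb.T  # shape: (1, 2); higher score = more relevant
print(scores)
```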