juanluisdb committed on
Commit 5c9c5e5
1 Parent(s): 9837751

Update README.md

Files changed (1)
  1. README.md +27 -27
README.md CHANGED
@@ -45,37 +45,37 @@ scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2') , ('Que
  ### BEIR (NDCG@10)
  I've run tests on different BEIR datasets. Cross Encoders rerank top100 BM25 results.
 
- | | bm25 | jina-reranker-v1-turbo-en | bge-reranker-v2-m3 | mxbai-rerank-base-v1 | ms-marco-MiniLM-L-6-v2 | MiniLM-L-6-rerank-refreshed-ablated | MiniLM-L-6-rerank-refreshed |
- |:---------------|-------:|----------------------------:|:---------------------|:-----------------------|-------------------------:|--------------------------------------:|:------------------------------|
- | nq | 0.305 | 0.533 | **0.597** | 0.535 | 0.523 | 0.541 | 0.580 |
- | fever | 0.638 | 0.852 | 0.857 | 0.767 | 0.801 | 0.822 | **0.867** |
- | fiqa | 0.238 | 0.336 | **0.397** | 0.382 | 0.349 | 0.36 | 0.364 |
- | trec-covid | 0.589 | 0.774 | 0.784 | **0.830** | 0.741 | 0.733 | 0.738 |
- | scidocs | 0.15 | 0.166 | 0.169 | **0.171** | 0.164 | 0.163 | 0.165 |
- | scifact | 0.676 | 0.739 | 0.731 | 0.719 | 0.688 | 0.738 | **0.750** |
- | nfcorpus | 0.318 | 0.353 | 0.336 | **0.353** | 0.349 | 0.35 | 0.350 |
- | hotpotqa | 0.629 | 0.745 | **0.794** | 0.668 | 0.724 | 0.758 | 0.775 |
- | dbpedia-entity | 0.319 | 0.421 | **0.445** | 0.416 | 0.445 | 0.438 | 0.444 |
- | quora | 0.787 | 0.858 | 0.858 | 0.747 | 0.825 | 0.862 | **0.871** |
- | climate-fever | 0.163 | 0.233 | **0.314** | 0.253 | 0.244 | 0.245 | 0.309 |
-
- | | nq* | fever* | fiqa | trec-covid | scidocs | scifact | nfcorpus | hotpotqa | dbpedia-entity | quora | climate-fever |
- |:--------------------------|:----------|:----------|:----------|:-------------|:----------|:----------|:-----------|:-----------|:-----------------|:----------|:----------------|
- | bm25 | 0.305 | 0.638 | 0.238 | 0.589 | 0.150 | 0.676 | 0.318 | 0.629 | 0.319 | 0.787 | 0.163 |
- | jina-reranker-v1-turbo-en | 0.533 | 0.852 | 0.336 | 0.774 | 0.166 | 0.739 | 0.353 | 0.745 | 0.421 | 0.858 | 0.233 |
- | bge-reranker-v2-m3 | **0.597** | 0.857 | **0.397** | 0.784 | 0.169 | 0.731 | 0.336 | **0.794** | **0.445** | 0.858 | **0.314** |
- | mxbai-rerank-base-v1 | 0.535 | 0.767 | 0.382 | **0.830** | **0.171** | 0.719 | **0.353** | 0.668 | 0.416 | 0.747 | 0.253 |
- | ms-marco-MiniLM-L-6-v2 | 0.523 | 0.801 | 0.349 | 0.741 | 0.164 | 0.688 | 0.349 | 0.724 | 0.445 | 0.825 | 0.244 |
- | MiniLM-L-6-rerank-reborn | 0.580 | **0.867** | 0.364 | 0.738 | 0.165 | **0.750** | 0.350 | 0.775 | 0.444 | **0.871** | 0.309 |
+ | | bm25 | jina-reranker-v1-turbo-en | bge-reranker-v2-m3 | mxbai-rerank-base-v1 | ms-marco-MiniLM-L-6-v2 | MiniLM-L-6-rerank-refreshed |
+ |:---------------|-------:|----------------------------:|:---------------------|:-----------------------|-------------------------:|:------------------------------|
+ | nq* | 0.305 | 0.533 | **0.597** | 0.535 | 0.523 | 0.580 |
+ | fever* | 0.638 | 0.852 | 0.857 | 0.767 | 0.801 | **0.867** |
+ | fiqa | 0.238 | 0.336 | **0.397** | 0.382 | 0.349 | 0.364 |
+ | trec-covid | 0.589 | 0.774 | 0.784 | **0.830** | 0.741 | 0.738 |
+ | scidocs | 0.15 | 0.166 | 0.169 | **0.171** | 0.164 | 0.165 |
+ | scifact | 0.676 | 0.739 | 0.731 | 0.719 | 0.688 | **0.750** |
+ | nfcorpus | 0.318 | 0.353 | 0.336 | **0.353** | 0.349 | 0.350 |
+ | hotpotqa | 0.629 | 0.745 | **0.794** | 0.668 | 0.724 | 0.775 |
+ | dbpedia-entity | 0.319 | 0.421 | **0.445** | 0.416 | 0.445 | 0.444 |
+ | quora | 0.787 | 0.858 | 0.858 | 0.747 | 0.825 | **0.871** |
+ | climate-fever | 0.163 | 0.233 | **0.314** | 0.253 | 0.244 | 0.309 |
 
  \* Training splits of NQ and Fever were used as part of the training data.
 
  Comparison with [ablated model](https://huggingface.co/juanluisdb/MiniLM-L-6-rerank-reborn-ablated/settings) trained only on MSMarco:
- | | nq | fever | fiqa | trec-covid | scidocs | scifact | nfcorpus | hotpotqa | dbpedia-entity | quora | climate-fever |
- |:------------------------------------|-------:|--------:|-------:|-------------:|----------:|----------:|-----------:|-----------:|-----------------:|--------:|----------------:|
- | ms-marco-MiniLM-L-6-v2 | 0.5234 | 0.8007 | 0.349 | 0.741 | 0.1638 | 0.688 | 0.3493 | 0.7235 | 0.4445 | 0.8251 | 0.2438 |
- | MiniLM-L-6-rerank-refreshed-ablated | 0.5412 | 0.8221 | 0.3598 | 0.7331 | 0.163 | 0.7376 | 0.3495 | 0.7583 | 0.4382 | 0.8619 | 0.2449 |
- | improvement (%) | **3.40** | **2.67** | **3.08** | -1.07 | -0.47 | **7.22** | 0.08 | **4.80** | -1.41 | **4.45** | **0.47** |
+
+ | | ms-marco-MiniLM-L-6-v2 | MiniLM-L-6-rerank-refreshed-ablated |
+ |:---------------|-------------------------:|--------------------------------------:|
+ | nq | 0.5234 | **0.5412** |
+ | fever | 0.8007 | **0.8221** |
+ | fiqa | 0.349 | **0.3598** |
+ | trec-covid | **0.741** | 0.7331 |
+ | scidocs | **0.1638** | 0.163 |
+ | scifact | 0.688 | **0.7376** |
+ | nfcorpus | 0.3493 | **0.3495** |
+ | hotpotqa | 0.7235 | **0.7583** |
+ | dbpedia-entity | **0.4445** | 0.4382 |
+ | quora | 0.8251 | **0.8619** |
+ | climate-fever | 0.2438 | **0.2449** |
 
 
  # Datasets Used
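
Note on the `improvement (%)` row that this change removes: it can be recomputed from the two columns of the ablation comparison. The snippet below is a minimal sketch, assuming the row was the relative change of `MiniLM-L-6-rerank-refreshed-ablated` over `ms-marco-MiniLM-L-6-v2`. The helper name `relative_change_pct` and the dictionary names are hypothetical and do not appear in the README. The scores are the rounded NDCG@10 values from the diff.

```python
# NDCG@10 scores copied from the ablation comparison in the diff (rounded as published).
baseline = {  # ms-marco-MiniLM-L-6-v2
    "nq": 0.5234, "fever": 0.8007, "fiqa": 0.349, "trec-covid": 0.741,
    "scidocs": 0.1638, "scifact": 0.688, "nfcorpus": 0.3493,
    "hotpotqa": 0.7235, "dbpedia-entity": 0.4445, "quora": 0.8251,
    "climate-fever": 0.2438,
}
ablated = {  # MiniLM-L-6-rerank-refreshed-ablated
    "nq": 0.5412, "fever": 0.8221, "fiqa": 0.3598, "trec-covid": 0.7331,
    "scidocs": 0.163, "scifact": 0.7376, "nfcorpus": 0.3495,
    "hotpotqa": 0.7583, "dbpedia-entity": 0.4382, "quora": 0.8619,
    "climate-fever": 0.2449,
}


def relative_change_pct(old: float, new: float) -> float:
    """Relative change from `old` to `new`, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Print one signed percentage per dataset, in table order.
    for name, old_score in baseline.items():
        change = relative_change_pct(old_score, ablated[name])
        print(f"{name:15s} {change:+.2f}")
```

Because the published scores are rounded, the recomputed percentages can differ from the removed row in the last digit.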