Update README.md
---

## Model Details

This is a text embedding model based on RoFormer with a maximum input sequence length of 1024.
The model is pre-trained on Wikipedia and CC-100 and fine-tuned as a sentence embedding model.
Fine-tuning begins with weakly supervised learning using mC4 and MQA.
After that, we perform the same 3-stage learning process as [GLuCoSE v2](https://huggingface.co/pkshatech/GLuCoSE-base-ja-v2).
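In practice, a sentence embedding model maps each text to a fixed-length vector, and texts are compared by cosine similarity between those vectors. A minimal sketch of that comparison step, using toy 4-dimensional vectors as stand-ins for real model outputs (the actual model produces much higher-dimensional embeddings):

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Illustrative stand-ins for embeddings of three texts.
query      = [0.8, 0.1, 0.0, 0.6]   # "embedding" of a query
similar    = [0.7, 0.2, 0.1, 0.7]   # a semantically close sentence
dissimilar = [0.0, 0.9, 0.7, 0.1]   # an unrelated sentence

# A close sentence scores higher against the query than an unrelated one.
print(cosine(query, similar) > cosine(query, dissimilar))  # True
```

The same comparison underlies the retrieval benchmarks below: documents are ranked by their similarity to the query embedding.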

### Retrieval

Evaluated with [MIRACL-ja](https://huggingface.co/datasets/miracl/miracl), [JQaRA](https://huggingface.co/datasets/hotchpotch/JQaRA) and [MLDR-ja](https://huggingface.co/datasets/Shitao/MLDR).

| model | size | MIRACL<br>Recall@5 | JQaRA<br>nDCG@10 | MLDR<br>nDCG@10 |
|:--:|:--:|:--:|:--:|:--:|
| me5-base | 0.3B | **84.2** | 47.2 | 25.4 |
| GLuCoSE | 0.1B | 53.3 | 30.8 | 25.2 |
| RoSEtta | 0.2B | 79.3 | **57.7** | **32.3** |
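The table's metrics can be reproduced from a ranked result list. A minimal sketch of binary-relevance Recall@5 and nDCG@10 on toy data (the document ids and relevance judgments are made up for illustration):

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(ranked_ids[:k]) if doc in relevant_ids)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant_ids), k)))
    return dcg / ideal

# Toy example: 10 retrieved documents, 3 of which are relevant.
ranked = ["d3", "d7", "d1", "d9", "d2", "d5", "d8", "d4", "d6", "d0"]
relevant = ["d3", "d2", "d6"]

print(round(recall_at_k(ranked, relevant, 5), 3))   # 0.667 (2 of 3 in the top 5)
print(round(ndcg_at_k(ranked, relevant, 10), 3))    # 0.792
```

nDCG rewards placing relevant documents near the top: a ranking with all relevant documents first scores exactly 1.0.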
### JMTEB

Evaluated with [JMTEB](https://github.com/sbintuitions/JMTEB).

* The time-consuming datasets ['amazon_review_classification', 'mrtydi', 'jaqket', 'esci'] were excluded, and the evaluation was conducted on the remaining 12 datasets.
* The average is a macro-average per task.

| model | size | Class. | Ret. | STS. | Clus. | Pair. | Avg. |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| me5-base | 0.3B | 75.1 | 80.6 | 80.5 | 52.6 | 62.4 | 70.2 |
| GLuCoSE | 0.1B | **82.6** | 69.8 | 78.2 | 51.5 | **66.2** | 69.7 |
| RoSEtta | 0.2B | 79.0 | **84.3** | **81.4** | **53.2** | 61.7 | **71.9** |
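A macro-average per task first averages scores within each task, then averages the per-task means, so tasks with many datasets do not dominate the overall score. A small sketch with hypothetical per-dataset scores (the numbers and task breakdown are illustrative, not JMTEB results):

```python
# Hypothetical per-dataset scores, grouped by task.
scores = {
    "Classification": [79.0],
    "Retrieval": [84.3, 80.1],   # two datasets in this task
    "STS": [81.4],
}

# Macro-average: mean within each task, then mean of the task means.
task_means = {task: sum(v) / len(v) for task, v in scores.items()}
macro_avg = sum(task_means.values()) / len(task_means)

# Micro-average (for contrast): mean over all datasets, ignoring task grouping.
micro_avg = (sum(s for v in scores.values() for s in v)
             / sum(len(v) for v in scores.values()))

print(round(macro_avg, 2))  # 80.87 - each task counts once
print(round(micro_avg, 2))  # 81.2  - Retrieval is weighted twice
```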

## Authors

Chihiro Yano, Mocho Go, Hideyuki Tachibana, Hiroto Takegawa, Yotaro Watanabe

## License

This model is published under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).