Update README.md
---

## Model Details

This is a text embedding model based on RoFormer with a maximum input sequence length of 1024.
The model is pre-trained on Wikipedia and CC-100 and fine-tuned as a sentence embedding model.
Fine-tuning begins with weakly supervised learning using mC4 and MQA.
After that, we perform the same 3-stage learning process as [GLuCoSE v2](https://huggingface.co/pkshatech/GLuCoSE-base-ja-v2).
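In practice, a sentence embedding model maps each text to a fixed-length vector, and texts are compared by cosine similarity between those vectors. A minimal sketch of that comparison step, using toy 4-dimensional vectors as stand-ins for real model outputs (the actual model produces much higher-dimensional embeddings):

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Illustrative stand-ins for embeddings of three texts.
query      = [0.8, 0.1, 0.0, 0.6]   # "embedding" of a query
similar    = [0.7, 0.2, 0.1, 0.7]   # a semantically close sentence
dissimilar = [0.0, 0.9, 0.7, 0.1]   # an unrelated sentence

# A close sentence scores higher against the query than an unrelated one.
print(cosine(query, similar) > cosine(query, dissimilar))  # True
```

The same comparison underlies the retrieval benchmarks below: documents are ranked by their similarity to the query embedding.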

### Retrieval

Evaluated with [MIRACL-ja](https://huggingface.co/datasets/miracl/miracl), [JQaRA](https://huggingface.co/datasets/hotchpotch/JQaRA) and [MLDR-ja](https://huggingface.co/datasets/Shitao/MLDR).

| model | size | MIRACL<br>Recall@5 | JQaRA<br>nDCG@10 | MLDR<br>nDCG@10 |
|:--:|:--:|:--:|:--:|:--:|
| me5-base | 0.3B | **84.2** | 47.2 | 25.4 |
| GLuCoSE | 0.1B | 53.3 | 30.8 | 25.2 |
| RoSEtta | 0.2B | 79.3 | **57.7** | **32.3** |
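The table's metrics can be reproduced from a ranked result list. A minimal sketch of binary-relevance Recall@5 and nDCG@10 on toy data (the document ids and relevance judgments are made up for illustration):

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(ranked_ids[:k]) if doc in relevant_ids)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant_ids), k)))
    return dcg / ideal

# Toy example: 10 retrieved documents, 3 of which are relevant.
ranked = ["d3", "d7", "d1", "d9", "d2", "d5", "d8", "d4", "d6", "d0"]
relevant = ["d3", "d2", "d6"]

print(round(recall_at_k(ranked, relevant, 5), 3))   # 0.667 (2 of 3 in the top 5)
print(round(ndcg_at_k(ranked, relevant, 10), 3))    # 0.792
```

nDCG rewards placing relevant documents near the top: a ranking with all relevant documents first scores exactly 1.0.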
### JMTEB

Evaluated with [JMTEB](https://github.com/sbintuitions/JMTEB).

* The time-consuming datasets ['amazon_review_classification', 'mrtydi', 'jaqket', 'esci'] were excluded, and the evaluation was conducted on the remaining 12 datasets.
* The average is a macro-average per task.

| model | size | Class. | Ret. | STS. | Clus. | Pair. | Avg. |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| me5-base | 0.3B | 75.1 | 80.6 | 80.5 | 52.6 | 62.4 | 70.2 |
| GLuCoSE | 0.1B | **82.6** | 69.8 | 78.2 | 51.5 | **66.2** | 69.7 |
| RoSEtta | 0.2B | 79.0 | **84.3** | **81.4** | **53.2** | 61.7 | **71.9** |
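A macro-average per task first averages scores within each task, then averages the per-task means, so tasks with many datasets do not dominate the overall score. A small sketch with hypothetical per-dataset scores (the numbers and task breakdown are illustrative, not JMTEB results):

```python
# Hypothetical per-dataset scores, grouped by task.
scores = {
    "Classification": [79.0],
    "Retrieval": [84.3, 80.1],   # two datasets in this task
    "STS": [81.4],
}

# Macro-average: mean within each task, then mean of the task means.
task_means = {task: sum(v) / len(v) for task, v in scores.items()}
macro_avg = sum(task_means.values()) / len(task_means)

# Micro-average (for contrast): mean over all datasets, ignoring task grouping.
micro_avg = (sum(s for v in scores.values() for s in v)
             / sum(len(v) for v in scores.values()))

print(round(macro_avg, 2))  # 80.87 - each task counts once
print(round(micro_avg, 2))  # 81.2  - Retrieval is weighted twice
```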

## Authors

Chihiro Yano, Mocho Go, Hideyuki Tachibana, Hiroto Takegawa, Yotaro Watanabe

## License

This model is published under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).