Commit d372cfb by yano0 (parent: 9221208)

Update README.md

Files changed (1): README.md (+32 −84)
@@ -1,64 +1,19 @@
  ---
- language: []
  library_name: sentence-transformers
  tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
- base_model: yano0/my_rope_bert_v2
  metrics:
- - pearson_cosine
- - spearman_cosine
- - pearson_manhattan
- - spearman_manhattan
- - pearson_euclidean
- - spearman_euclidean
- - pearson_dot
- - spearman_dot
- - pearson_max
- - spearman_max
  widget: []
  pipeline_tag: sentence-similarity
- model-index:
- - name: SentenceTransformer based on yano0/my_rope_bert_v2
-   results:
-   - task:
-       type: semantic-similarity
-       name: Semantic Similarity
-     dataset:
-       name: Unknown
-       type: unknown
-     metrics:
-     - type: pearson_cosine
-       value: 0.8363388345473755
-       name: Pearson Cosine
-     - type: spearman_cosine
-       value: 0.7829140815230603
-       name: Spearman Cosine
-     - type: pearson_manhattan
-       value: 0.8169134821588451
-       name: Pearson Manhattan
-     - type: spearman_manhattan
-       value: 0.7806182228552376
-       name: Spearman Manhattan
-     - type: pearson_euclidean
-       value: 0.8176194153920942
-       name: Pearson Euclidean
-     - type: spearman_euclidean
-       value: 0.7812646926795144
-       name: Spearman Euclidean
-     - type: pearson_dot
-       value: 0.790584312051173
-       name: Pearson Dot
-     - type: spearman_dot
-       value: 0.7341313863604967
-       name: Spearman Dot
-     - type: pearson_max
-       value: 0.8363388345473755
-       name: Pearson Max
-     - type: spearman_max
-       value: 0.7829140815230603
-       name: Spearman Max
  ---

  # SentenceTransformer based on yano0/my_rope_bert_v2
@@ -66,10 +21,13 @@ model-index:
  This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2) <!-- at revision a392086c08b3bf3a9b9030267a8965af0552d7fb -->
  - **Maximum Sequence Length:** 1024 tokens
  - **Output Dimensionality:** 768 tokens
  - **Similarity Function:** Cosine Similarity
@@ -181,41 +139,31 @@ You can finetune this model on your own dataset.
  *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
  -->

- ## Training Details

- ### Training Logs
- | Epoch | Step | spearman_cosine |
- |:-----:|:----:|:---------------:|
- | 0     | 0    | 0.7829          |

- ### Framework Versions
- - Python: 3.10.13
- - Sentence Transformers: 3.0.0
- - Transformers: 4.44.0
- - PyTorch: 2.3.1+cu118
- - Accelerate: 0.30.1
- - Datasets: 2.19.2
- - Tokenizers: 0.19.1

- ## Citation

- ### BibTeX

- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->

  ---
+ language:
+ - ja
  library_name: sentence-transformers
  tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  metrics:
  widget: []
  pipeline_tag: sentence-similarity
+ license: apache-2.0
+ datasets:
+ - hpprc/emb
+ - hpprc/mqa-ja
+ - google-research-datasets/paws-x
  ---

  # SentenceTransformer based on yano0/my_rope_bert_v2

  This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details
+ This is a 1024-context sentence embedding model based on RoFormer.
+ The model is pre-trained on Wikipedia and CC100, then fine-tuned as a sentence embedding model.
+ Fine-tuning begins with weakly supervised learning on mC4 and MQA.
+ After that, we perform the same 3-stage training process as [GLuCoSE v2](https://huggingface.co/pkshatech/GLuCoSE-base-ja-v2).
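The card does not spell out the training objective, but weakly supervised sentence-embedding stages like this conventionally use a contrastive loss with in-batch negatives, where each query's paired positive serves as a negative for every other query in the batch. This is only an illustrative assumption, not a description of the exact losses used here; a minimal NumPy sketch:

```python
import numpy as np

def in_batch_negatives_loss(queries, positives, temperature=0.05):
    """Contrastive loss with in-batch negatives.

    queries, positives: (batch, dim) L2-normalized embeddings, where
    positives[i] is the paired positive for queries[i]. Row i of the
    similarity matrix should score its diagonal entry highest.
    """
    sims = queries @ positives.T / temperature   # (batch, batch) logits
    sims = sims - sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    idx = np.arange(len(queries))
    # Cross-entropy against the diagonal (the true positive per query)
    return float(-log_probs[idx, idx].mean())

# Toy batch: matched pairs should score a lower loss than mismatched ones.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
q = q / np.linalg.norm(q, axis=1, keepdims=True)
loss_matched = in_batch_negatives_loss(q, q)
loss_shuffled = in_batch_negatives_loss(q, q[[1, 0, 3, 2]])
```

In practice this objective appears in sentence-transformers as `MultipleNegativesRankingLoss`; whether this model used exactly that loss is not stated in the card.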

  ### Model Description
  - **Model Type:** Sentence Transformer
  - **Maximum Sequence Length:** 1024 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
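The model loads with the sentence-transformers library, and pairs of sentences are scored by cosine similarity over the 768-dimensional embeddings. A minimal sketch, where the repository id is a placeholder rather than this model's actual Hub id:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def demo():
    """Encode two Japanese sentences and compare them (requires network access)."""
    from sentence_transformers import SentenceTransformer
    # Placeholder repository id -- replace with this model's actual Hub id.
    model = SentenceTransformer("your-org/your-model")
    emb = model.encode(["今日は天気が良い。", "本日は晴天です。"])  # shape (2, 768)
    return cosine_similarity(emb[0], emb[1])
```

Since the card lists Cosine Similarity as the similarity function, near-paraphrases such as the two sentences above should score close to 1.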
 
  *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
  -->

+ ## Benchmarks

+ ### Retrieval
+ Evaluated with [MIRACL-ja](https://huggingface.co/datasets/miracl/miracl), [JQaRA](https://huggingface.co/datasets/hotchpotch/JQaRA) and [MLDR-ja](https://huggingface.co/datasets/Shitao/MLDR).

+ | model    | size | MIRACL<br>Recall@5 | JQaRA<br>nDCG@10 | MLDR<br>nDCG@10 |
+ |----------|------|--------------------|------------------|-----------------|
+ | me5-base | 0.3B | 84.2               | 47.2             | 25.4            |
+ | GLuCoSE  | 0.1B | 53.3               | 30.8             | 25.2            |
+ | RoSEtta  | 0.2B | 79.3               | 57.7             | 32.3            |

+ ### JMTEB
+ Evaluated with [JMTEB](https://github.com/sbintuitions/JMTEB).
+ * The time-consuming tasks 'amazon_review_classification', 'mrtydi', 'jaqket' and 'esci' were excluded from the evaluation.
+ * The average is a macro-average per task.

+ | model    | size | Class. | Ret. | STS. | Clus. | Pair. | Avg. |
+ |----------|------|--------|------|------|-------|-------|------|
+ | me5-base | 0.3B | 75.1   | 80.6 | 80.5 | 52.6  | 62.4  | 70.2 |
+ | GLuCoSE  | 0.1B | 82.6   | 69.8 | 78.2 | 51.5  | 66.2  | 69.7 |
+ | RoSEtta  | 0.2B | 79.0   | 84.3 | 81.4 | 53.2  | 61.7  | 71.9 |
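The Avg. column is consistent with the macro-average note: it equals the unweighted mean of the five category scores, rounded to one decimal. Reproducing it from the table:

```python
# Per-category JMTEB scores from the table above (Class., Ret., STS., Clus., Pair.)
scores = {
    "me5-base": [75.1, 80.6, 80.5, 52.6, 62.4],
    "GLuCoSE":  [82.6, 69.8, 78.2, 51.5, 66.2],
    "RoSEtta":  [79.0, 84.3, 81.4, 53.2, 61.7],
}
# Macro average: every category counts equally, regardless of task count per category.
averages = {name: round(sum(vals) / len(vals), 1) for name, vals in scores.items()}
```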

+ ## Authors
+ Chihiro Yano, Go Mocho, Hideyuki Tachibana, Hiroto Takegawa, Yotaro Watanabe

+ ## License
+ This model is published under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).