Commit d372cfb by yano0 (parent: 9221208)

Update README.md

Files changed (1): README.md (+32 −84)
@@ -1,64 +1,19 @@
  ---
- language: []
  library_name: sentence-transformers
  tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
- base_model: yano0/my_rope_bert_v2
  metrics:
- - pearson_cosine
- - spearman_cosine
- - pearson_manhattan
- - spearman_manhattan
- - pearson_euclidean
- - spearman_euclidean
- - pearson_dot
- - spearman_dot
- - pearson_max
- - spearman_max
  widget: []
  pipeline_tag: sentence-similarity
- model-index:
- - name: SentenceTransformer based on yano0/my_rope_bert_v2
-   results:
-   - task:
-       type: semantic-similarity
-       name: Semantic Similarity
-     dataset:
-       name: Unknown
-       type: unknown
-     metrics:
-     - type: pearson_cosine
-       value: 0.8363388345473755
-       name: Pearson Cosine
-     - type: spearman_cosine
-       value: 0.7829140815230603
-       name: Spearman Cosine
-     - type: pearson_manhattan
-       value: 0.8169134821588451
-       name: Pearson Manhattan
-     - type: spearman_manhattan
-       value: 0.7806182228552376
-       name: Spearman Manhattan
-     - type: pearson_euclidean
-       value: 0.8176194153920942
-       name: Pearson Euclidean
-     - type: spearman_euclidean
-       value: 0.7812646926795144
-       name: Spearman Euclidean
-     - type: pearson_dot
-       value: 0.790584312051173
-       name: Pearson Dot
-     - type: spearman_dot
-       value: 0.7341313863604967
-       name: Spearman Dot
-     - type: pearson_max
-       value: 0.8363388345473755
-       name: Pearson Max
-     - type: spearman_max
-       value: 0.7829140815230603
-       name: Spearman Max
  ---

  # SentenceTransformer based on yano0/my_rope_bert_v2
@@ -66,10 +21,13 @@ model-index:
  This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2) <!-- at revision a392086c08b3bf3a9b9030267a8965af0552d7fb -->
  - **Maximum Sequence Length:** 1024 tokens
  - **Output Dimensionality:** 768 tokens
  - **Similarity Function:** Cosine Similarity
@@ -181,41 +139,31 @@ You can finetune this model on your own dataset.
  *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
  -->

- ## Training Details

- ### Training Logs
- | Epoch | Step | spearman_cosine |
- |:-----:|:----:|:---------------:|
- | 0     | 0    | 0.7829          |

- ### Framework Versions
- - Python: 3.10.13
- - Sentence Transformers: 3.0.0
- - Transformers: 4.44.0
- - PyTorch: 2.3.1+cu118
- - Accelerate: 0.30.1
- - Datasets: 2.19.2
- - Tokenizers: 0.19.1

- ## Citation

- ### BibTeX

- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->

  ---
+ language:
+ - ja
  library_name: sentence-transformers
  tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  metrics:
  widget: []
  pipeline_tag: sentence-similarity
+ license: apache-2.0
+ datasets:
+ - hpprc/emb
+ - hpprc/mqa-ja
+ - google-research-datasets/paws-x
  ---

  # SentenceTransformer based on yano0/my_rope_bert_v2

  This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details
+ This is a 1024-context sentence embedding model based on RoFormer.
+ The model is pre-trained on Wikipedia and CC100, then fine-tuned as a sentence embedding model.
+ Fine-tuning begins with weakly supervised learning on mC4 and MQA.
+ After that, we perform the same 3-stage training process as [GLuCoSE v2](https://huggingface.co/pkshatech/GLuCoSE-base-ja-v2).
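The card does not spell out the training objective, but weakly supervised sentence-embedding stages like this conventionally use a contrastive loss with in-batch negatives, where each query's paired positive serves as a negative for every other query in the batch. This is only an illustrative assumption, not a description of the exact losses used here; a minimal NumPy sketch:

```python
import numpy as np

def in_batch_negatives_loss(queries, positives, temperature=0.05):
    """Contrastive loss with in-batch negatives.

    queries, positives: (batch, dim) L2-normalized embeddings, where
    positives[i] is the paired positive for queries[i]. Row i of the
    similarity matrix should score its diagonal entry highest.
    """
    sims = queries @ positives.T / temperature   # (batch, batch) logits
    sims = sims - sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    idx = np.arange(len(queries))
    # Cross-entropy against the diagonal (the true positive per query)
    return float(-log_probs[idx, idx].mean())

# Toy batch: matched pairs should score a lower loss than mismatched ones.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
q = q / np.linalg.norm(q, axis=1, keepdims=True)
loss_matched = in_batch_negatives_loss(q, q)
loss_shuffled = in_batch_negatives_loss(q, q[[1, 0, 3, 2]])
```

In practice this objective appears in sentence-transformers as `MultipleNegativesRankingLoss`; whether this model used exactly that loss is not stated in the card.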

  ### Model Description
  - **Model Type:** Sentence Transformer
  - **Maximum Sequence Length:** 1024 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
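The model loads with the sentence-transformers library, and pairs of sentences are scored by cosine similarity over the 768-dimensional embeddings. A minimal sketch, where the repository id is a placeholder rather than this model's actual Hub id:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def demo():
    """Encode two Japanese sentences and compare them (requires network access)."""
    from sentence_transformers import SentenceTransformer
    # Placeholder repository id -- replace with this model's actual Hub id.
    model = SentenceTransformer("your-org/your-model")
    emb = model.encode(["今日は天気が良い。", "本日は晴天です。"])  # shape (2, 768)
    return cosine_similarity(emb[0], emb[1])
```

Since the card lists Cosine Similarity as the similarity function, near-paraphrases such as the two sentences above should score close to 1.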
 
  *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
  -->

+ ## Benchmarks

+ ### Retrieval
+ Evaluated with [MIRACL-ja](https://huggingface.co/datasets/miracl/miracl), [JQaRA](https://huggingface.co/datasets/hotchpotch/JQaRA) and [MLDR-ja](https://huggingface.co/datasets/Shitao/MLDR).

+ | model    | size | MIRACL<br>Recall@5 | JQaRA<br>nDCG@10 | MLDR<br>nDCG@10 |
+ |----------|------|--------------------|------------------|-----------------|
+ | me5-base | 0.3B | 84.2               | 47.2             | 25.4            |
+ | GLuCoSE  | 0.1B | 53.3               | 30.8             | 25.2            |
+ | RoSEtta  | 0.2B | 79.3               | 57.7             | 32.3            |

+ ### JMTEB
+ Evaluated with [JMTEB](https://github.com/sbintuitions/JMTEB).
+ * The time-consuming tasks 'amazon_review_classification', 'mrtydi', 'jaqket' and 'esci' were excluded from the evaluation.
+ * The average is a macro-average per task.

+ | model    | size | Class. | Ret. | STS. | Clus. | Pair. | Avg. |
+ |----------|------|--------|------|------|-------|-------|------|
+ | me5-base | 0.3B | 75.1   | 80.6 | 80.5 | 52.6  | 62.4  | 70.2 |
+ | GLuCoSE  | 0.1B | 82.6   | 69.8 | 78.2 | 51.5  | 66.2  | 69.7 |
+ | RoSEtta  | 0.2B | 79.0   | 84.3 | 81.4 | 53.2  | 61.7  | 71.9 |
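The Avg. column is consistent with the macro-average note: it equals the unweighted mean of the five category scores, rounded to one decimal. Reproducing it from the table:

```python
# Per-category JMTEB scores from the table above (Class., Ret., STS., Clus., Pair.)
scores = {
    "me5-base": [75.1, 80.6, 80.5, 52.6, 62.4],
    "GLuCoSE":  [82.6, 69.8, 78.2, 51.5, 66.2],
    "RoSEtta":  [79.0, 84.3, 81.4, 53.2, 61.7],
}
# Macro average: every category counts equally, regardless of task count per category.
averages = {name: round(sum(vals) / len(vals), 1) for name, vals in scores.items()}
```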

+ ## Authors
+ Chihiro Yano, Go Mocho, Hideyuki Tachibana, Hiroto Takegawa, Yotaro Watanabe

+ ## License
+ This model is published under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).