Commit d252165 (parent: 0178777)
akeyhero committed: Update README.md

Files changed (1):
  1. README.md +19 -19

README.md CHANGED
@@ -49,7 +49,7 @@ model = AutoModelForTokenClassification.from_pretrained(model_name)
 - Easy to use with Hugging Face
 - A vocabulary size that is not too large
 
-The original DeBERTa V3 is distinctive for the large vocabulary it was trained with, but this inflates the parameter count of the embedding layer; this model therefore adopts a smaller vocabulary.
+The original DeBERTa V3 is distinctive for the large vocabulary it was trained with, but this inflates the parameter count of the embedding layer (for the [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) model, the embedding layer accounts for 54% of all parameters); this model therefore adopts a smaller vocabulary.
 
 ---
 The tokenizer is trained using [the method introduced by Kudo](https://qiita.com/taku910/items/fbaeab4684665952d5a9).
@@ -60,7 +60,7 @@ Key points include:
 - Easy to use with Hugging Face
 - Smaller vocabulary size
 
-The original DeBERTa V3 is characterized by a large vocabulary size, which significantly increases the number of parameters in the embedding layer; this model therefore adopts a smaller vocabulary size.
+The original DeBERTa V3 is characterized by a large vocabulary size, which significantly increases the number of parameters in the embedding layer (for the [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) model, the embedding layer accounts for 54% of the total); this model therefore adopts a smaller vocabulary size.
 
 # Data
 | Dataset Name | Notes | File Size (with metadata) | Factor |
@@ -85,24 +85,24 @@ Although the original DeBERTa V3 is characterized by a large vocabulary size, wh
 - Precision: Mixed (fp16)
 
 # Evaluation
-| Model | JSTS | JNLI | JSQuAD | JCQA |
-| ----- | ---- | ---- | ------ | ---- |
-| ≤ small | | | | |
-| [izumi-lab/deberta-v2-small-japanese](https://huggingface.co/izumi-lab/deberta-v2-small-japanese) | 0.890/0.846 | 0.880 | - | 0.737 |
-| [globis-university/deberta-v3-japanese-xsmall](https://huggingface.co/globis-university/deberta-v3-japanese-xsmall) | **0.916**/**0.880** | **0.913** | **0.869**/**0.938** | **0.821** |
+| Model | #params | JSTS | JNLI | JSQuAD | JCQA |
+| ----- | ------- | ---- | ---- | ------ | ---- |
+| ≤ small | | | | | |
+| [izumi-lab/deberta-v2-small-japanese](https://huggingface.co/izumi-lab/deberta-v2-small-japanese) | 17.8M | 0.890/0.846 | 0.880 | - | 0.737 |
+| [globis-university/deberta-v3-japanese-xsmall](https://huggingface.co/globis-university/deberta-v3-japanese-xsmall) | 33.7M | **0.916**/**0.880** | **0.913** | **0.869**/**0.938** | **0.821** |
 | base | | | | |
-| [cl-tohoku/bert-base-japanese-v3](https://huggingface.co/cl-tohoku/bert-base-japanese-v3) | 0.919/0.881 | 0.907 | 0.880/0.946 | 0.848 |
-| [nlp-waseda/roberta-base-japanese](https://huggingface.co/nlp-waseda/roberta-base-japanese) | 0.913/0.873 | 0.895 | 0.864/0.927 | 0.840 |
-| [izumi-lab/deberta-v2-base-japanese](https://huggingface.co/izumi-lab/deberta-v2-base-japanese) | 0.919/0.882 | 0.912 | - | 0.859 |
-| [ku-nlp/deberta-v2-base-japanese](https://huggingface.co/ku-nlp/deberta-v2-base-japanese) | 0.922/0.886 | 0.922 | **0.899**/**0.951** | - |
-| [ku-nlp/deberta-v3-base-japanese](https://huggingface.co/ku-nlp/deberta-v3-base-japanese) | **0.927**/0.891 | **0.927** | 0.896/- | - |
-| [**globis-university/deberta-v3-japanese-base**](https://huggingface.co/globis-university/deberta-v3-japanese-base) | 0.925/**0.895** | 0.921 | 0.890/0.950 | **0.886** |
-| large | | | | |
-| [cl-tohoku/bert-large-japanese-v2](https://huggingface.co/cl-tohoku/bert-large-japanese-v2) | 0.926/0.893 | **0.929** | 0.893/0.956 | 0.893 |
-| [roberta-large-japanese](https://huggingface.co/nlp-waseda/roberta-large-japanese) | **0.930**/**0.896** | 0.924 | 0.884/0.940 | **0.907** |
-| [roberta-large-japanese-seq512](https://huggingface.co/nlp-waseda/roberta-large-japanese-seq512) | 0.926/0.892 | 0.926 | **0.918**/**0.963** | 0.891 |
-| [ku-nlp/deberta-v2-large-japanese](https://huggingface.co/ku-nlp/deberta-v2-large-japanese) | 0.925/0.892 | 0.924 | 0.912/0.959 | - |
-| [globis-university/deberta-v3-japanese-large](https://huggingface.co/globis-university/deberta-v3-japanese-large) | 0.928/**0.896** | 0.924 | 0.896/0.956 | 0.900 |
+| [cl-tohoku/bert-base-japanese-v3](https://huggingface.co/cl-tohoku/bert-base-japanese-v3) | 111M | 0.919/0.881 | 0.907 | 0.880/0.946 | 0.848 |
+| [nlp-waseda/roberta-base-japanese](https://huggingface.co/nlp-waseda/roberta-base-japanese) | 111M | 0.913/0.873 | 0.895 | 0.864/0.927 | 0.840 |
+| [izumi-lab/deberta-v2-base-japanese](https://huggingface.co/izumi-lab/deberta-v2-base-japanese) | 110M | 0.919/0.882 | 0.912 | - | 0.859 |
+| [ku-nlp/deberta-v2-base-japanese](https://huggingface.co/ku-nlp/deberta-v2-base-japanese) | 112M | 0.922/0.886 | 0.922 | **0.899**/**0.951** | - |
+| [ku-nlp/deberta-v3-base-japanese](https://huggingface.co/ku-nlp/deberta-v3-base-japanese) | 160M | **0.927**/0.891 | **0.927** | 0.896/- | - |
+| [**globis-university/deberta-v3-japanese-base**](https://huggingface.co/globis-university/deberta-v3-japanese-base) | 110M | 0.925/**0.895** | 0.921 | 0.890/0.950 | **0.886** |
+| large | | | | | |
+| [cl-tohoku/bert-large-japanese-v2](https://huggingface.co/cl-tohoku/bert-large-japanese-v2) | 337M | 0.926/0.893 | **0.929** | 0.893/0.956 | 0.893 |
+| [nlp-waseda/roberta-large-japanese](https://huggingface.co/nlp-waseda/roberta-large-japanese) | 337M | **0.930**/**0.896** | 0.924 | 0.884/0.940 | **0.907** |
+| [nlp-waseda/roberta-large-japanese-seq512](https://huggingface.co/nlp-waseda/roberta-large-japanese-seq512) | 337M | 0.926/0.892 | 0.926 | **0.918**/**0.963** | 0.891 |
+| [ku-nlp/deberta-v2-large-japanese](https://huggingface.co/ku-nlp/deberta-v2-large-japanese) | 339M | 0.925/0.892 | 0.924 | 0.912/0.959 | - |
+| [globis-university/deberta-v3-japanese-large](https://huggingface.co/globis-university/deberta-v3-japanese-large) | 352M | 0.928/**0.896** | 0.924 | 0.896/0.956 | 0.900 |
 
 ## License
 CC BY SA 4.0
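
The 54% figure this commit adds is straightforward to check. The following is a minimal sketch, not part of the README, assuming the `transformers` library (with PyTorch) and the two checkpoints the README links; it prints each model's total parameter count (compare the new #params column) and the fraction held by the embedding layer.

```python
# Verify the embedding-layer parameter share quoted in the commit.
# Assumes `transformers` and `torch` are installed; both model names
# are checkpoints referenced in the README.
from transformers import AutoModel

def embedding_share(model_name: str) -> None:
    model = AutoModel.from_pretrained(model_name)
    total = sum(p.numel() for p in model.parameters())
    emb = sum(p.numel() for p in model.embeddings.parameters())
    print(f"{model_name}: {total / 1e6:.1f}M params, "
          f"embeddings = {emb / total:.0%} of total")

# Original checkpoint, trained with a ~128k-token vocabulary:
embedding_share("microsoft/deberta-v3-base")  # ~54% per the commit
# The smaller-vocabulary model this README describes:
embedding_share("globis-university/deberta-v3-japanese-base")
```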
 
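Separately, the context line in the first hunk header (`model = AutoModelForTokenClassification.from_pretrained(model_name)`) comes from the README's usage snippet. Below is a minimal sketch of the surrounding usage, under the assumption that `model_name` is the base checkpoint and that `AutoTokenizer` resolves the repository's bundled tokenizer; note the token-classification head stays randomly initialized until the model is fine-tuned.

```python
# Usage sketch around the README's quoted line. The checkpoint choice is an
# assumption; the classification head is untrained until fine-tuning.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "globis-university/deberta-v3-japanese-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

inputs = tokenizer("これはテストです。", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (1, sequence_length, num_labels)
print(logits.shape)
```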