The models have been fine-tuned on the following datasets.

| |[Synthetic-JP-EN-Coding-Dataset-567k](https://huggingface.co/datasets/Aratako/Synthetic-JP-EN-Coding-Dataset-567k)| A synthetic instruction dataset. We used a sampled subset. |
|English |[FLAN](https://huggingface.co/datasets/Open-Orca/FLAN) | We used a sampled subset. |
## Evaluation

### llm-jp-eval (v1.3.1)

| Model name | average | EL | FA | HE | MC | MR | MT | NLI | QA | RC |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| [llm-jp-3-1.8b](https://huggingface.co/llm-jp/llm-jp-3-1.8b) | 0.3767 | 0.3725 | 0.1948 | 0.2350 | 0.2500 | 0.0900 | 0.7730 | 0.3080 | 0.4629 | 0.7040 |
| [llm-jp-3-1.8b-instruct](https://huggingface.co/llm-jp/llm-jp-3-1.8b-instruct) | 0.4967 | 0.3695 | 0.1892 | 0.3850 | 0.4200 | 0.4200 | 0.7989 | 0.3700 | 0.5016 | 0.7729 |
| [llm-jp-3-3.7b](https://huggingface.co/llm-jp/llm-jp-3-3.7b) | 0.4231 | 0.3812 | 0.2440 | 0.2200 | 0.1900 | 0.3600 | 0.7947 | 0.3800 | 0.4688 | 0.7694 |
| [llm-jp-3-3.7b-instruct](https://huggingface.co/llm-jp/llm-jp-3-3.7b-instruct) | 0.5258 | 0.4220 | 0.2418 | 0.3950 | 0.5900 | 0.5600 | 0.8088 | 0.4260 | 0.4765 | 0.8123 |
| [llm-jp-3-13b](https://huggingface.co/llm-jp/llm-jp-3-13b) | 0.5802 | 0.5570 | 0.2593 | 0.4600 | 0.7000 | 0.6300 | 0.8292 | 0.3460 | 0.5937 | 0.8469 |
| [llm-jp-3-13b-instruct](https://huggingface.co/llm-jp/llm-jp-3-13b-instruct) | 0.6239 | 0.4848 | 0.2622 | 0.5300 | 0.9300 | 0.7000 | 0.8262 | 0.4460 | 0.5703 | 0.8659 |

### Japanese MT Bench

| Model name | average | coding | extraction | humanities | math | reasoning | roleplay | stem | writing |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| [llm-jp-3-1.8b-instruct](https://huggingface.co/llm-jp/llm-jp-3-1.8b-instruct) | 4.64 | 2.80 | 3.55 | 7.05 | 2.45 | 2.80 | 6.90 | 5.40 | 6.20 |
| [llm-jp-3-3.7b-instruct](https://huggingface.co/llm-jp/llm-jp-3-3.7b-instruct) | 5.28 | 2.75 | 5.65 | 7.05 | 2.25 | 3.65 | 8.25 | 5.85 | 6.75 |
| [llm-jp-3-13b-instruct](https://huggingface.co/llm-jp/llm-jp-3-13b-instruct) | 6.66 | 4.30 | 6.90 | 9.00 | 3.60 | 5.75 | 8.55 | 7.50 | 7.70 |
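The `average` columns in the tables above appear to be simple unweighted means of the per-category scores (this is an assumption from the numbers themselves, not something the benchmarks' scoring is documented to do here). A minimal sketch checking that reading against two rows:

```python
# Sketch: verify that the reported "average" columns are consistent with an
# unweighted mean of the per-category scores. The mean-based reading is an
# assumption inferred from the table values, not documented behavior.

def unweighted_mean(scores):
    """Plain arithmetic mean of a list of category scores."""
    return sum(scores) / len(scores)

# llm-jp-eval category scores (EL, FA, HE, MC, MR, MT, NLI, QA, RC)
# for llm-jp-3-1.8b, taken from the table above.
llm_jp_eval_1_8b = [0.3725, 0.1948, 0.2350, 0.2500, 0.0900,
                    0.7730, 0.3080, 0.4629, 0.7040]
print(round(unweighted_mean(llm_jp_eval_1_8b), 4))  # → 0.3767, matches "average"

# Japanese MT Bench category scores (coding, extraction, humanities, math,
# reasoning, roleplay, stem, writing) for llm-jp-3-1.8b-instruct.
mt_bench_1_8b = [2.80, 3.55, 7.05, 2.45, 2.80, 6.90, 5.40, 6.20]
print(round(unweighted_mean(mt_bench_1_8b), 2))  # → 4.64, matches "average"
```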
## Risks and Limitations

The models released here are in the early stages of our research and development and have not been tuned to ensure outputs align with human intent and safety considerations.