Update README.md

The models have been fine-tuned on the following datasets.
| |[Synthetic-JP-EN-Coding-Dataset-567k](https://huggingface.co/datasets/Aratako/Synthetic-JP-EN-Coding-Dataset-567k)| A synthetic instruction dataset. We used a sampled subset.|
|English |[FLAN](https://huggingface.co/datasets/Open-Orca/FLAN) | We used a sampled subset. |
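The table notes that sampled subsets of these datasets were used, but no sizes or method are given here. Purely as an illustration, drawing a fixed-size random subset of example indices might look like the sketch below; the subset size and seed are hypothetical, not the authors' actual settings.

```python
import random

def sample_indices(dataset_size: int, subset_size: int, seed: int = 42) -> list[int]:
    """Draw a reproducible random subset of example indices, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(dataset_size), subset_size))

# Hypothetical numbers: 567k examples sampled down to a 10k subset.
subset = sample_indices(dataset_size=567_000, subset_size=10_000)
print(len(subset))
```

Fixing the seed keeps the subset reproducible across runs, which matters when the sampled data feeds a fine-tuning recipe.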
## Evaluation

### llm-jp-eval (v1.3.1)

| Model name | average | EL | FA | HE | MC | MR | MT | NLI | QA | RC |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| [llm-jp-3-1.8b](https://huggingface.co/llm-jp/llm-jp-3-1.8b) | 0.3767 | 0.3725 | 0.1948 | 0.2350 | 0.2500 | 0.0900 | 0.7730 | 0.3080 | 0.4629 | 0.7040 |
| [llm-jp-3-1.8b-instruct](https://huggingface.co/llm-jp/llm-jp-3-1.8b-instruct) | 0.4967 | 0.3695 | 0.1892 | 0.3850 | 0.4200 | 0.4200 | 0.7989 | 0.3700 | 0.5016 | 0.7729 |
| [llm-jp-3-3.7b](https://huggingface.co/llm-jp/llm-jp-3-3.7b) | 0.4231 | 0.3812 | 0.2440 | 0.2200 | 0.1900 | 0.3600 | 0.7947 | 0.3800 | 0.4688 | 0.7694 |
| [llm-jp-3-3.7b-instruct](https://huggingface.co/llm-jp/llm-jp-3-3.7b-instruct) | 0.5258 | 0.4220 | 0.2418 | 0.3950 | 0.5900 | 0.5600 | 0.8088 | 0.4260 | 0.4765 | 0.8123 |
| [llm-jp-3-13b](https://huggingface.co/llm-jp/llm-jp-3-13b) | 0.5802 | 0.5570 | 0.2593 | 0.4600 | 0.7000 | 0.6300 | 0.8292 | 0.3460 | 0.5937 | 0.8469 |
| [llm-jp-3-13b-instruct](https://huggingface.co/llm-jp/llm-jp-3-13b-instruct) | 0.6239 | 0.4848 | 0.2622 | 0.5300 | 0.9300 | 0.7000 | 0.8262 | 0.4460 | 0.5703 | 0.8659 |
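As a quick consistency check (not part of the original README), the reported `average` generally matches the unweighted mean of the nine category scores; for example, for llm-jp-3-1.8b:

```python
# llm-jp-eval category scores for llm-jp-3-1.8b, taken from the table above
# (EL, FA, HE, MC, MR, MT, NLI, QA, RC).
scores = [0.3725, 0.1948, 0.2350, 0.2500, 0.0900, 0.7730, 0.3080, 0.4629, 0.7040]
average = sum(scores) / len(scores)
print(round(average, 4))  # 0.3767, matching the table's average column
```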

### Japanese MT Bench

| Model name | average | coding | extraction | humanities | math | reasoning | roleplay | stem | writing |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| [llm-jp-3-1.8b-instruct](https://huggingface.co/llm-jp/llm-jp-3-1.8b-instruct) | 4.64 | 2.80 | 3.55 | 7.05 | 2.45 | 2.80 | 6.90 | 5.40 | 6.20 |
| [llm-jp-3-3.7b-instruct](https://huggingface.co/llm-jp/llm-jp-3-3.7b-instruct) | 5.28 | 2.75 | 5.65 | 7.05 | 2.25 | 3.65 | 8.25 | 5.85 | 6.75 |
| [llm-jp-3-13b-instruct](https://huggingface.co/llm-jp/llm-jp-3-13b-instruct) | 6.66 | 4.30 | 6.90 | 9.00 | 3.60 | 5.75 | 8.55 | 7.50 | 7.70 |
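Likewise (a consistency check, not part of the original README), each Japanese MT Bench `average` agrees with the unweighted mean of its eight per-category scores to within rounding of the two reported decimals:

```python
# (reported average, [coding, extraction, humanities, math, reasoning, roleplay, stem, writing])
reported = {
    "llm-jp-3-1.8b-instruct": (4.64, [2.80, 3.55, 7.05, 2.45, 2.80, 6.90, 5.40, 6.20]),
    "llm-jp-3-3.7b-instruct": (5.28, [2.75, 5.65, 7.05, 2.25, 3.65, 8.25, 5.85, 6.75]),
    "llm-jp-3-13b-instruct":  (6.66, [4.30, 6.90, 9.00, 3.60, 5.75, 8.55, 7.50, 7.70]),
}
for name, (avg, cats) in reported.items():
    mean = sum(cats) / len(cats)
    # Half a unit in the last reported digit, with slack for ties like 5.275 -> 5.28.
    assert abs(mean - avg) < 0.006, name
print("all averages consistent")
```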
## Risks and Limitations
The models released here are in the early stages of our research and development and have not been tuned to ensure outputs align with human intent and safety considerations.