matsuo-lab/weblab-10b-instruction-sft

matsuo-lab committed on Aug 17, 2023

Commit 1aeb8d7 • 1 parent: f8f840d

Update README.md

Files changed (1)

README.md +6 -3

@@ -49,12 +49,15 @@ This repository provides a Japanese-centric multilingual GPT-NeoX model of 10 bi
 * **Japanese benchmark**
-    - *The 4-task average accuracy is based on results of JCommonsenseQA, JNLI, MARC-ja, and JSQuAD.*
     | Model | Average | JCommonsenseQA | JNLI | MARC-ja | JSQuAD |
     | :-- | :-- | :-- | :-- | :-- | :-- |
-    | weblab-10b-instruction-sft | 79.04 | 74.35 | 65.65 | 96.06 | 80.09 |
-    | weblab-10b | 67.27 | 65.86 | 54.19 | 84.49 | 64.54 |
 ---

 * **Japanese benchmark**
+    - *We used the [Stability-AI/lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/2f1583c0735eacdfdfa5b7d656074b69577b6774) library for evaluation.*
+    - *The 4-task average accuracy is based on results of JCommonsenseQA-1.1, JNLI-1.1, MARC-ja-1.1, and JSQuAD-1.1.*
+    - *Model loading is performed with float16, and evaluation is performed with template version 0.3 using few-shot in-context learning.*
+    - *The numbers of few-shot examples are 3, 3, 3, and 2.*
     | Model | Average | JCommonsenseQA | JNLI | MARC-ja | JSQuAD |
     | :-- | :-- | :-- | :-- | :-- | :-- |
+    | weblab-10b-instruction-sft | 78.78 | 74.35 | 65.65 | 96.06 | 79.04 |
+    | weblab-10b | 66.38 | 65.86 | 54.19 | 84.49 | 60.98 |
 ---
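
The Average column in both the removed and the added rows can be checked by recomputing the mean of the four task scores. The sketch below assumes the average is the unweighted arithmetic mean, rounded half-up to two decimals; the helper name `four_task_average` is illustrative and is not part of the repository or the evaluation harness. `Decimal` is used so that a value such as 78.775 is not rounded down by binary floating point.

```python
from decimal import Decimal, ROUND_HALF_UP


def four_task_average(scores):
    """Unweighted mean of the task scores, rounded half-up to 2 decimals.

    Assumption: this is how the Average column is derived; the diff
    itself only says the average is "based on" the four tasks.
    """
    total = sum((Decimal(s) for s in scores), Decimal("0"))
    mean = total / Decimal(len(scores))
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Scores in column order: JCommonsenseQA, JNLI, MARC-ja, JSQuAD.
# Each entry pairs the four scores with the Average reported in the diff.
ROWS = {
    "weblab-10b-instruction-sft (removed row)": (["74.35", "65.65", "96.06", "80.09"], "79.04"),
    "weblab-10b (removed row)": (["65.86", "54.19", "84.49", "64.54"], "67.27"),
    "weblab-10b-instruction-sft (added row)": (["74.35", "65.65", "96.06", "79.04"], "78.78"),
    "weblab-10b (added row)": (["65.86", "54.19", "84.49", "60.98"], "66.38"),
}

for name, (scores, reported) in ROWS.items():
    computed = four_task_average(scores)
    status = "matches" if str(computed) == reported else "differs from"
    print(f"{name}: computed {computed} {status} reported {reported}")
```

Under that assumption, only the JSQuAD score changes between the removed and added rows, and the Average column moves accordingly.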