matsuo-lab committed on
Commit 1aeb8d7
1 Parent(s): f8f840d

Update README.md

Files changed (1)
  1. README.md +6 -3
README.md CHANGED
@@ -49,12 +49,15 @@ This repository provides a Japanese-centric multilingual GPT-NeoX model of 10 bi

  * **Japanese benchmark**

- - *The 4-task average accuracy is based on results of JCommonsenseQA, JNLI, MARC-ja, and JSQuAD.*
+ - *We used [Stability-AI/lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/2f1583c0735eacdfdfa5b7d656074b69577b6774) library for evaluation.*
+ - *The 4-task average accuracy is based on results of JCommonsenseQA-1.1, JNLI-1.1, MARC-ja-1.1, and JSQuAD-1.1.*
+ - *model loading is performed with float16, and evaluation is performed with template version 0.3 using the few-shot in-context learning.*
+ - *The number of few-shots is 3,3,3,2.*

  | Model | Average | JCommonsenseQA | JNLI | MARC-ja | JSQuAD |
  | :-- | :-- | :-- | :-- | :-- | :-- |
- | weblab-10b-instruction-sft | 79.04 | 74.35 | 65.65 | 96.06 | 80.09 |
- | weblab-10b | 67.27 | 65.86 | 54.19 | 84.49 | 64.54 |
+ | weblab-10b-instruction-sft | 78.78 | 74.35 | 65.65 | 96.06 | 79.04 |
+ | weblab-10b | 66.38 | 65.86 | 54.19 | 84.49 | 60.98 |

  ---

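
The updated Average column can be sanity-checked against the four per-task scores. The sketch below is only a consistency check, not the authors' evaluation code: the function name `four_task_average` is hypothetical, and it assumes the average is a plain mean of the published two-decimal scores, rounded half-up. The README may instead average unrounded scores.

```python
from decimal import Decimal, ROUND_HALF_UP


def four_task_average(scores):
    """Mean of per-task scores (given as strings), rounded half-up to 2 decimals.

    Assumption: the README's Average column is a plain arithmetic mean of the
    published per-task numbers; this is not confirmed by the commit itself.
    """
    total = sum(Decimal(s) for s in scores)
    return (total / len(scores)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Per-task scores from the updated table, in the order
# JCommonsenseQA, JNLI, MARC-ja, JSQuAD.
updated_rows = {
    "weblab-10b-instruction-sft": ["74.35", "65.65", "96.06", "79.04"],
    "weblab-10b": ["65.86", "54.19", "84.49", "60.98"],
}

for model, scores in updated_rows.items():
    print(model, four_task_average(scores))
```

Under these assumptions the computed means are 78.78 and 66.38, matching the new Average column. Only the JSQuAD scores and the averages changed in this commit, so the drop in Average follows directly from the lower JSQuAD values.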