HachiML committed on
Commit
bf12e52
1 Parent(s): 3b5e67e

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -8,17 +8,17 @@ language:
  ---
  ## JGLUE Score
  I evaluated this model using the following JGLUE tasks. Here are the scores:
- | Task | Llama-2-7b-hf | Llama-2-13b-hf | This Model |
- |---------------------|:-----------------:|:-----------------:|:----------:|
- | JCOMMONSENSEQA(acc) | 51.56 | 75.06 | 75.78 |
- | JNLI(acc) | 29.74 | 22.18 | 50.69 |
- | MARC_JA(acc) | 85.72 | - | 79.64 |
- | JSQUAD(exact_match) | 64.16 | 76.13 | 62.83 |
- | **Average** | **57.79** | **-** | **67.23** |
+ | Task | Llama-2-13b-hf(*) | This Model |
+ |---------------------|:-----------------:|:----------:|
+ | JCOMMONSENSEQA(acc) | 75.06 | 75.78 |
+ | JNLI(acc) | 22.18 | 50.69 |
+ | MARC_JA(acc) | 38.83 | 79.64 |
+ | JSQUAD(exact_match) | 76.13 | 62.83 |
+ | **Average** | **53.05** | **67.23** |
  - Note: Use v0.3 prompt template
  - The JGLUE scores were measured using the following script:
  [Stability-AI/lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable)
- - (*) Refer to the following article: [Google Colab での JP Language Model Evaluation Harness による日本語LLMの評価手順](https://note.com/npaka/n/nedf4dacd4037)
+ - (*) A similar method was used to measure these scores.
 
  ## How to use
 
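The Average row in the updated table can be sanity-checked with a few lines of Python. This is only a sketch: it assumes the average is the unweighted arithmetic mean of the four task scores, which the README does not state explicitly, and the names `scores` and `averages` are illustrative.

```python
from statistics import mean

# Per-task scores copied from the updated table, in the order
# JCOMMONSENSEQA, JNLI, MARC_JA, JSQUAD.
scores = {
    "Llama-2-13b-hf": [75.06, 22.18, 38.83, 76.13],
    "This Model": [75.78, 50.69, 79.64, 62.83],
}

# Assumption: "Average" is the plain arithmetic mean over the four tasks.
averages = {name: mean(values) for name, values in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.3f}")
```

Under that assumption the unrounded means are 53.05 and 67.235, which agree with the reported 53.05 and 67.23 to within rounding at the second decimal place.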