---
tags:
- Multilingual
license: mit
language:
- af
- am
- ar
- hy
- as
- ast
- az
- be
- bn
- bs
- bg
- my
- ca
- ceb
- zho
- hr
- cs
- da
- nl
- en
- et
- tl
- fi
- fr
- ff
- gl
- lg
- ka
- de
- el
- gu
- ha
- he
- hi
- hu
- is
- ig
- id
- ga
- it
- ja
- jv
- kea
- kam
- kn
- kk
- km
- ko
- ky
- lo
- lv
- ln
- lt
- luo
- lb
- mk
- ms
- ml
- mt
- mi
- mr
- mn
- ne
- ns
- no
- ny
- oc
- or
- om
- ps
- fa
- pl
- pt
- pa
- ro
- ru
- sr
- sn
- sd
- sk
- sl
- so
- ku
- es
- sw
- sv
- tg
- ta
- te
- th
- tr
- uk
- umb
- ur
- uz
- vi
- cy
- wo
- xh
- yo
- zu
---

### Model Sources
- **Paper**: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
- **Link**: https://arxiv.org/pdf/2407.05975
- **Repository**: https://github.com/CONE-MT/LLaMAX/

### Model Description

🔥 LLaMAX2-7B-MetaMath is a full fine-tune of the powerful multilingual model LLaMAX2-7B on the MetaMathQA dataset.

🔥 Compared with [MetaMath-7B](https://huggingface.co/meta-math/MetaMath-7B-V1.0), LLaMAX2-7B-MetaMath performs significantly better at mathematical reasoning in low-resource languages, improving their average accuracy on the MGSM dataset by up to 18.8%.

🔥 LLaMAX2-7B-MetaMath demonstrates strong multilingual math reasoning in all languages, improving the average accuracy on the MGSM dataset by 6.2% across all languages.

### Experiments
We evaluated LLaMAX2-7B-MetaMath on the MGSM dataset. Compared with MetaMath-7B, LLaMAX2-7B-MetaMath achieves leading results on both high-resource languages (Hrl.) and low-resource languages (Lrl.).

| MGSM                     | Avg.  | Lrl. | Hrl. | Bn   | Th   | Sw   | Ja   | Zh   | De   | Fr   | Ru   | Es   | En   |
|--------------------------|-------|------|------|------|------|------|------|------|------|------|------|------|------|
| MetaMath-7B (official)   | 38.32 | 6.9  | 51.8 | 6.8  | 7.2  | 6.8  | 36.4 | 38.4 | 55.2 | 54.4 | 52.0 | 57.2 | 68.8 |
| MetaMath-7B (reproduced) | 38.08 | 6.8  | 51.5 | 6.0  | 10.0 | 4.4  | 36.4 | 42.8 | 52.8 | 56.0 | 48.8 | 58.8 | 64.8 |
| LLaMAX2-7B-MetaMath      | 44.28 | 25.6 | 52.3 | 26.8 | 24.0 | 26.0 | 35.6 | 42.4 | 56.8 | 55.2 | 53.6 | 56.8 | 65.6 |
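The Avg., Lrl. and Hrl. columns are consistent with simple means over the per-language scores, with Lrl. covering Bn, Th and Sw and Hrl. covering the remaining seven languages. A quick sanity check over the LLaMAX2-7B-MetaMath row:

```python
# Check Avg., Lrl. and Hrl. as plain means over the per-language MGSM scores
# (Lrl. = Bn, Th, Sw; Hrl. = the remaining seven), using the LLaMAX2-7B-MetaMath row.
scores = {
    "Bn": 26.8, "Th": 24.0, "Sw": 26.0,              # low-resource languages
    "Ja": 35.6, "Zh": 42.4, "De": 56.8, "Fr": 55.2,  # high-resource languages
    "Ru": 53.6, "Es": 56.8, "En": 65.6,
}
lrl = [scores[l] for l in ("Bn", "Th", "Sw")]
hrl = [v for l, v in scores.items() if l not in ("Bn", "Th", "Sw")]

print(round(sum(lrl) / len(lrl), 1))                 # 25.6
print(round(sum(hrl) / len(hrl), 1))                 # 52.3
print(round(sum(scores.values()) / len(scores), 2))  # 44.28
```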



### Model Usage

Prompt template:
```python
def Prompt_template(query):
    prompt = (
         "Below is an instruction that describes a task. "
         "Write a response that appropriately completes the request.\n\n"
         f"### Instruction:\n{query}\n\n### Response: Let's think step by step."
    )
    return prompt
```

Code Example:
```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Replace the placeholders with the local paths (or Hub name) of the converted model and tokenizer.
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

query = "Bert fills out the daily crossword puzzle in the newspaper every day. He uses a pencil to fill out the puzzles every two weeks. On average, it takes him 1050 words to use up a pencil. How many words are in each crossword puzzle on average?"
prompt = Prompt_template(query)
inputs = tokenizer(prompt, return_tensors="pt")

# Allow enough new tokens for the full step-by-step solution.
generate_ids = model.generate(inputs.input_ids, max_new_tokens=512)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])

# => "If Bert uses up a pencil to fill out the puzzles every two weeks and it takes him 1050
# words to use up a pencil, then he must be filling out 1050 words of crossword puzzles every
# two weeks. To find out how many words are in each daily crossword puzzle, we need to divide
# the total number of words (1050) by the number of days in two weeks (14). So, there are
# 1050/14 = 75 words in each daily crossword puzzle on average. #### The answer is: 75"
```
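The completion ends with a final line of the form `#### The answer is: <number>`, as in the example output above. If you only need the numeric answer, a minimal extraction helper (the function name and regex below are illustrative, not part of the released code) could look like this:

```python
import re

def extract_answer(completion: str) -> str | None:
    """Pull the final numeric answer out of a MetaMath-style completion.

    Relies on the '#### The answer is: <number>' pattern shown in the example
    output above; returns None if the pattern is absent.
    """
    match = re.search(r"####\s*The answer is:\s*(-?[\d.,]+)", completion)
    return match.group(1).rstrip(".,") if match else None

print(extract_answer("... 1050/14 = 75 words ... #### The answer is: 75"))  # 75
```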

### Citation
If our model helps your work, please cite this paper:

```
@article{lu2024llamax,
  title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
  author={Lu, Yinquan and Zhu, Wenhao and Li, Lei and Qiao, Yu and Yuan, Fei},
  journal={arXiv preprint arXiv:2407.05975},
  year={2024}
}
```