Model Sources

Model Description

🔥 LLaMAX-7B-X-CSQA is a multilingual commonsense reasoning model. It is obtained by fully fine-tuning the multilingual model LLaMAX-7B on five English commonsense reasoning datasets: X-CSQA, ARC-Easy, ARC-Challenge, OpenBookQA, and QASC.

🔥 Compared with Llama-2 fine-tuned under the same setting, LLaMAX-7B-X-CSQA improves average accuracy on the X-CSQA dataset by 4.2 points (from 50.9 to 55.1).


X-CSQA             Avg.    Sw    Ur    Hi    Ar    Vi    Ja    Pl    Zh    Nl    Ru    It    De    Pt    Fr    Es    En
Llama2-7B-X-CSQA   50.9  23.2  24.7  32.9  32.4  51.0  50.0  51.5  55.6  56.9  55.8  58.8  59.9  60.4  61.8  61.9  78.1
LLaMAX-7B-X-CSQA   55.1  43.5  39.0  44.1  45.1  54.0  49.9  54.6  58.2  58.9  57.1  59.1  59.0  60.9  61.6  62.7  74.0
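
The averages and per-language differences can be recomputed from the table. The short sketch below does this with the values copied from the two rows above; the variable names are illustrative only and are not part of any released evaluation script.

```python
# Per-language X-CSQA accuracy (%), copied from the table above.
languages = ["Sw", "Ur", "Hi", "Ar", "Vi", "Ja", "Pl", "Zh",
             "Nl", "Ru", "It", "De", "Pt", "Fr", "Es", "En"]
llama2 = [23.2, 24.7, 32.9, 32.4, 51.0, 50.0, 51.5, 55.6,
          56.9, 55.8, 58.8, 59.9, 60.4, 61.8, 61.9, 78.1]
llamax = [43.5, 39.0, 44.1, 45.1, 54.0, 49.9, 54.6, 58.2,
          58.9, 57.1, 59.1, 59.0, 60.9, 61.6, 62.7, 74.0]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Difference between the two averages, in percentage points.
avg_gain = mean(llamax) - mean(llama2)

# Per-language difference (LLaMAX minus Llama2), rounded to one decimal.
deltas = {lang: round(x - b, 1) for lang, b, x in zip(languages, llama2, llamax)}
largest_gain = max(deltas, key=deltas.get)
largest_drop = min(deltas, key=deltas.get)

print(f"Average gain: {avg_gain:.1f} points")
print(f"Largest gain: {largest_gain} ({deltas[largest_gain]:+.1f})")
print(f"Largest drop: {largest_drop} ({deltas[largest_drop]:+.1f})")
```

With these numbers the average gain is 4.2 points. The largest improvement is on Sw (+20.3), while English accuracy drops by 4.1 points.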

Model Usage

Code Example:

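from transformers import AutoTokenizer, LlamaForCausalLM

# PATH_TO_CONVERTED_WEIGHTS and PATH_TO_CONVERTED_TOKENIZER are placeholders for the model and tokenizer locations.
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

# The prompt is one string: the question, the tab-separated options, then "Answer:".
query = "What is someone operating a vehicle likely to be accused of after becoming inebriated? \n Options: A.punish \t B. arrest \t C. automobile accidents \t D. talking nonsense \t E.drunk driving \n Answer:"
inputs = tokenizer(query, return_tensors="pt")

# max_new_tokens limits only the generated answer; the prompt alone is likely longer than the original max_length=30.
generate_ids = model.generate(inputs.input_ids, max_new_tokens=5)
# Decode only the newly generated tokens so the result is the answer, not the prompt followed by the answer.
answer_ids = generate_ids[:, inputs.input_ids.shape[1]:]
tokenizer.batch_decode(answer_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => E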
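
For other questions, the prompt can be assembled programmatically. The helpers below are a minimal sketch: `build_prompt` and `extract_choice` are hypothetical names, not part of the released model or of transformers. The layout (question, `Options:` with tab-separated lettered choices, then `Answer:`) is inferred from the single example above, and the exact spacing used during fine-tuning is an assumption.

```python
import re
from string import ascii_uppercase


def build_prompt(question, options):
    """Format a multiple-choice question in the layout of the example query.

    Assumption: the "A. text" spacing is a guess; the example above mixes
    "A.punish" and "B. arrest".
    """
    labeled = [f"{letter}. {text}" for letter, text in zip(ascii_uppercase, options)]
    return f"{question} \n Options: " + " \t ".join(labeled) + " \n Answer:"


def extract_choice(decoded, num_options=5):
    """Return the first standalone option letter after the last "Answer:", or None."""
    tail = decoded.rsplit("Answer:", 1)[-1]
    valid = ascii_uppercase[:num_options]
    match = re.search(rf"\b([{valid}])\b", tail)
    return match.group(1) if match else None


prompt = build_prompt(
    "What is someone operating a vehicle likely to be accused of after becoming inebriated?",
    ["punish", "arrest", "automobile accidents", "talking nonsense", "drunk driving"],
)
print(prompt)
```

The string returned by `build_prompt` can be passed to the tokenizer in place of `query`, and `extract_choice` can be applied to the decoded generation to recover the predicted letter.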


If our model helps your work, please cite this paper:

  title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
  author={Lu, Yinquan and Zhu, Wenhao and Li, Lei and Qiao, Yu and Yuan, Fei},
  journal={arXiv preprint arXiv:2407.05975},