MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
Abstract
English-centric large language models (LLMs) often show strong multilingual capabilities. However, the multilingual performance of these models remains unclear and is not thoroughly evaluated for many languages. Most multilinguality benchmarks focus on classic NLP tasks or cover only a small number of languages. We introduce MEXA, a method for assessing the multilingual capabilities of pre-trained English-centric LLMs using parallel sentences, which are available for more languages than existing downstream tasks. MEXA leverages the fact that English-centric LLMs use English as a kind of pivot language in their intermediate layers. It computes the alignment between English and non-English languages using parallel sentences to evaluate the transfer of language understanding from English to other languages. This alignment can be used to estimate model performance in other languages. We conduct studies using various parallel datasets (FLORES-200 and Bible), models (Llama family, Gemma family, Mistral, and OLMo), and established downstream tasks (Belebele, m-MMLU, and m-ARC). We explore different methods to compute embeddings in decoder-only models. Our results show that MEXA, in its default settings, achieves a statistically significant average Pearson correlation of 0.90 with three established downstream tasks across nine models and two parallel datasets. This suggests that MEXA is a reliable method for estimating the multilingual capabilities of English-centric LLMs, providing a clearer understanding of their multilingual potential and the inner workings of LLMs. Leaderboard: https://huggingface.co/spaces/cis-lmu/Mexa, Code: https://github.com/cisnlp/Mexa.
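As a rough illustration of the alignment idea described in the abstract, the sketch below mean-pools hidden states from one intermediate layer of a decoder-only LLM and scores English/non-English alignment as parallel-sentence retrieval accuracy. The model name, layer index, pooling strategy, and retrieval-style scoring criterion are assumptions made for illustration only; the exact MEXA scoring and embedding options are defined in the linked code repository.

```python
# Minimal sketch (not the official MEXA implementation): mean-pool hidden states
# from one intermediate layer and score alignment as parallel-sentence retrieval
# accuracy. Model name, layer choice, pooling, and scoring are illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.1-8B"  # placeholder: any English-centric decoder-only LLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

if tokenizer.pad_token is None:          # decoder-only tokenizers often lack a pad token
    tokenizer.pad_token = tokenizer.eos_token


@torch.no_grad()
def embed(sentences, layer):
    """Sentence embeddings = mean of one layer's hidden states over non-pad tokens."""
    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    hidden = model(**batch, output_hidden_states=True).hidden_states[layer]  # (B, T, D)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)


def alignment_score(english, other, layer=16):
    """Fraction of non-English sentences whose nearest English embedding
    (by cosine similarity) is their actual translation -- a retrieval-style
    proxy for the alignment score described in the abstract."""
    en = F.normalize(embed(english, layer), dim=-1)
    xx = F.normalize(embed(other, layer), dim=-1)
    sims = xx @ en.T                      # (N, N) cosine-similarity matrix
    return (sims.argmax(dim=1) == torch.arange(len(other))).float().mean().item()


# Toy usage with two illustrative parallel sentences; in practice MEXA uses
# parallel data such as FLORES-200 or Bible verses and aggregates over layers.
en_sents = ["The cat sleeps on the sofa.", "Rain is expected tomorrow."]
de_sents = ["Die Katze schläft auf dem Sofa.", "Morgen wird Regen erwartet."]
print(alignment_score(en_sents, de_sents, layer=16))
```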
Community
Introducing MEXA, a method for assessing the multilingual capabilities of English-centric LLMs using parallel sentences. It estimates how many languages an LLM covers and at what level.
Leaderboard: https://huggingface.co/spaces/cis-lmu/Mexa
Code: https://github.com/cisnlp/Mexa