arxiv:2403.03640

Apollo: Lightweight Multilingual Medical LLMs towards Democratizing Medical AI to 6B People

Published on Mar 6

Authors: Xidong Wang, Anningzhe Gao, Benyou Wang

Abstract

Although global medical knowledge is predominantly recorded in English, local languages are crucial for delivering tailored healthcare services, particularly in areas with limited medical resources. To extend the reach of medical AI advancements to a broader population, we aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6.1 billion. This effort culminates in the creation of the ApolloCorpora multilingual medical dataset and the XMedBench benchmark. On this multilingual medical benchmark, the released Apollo models, at various relatively small sizes (i.e., 0.5B, 1.8B, 2B, 6B, and 7B), achieve the best performance among models of equivalent size. Notably, Apollo-7B is the state-of-the-art multilingual medical LLM among models of up to 70B parameters. Additionally, these lite models can be used to improve the multilingual medical capabilities of larger models without fine-tuning, in a proxy-tuning fashion. We will open-source the training corpora, code, model weights, and evaluation benchmark.
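
The abstract does not spell out how the proxy-tuning step works. As a rough illustration only, the sketch below assumes the commonly described logit-arithmetic form of proxy-tuning: at each decoding step, the large base model's next-token logits are shifted by the difference between a small tuned model and its untuned counterpart, with all three models sharing one vocabulary. The function names and the toy logit values are hypothetical and are not taken from the Apollo code or results.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_distribution(base_large, tuned_small, untuned_small):
    """Next-token distribution under the assumed proxy-tuning rule.

    shifted = base_large + (tuned_small - untuned_small), then softmax.
    All three logit vectors must be over the same vocabulary.
    """
    if not (len(base_large) == len(tuned_small) == len(untuned_small)):
        raise ValueError("all models must share one vocabulary")
    shifted = [
        b + (t - u)
        for b, t, u in zip(base_large, tuned_small, untuned_small)
    ]
    return softmax(shifted)


# Toy 4-token vocabulary; every number here is made up for illustration.
base_large = [2.0, 1.0, 0.5, 0.0]     # large model, not medically tuned
tuned_small = [0.5, 2.5, 0.0, 0.0]    # small model after medical tuning
untuned_small = [1.0, 0.5, 0.0, 0.0]  # the same small model before tuning

probs = proxy_tuned_distribution(base_large, tuned_small, untuned_small)
best_before = softmax(base_large).index(max(softmax(base_large)))
best_after = probs.index(max(probs))
print(best_before, best_after)
```

In this toy case the small tuned model's preference for the second token is transferred to the large model's distribution, while the large model's own weights are never updated.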

Models citing this paper 23

Datasets citing this paper 3

Spaces citing this paper 2

Collections including this paper 1