Llama3-KALE-LM-Chem-8B

Introduction

We are thrilled to present Llama3-KALE-LM-Chem-8B, our first open-source KALE-LM (Knowledge And Logic Enhanced Large Model), which specializes in chemistry.

Training Details

We have continually pre-trained the model with a large amount of data and post-trained it through supervised fine-tuning.

Benchmarks

Open Benchmarks

| Models | ChemBench | MMLU | MMLU-Chem | SciQ | IE(Acc) | IE(LS) |
|---|---|---|---|---|---|---|
| GPT-3.5 | 47.15 | 69.75 | 53.32 | 89.6 | 52.98 | 68.28 |
| GPT-4 | 53.72 | 78.67 | 63.70 | 94.10 | 54.20 | 69.74 |
| Llama3-8B-Instruct | 46.02 | 68.3 | 51.10 | 93.30 | 45.83 | 61.22 |
| LlaSMol | 28.47 | 54.47 | 33.24 | 72.30 | 2.16 | 3.23 |
| ChemDFM | 44.44 | 58.11 | 45.60 | 86.70 | 7.61 | 11.49 |
| ChemLLM-7B-Chat | 34.16 | 61.79 | 48.39 | 94.00 | 29.66 | 39.17 |
| ChemLLM-7B-Chat-1.5-SFT | 42.75 | 63.56 | 49.63 | 95.10 | 14.96 | 19.61 |
| Llama3-KALE-LM-Chem-8B | 52.40 | 68.74 | 53.83 | 91.50 | 67.50 | 78.37 |

ChemBench Details (Evaluated By OpenCompass)

| Models | NC | PP | M2C | C2M | PP | RS | YP | TP | SP | Average |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-3.5 | 46.93 | 56.98 | 85.28 | 38.25 | 43.67 | 42.33 | 30.33 | 42.57 | 38 | 47.15 |
| GPT-4 | 54.82 | 65.02 | 92.64 | 52.88 | 62.67 | 52.67 | 42.33 | 24.75 | 35.67 | 53.72 |
| Llama3-8B-Instruct | 51.31 | 27.79 | 90.30 | 40.88 | 34.00 | 30.00 | 45.33 | 60.89 | 33.67 | 46.02 |
| LlaSMol | 27.78 | 29.34 | 31.44 | 23.38 | 25.67 | 24.00 | 37.33 | 34.65 | 22.67 | 28.47 |
| ChemDFM | 36.92 | 55.57 | 83.95 | 42.00 | 40.00 | 37.33 | 39.00 | 33.17 | 32.00 | 44.44 |
| ChemLLM-7B-Chat | 41.05 | 29.76 | 85.28 | 26.12 | 26.00 | 24.00 | 20.00 | 24.26 | 31.00 | 34.16 |
| ChemLLM-7B-Chat-1.5-SFT | 50.06 | 49.51 | 85.28 | 38.75 | 38.00 | 26.67 | 28.33 | 31.68 | 33.67 | 42.44 |
| Llama3-KALE-LM-Chem-8B | 63.58 | 58.39 | 92.98 | 44.50 | 48.67 | 38.33 | 46.33 | 44.55 | 34.33 | 52.41 |
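The Average column appears to be the unweighted mean of the nine task scores. The sketch below checks that assumption against three rows copied from the ChemBench table above; the helper name `mean_score` is ours and is not part of OpenCompass.

```python
# Rows copied from the ChemBench table: nine task scores, then the reported average.
# Assumption: "Average" is the plain arithmetic mean of the nine task columns.
rows = {
    "GPT-3.5": ([46.93, 56.98, 85.28, 38.25, 43.67, 42.33, 30.33, 42.57, 38.00], 47.15),
    "GPT-4": ([54.82, 65.02, 92.64, 52.88, 62.67, 52.67, 42.33, 24.75, 35.67], 53.72),
    "Llama3-KALE-LM-Chem-8B": ([63.58, 58.39, 92.98, 44.50, 48.67, 38.33, 46.33, 44.55, 34.33], 52.41),
}


def mean_score(scores):
    """Return the unweighted mean of the task scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


for name, (scores, reported) in rows.items():
    print(f"{name}: computed {mean_score(scores):.2f}, reported {reported:.2f}")
```

For these three rows the computed mean matches the reported Average to two decimal places.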

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-8B"

# device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the device that holds the model
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Pass input_ids and attention_mask together
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# Keep only the newly generated tokens, dropping the prompt
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
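For repeated queries, the Quick Start steps above can be wrapped in a small helper. This is a minimal sketch, not part of the released code: `build_messages` and `ask` are hypothetical names, the `max_new_tokens=256` default is an arbitrary choice, and `model` and `tokenizer` are assumed to be the objects loaded above.

```python
def build_messages(question, system_prompt="You are a helpful assistant."):
    """Build the chat-format message list used in the Quick Start (hypothetical helper)."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def ask(model, tokenizer, question, max_new_tokens=256):
    """Generate one reply for `question` (hypothetical helper mirroring the Quick Start)."""
    # Render the conversation with the model's chat template
    text = tokenizer.apply_chat_template(
        build_messages(question),
        tokenize=False,
        add_generation_prompt=True,
    )
    # Tokenize and move the inputs to the device that holds the model
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated reply is decoded
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the model loaded, a chemistry question could then be asked with `print(ask(model, tokenizer, "What is the IUPAC name of CC(=O)O?"))`.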

Cite This Work

@article{dai2024kale,
  title={KALE-LM: Unleash The Power Of AI For Science Via Knowledge And Logic Enhanced Large Model},
  author={Dai, Weichen and Chen, Yezeng and Dai, Zijie and Huang, Zhijie and Liu, Yubo and Pan, Yixuan and Song, Baiyang and Zhong, Chengli and Li, Xinhe and Wang, Zeyu and others},
  journal={arXiv preprint arXiv:2409.18695},
  year={2024}
}