metadata
license: apache-2.0
base_model:
- internlm/internlm3-8b-instruct
tags:
- llama
- internlm3
Converted Llama from InternLM3-8B-Instruct
Description
This model is InternLM3-8B-Instruct converted to the LLaMA format. The conversion lets you use InternLM3-8B-Instruct as if it were a LLaMA model, which is convenient for some inference use cases. The precision is exactly the same as that of the original model.
Usage
You can load the model with the LlamaForCausalLM class, as shown below:
from transformers import AutoTokenizer, LlamaForCausalLM

device = "cuda"  # the device to load the model onto: "cpu" or "cuda"
attn_impl = "eager"  # the attention implementation to use

# Prompt (English): "Large models and AI have seen two years of rapid progress;
# on this theme, write a New Year's message for AI practitioners."
prompt = "大模型和人工智能经历了两年的快速发展,请你以此主题对人工智能的从业者写一段新年寄语"
system_prompt = """You are an AI assistant whose name is InternLM (书生·浦语).
- InternLM (书生·浦语) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.
- InternLM (书生·浦语) can understand and communicate fluently in the language chosen by the user such as English and 中文."""
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt},
]

# Build the chat-formatted input text and tokenize it.
tokenizer = AutoTokenizer.from_pretrained("silence09/InternLM3-8B-Instruct-Converted-LlaMA", trust_remote_code=True)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
print(prompt)

# Load the converted weights with the stock LLaMA implementation.
llama_model = LlamaForCausalLM.from_pretrained(
    "silence09/InternLM3-8B-Instruct-Converted-LlaMA",
    torch_dtype="auto",
    attn_implementation=attn_impl,
).to(device)

# Greedy decoding, so the output is deterministic.
llama_generated_ids = llama_model.generate(
    model_inputs.input_ids,
    attention_mask=model_inputs.attention_mask,
    max_new_tokens=100,
    do_sample=False,
)
# Drop the prompt tokens so that only the newly generated tokens remain.
llama_generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, llama_generated_ids)
]
llama_response = tokenizer.batch_decode(llama_generated_ids, skip_special_tokens=True)[0]
print(llama_response)
Precision Guarantee
To compare results with the original model, you can use this code.
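The comparison script itself is not reproduced here. As a rough illustration of one kind of check it could involve, the sketch below compares two sequences of generated token ids and reports where they first diverge. The helper name `first_mismatch` and the sample ids are hypothetical and not taken from the author's script; matching token ids under greedy decoding is also a weaker check than comparing logits directly.

```python
from typing import Optional, Sequence


def first_mismatch(ids_a: Sequence[int], ids_b: Sequence[int]) -> Optional[int]:
    """Return the index of the first position where the two token-id
    sequences differ, or None if they are identical."""
    for i, (a, b) in enumerate(zip(ids_a, ids_b)):
        if a != b:
            return i
    # One sequence may be a strict prefix of the other.
    if len(ids_a) != len(ids_b):
        return min(len(ids_a), len(ids_b))
    return None


# Illustrative token ids only (hypothetical values). In practice these would be
# the id lists produced by generate() on the original and converted models.
original_ids = [1, 345, 678, 910, 2]
converted_ids = [1, 345, 678, 910, 2]
print(first_mismatch(original_ids, converted_ids))  # None
```

With `do_sample=False`, as in the usage example, both models decode greedily, so a result of `None` for the same prompt would be consistent with the conversion preserving the model's outputs.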
More Info
The model was converted using the Python script available at this repository.