MobileLLM-350M-LS: Unused and Uninitialized Weights, Repeated Output During Generation
#2 · opened by Boxp
Description
When using MobileLLM-350M-LS with the transformers library, I encountered the following issues:
Warnings:
- Unused weights: lm_head.weight
- Reinitialized weights: model.embed_tokens.weight
Output Issues:
- Generated text contains excessive repetition. For example:
Hello! Who are you? Below are some of the questions we ask new members. What is your name? What is your birthday? What is your birthday? ...
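For context on the repetition itself: greedy decoding (the default when no sampling flags are passed to generate) often falls into loops with small base models. One common mitigation is a repetition penalty, which transformers exposes as the repetition_penalty argument to generate. As a minimal sketch of the underlying mechanism (the function name here is hypothetical, not a transformers API): logits of tokens that already appeared are down-weighted before the next token is chosen.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Down-weight tokens that were already generated.

    Positive logits are divided by the penalty, negative logits are
    multiplied by it, so a seen token always becomes less likely.
    """
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] = out[tok] / penalty
        else:
            out[tok] = out[tok] * penalty
    return out

# With penalty=2.0, a seen token's positive logit 2.0 drops to 1.0,
# and a seen token's negative logit -1.0 drops further to -2.0,
# while unseen tokens are untouched.
print(apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1], penalty=2.0))
```

A penalty of 1.0 is a no-op; values around 1.1–1.3 are typical starting points.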
Environment
- Transformers version: 4.44.0
- Python version: 3.11
Code
from transformers import AutoTokenizer, AutoModelForCausalLM

def infer_mobilellm():
    model_dir = "/nfs/300-MT-Pro/model/huggingface/MobileLLM-350M-LS"
    tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)
    input_text = "Hello! Who are you?"
    inputs = tokenizer(input_text, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

infer_mobilellm()
Questions
What is the correct way to initialize and perform inference with MobileLLM-350M-LS?
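Not an authoritative answer, but a sketch of what I would try first, under two assumptions: (1) the lm_head.weight / embed_tokens warnings often indicate a checkpoint-vs-config mismatch (for example around weight tying), so it is worth checking the model card's recommended transformers version and loading instructions; (2) independent of the warnings, greedy decoding with max_length alone frequently produces loops, and bounded sampling with a repetition penalty usually helps. The function names below (build_generation_kwargs, infer) are illustrative, not part of any library.

```python
def build_generation_kwargs():
    """Sampling settings that commonly reduce repetition loops.

    These are starting-point values, not tuned for MobileLLM specifically.
    """
    return dict(
        max_new_tokens=100,      # bound new tokens instead of total length
        do_sample=True,          # sample instead of pure greedy decoding
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.3,  # down-weight already-generated tokens
    )

def infer(model_dir: str, prompt: str) -> str:
    """Load the model from a local path and generate a completion."""
    # Imported inside the function so the sketch can be read without
    # transformers installed; requires a local MobileLLM checkpoint.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, **build_generation_kwargs())
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Note also that MobileLLM-350M-LS is a base model, not instruction-tuned, so "Hello! Who are you?" is treated as text to continue rather than a question to answer; completion-style prompts tend to behave better.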