Adapter Model in K2
Quick Start
After deploying the model k2_fp_delta, we need to use the PEFT
API from transformers
to load the entire K2!
base_model = /path/to/geollama
lora_weights = /path/to/adapter/model
tokenizer = LlamaTokenizer.from_pretrained(base_model)
model = LlamaForCausalLM.from_pretrained(
base_model,
load_in_8bit=load_8bit,
device_map=device_map
torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(
model,
lora_weights,
torch_dtype=torch.float16,
device_map=device_map,
)
model.config.pad_token_id = tokenizer.pad_token_id = 0
model.config.bos_token_id = 1
model.config.eos_token_id = 2