
Uploaded model

  • Developed by: jingwang
  • License: apache-2.0
  • Finetuned from model: unsloth/mistral-7b-v0.3-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Install dependencies in Google Colab

!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
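
After the install cells finish, it can help to confirm that the packages are importable before loading the model. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the list of import names is an assumption based on the pip commands above.

```python
import importlib.util
from typing import List


def missing_packages(names: List[str]) -> List[str]:
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed to match the packages installed above.
required = ["unsloth", "xformers", "trl", "peft", "accelerate", "bitsandbytes", "torch"]
missing = missing_packages(required)
print("Missing packages:", missing if missing else "none")
```

If anything is reported missing, rerunning the install cells (and restarting the Colab runtime) is usually the first thing to try.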

Inference


from unsloth import FastLanguageModel
from typing import Any, Dict
import json
import torch

class FormatPrompt_QA_with_citation():
    '''format prompt class'''
    def __init__(self, eos_token:str='</s>') -> None:
        self.inputs = ['context','question'] # required input fields
        self.outputs = ['answer', 'citation'] #  for training, and model inference output fields
        self.eos_token = eos_token

    def __call__(self, instance: Dict[str, Any]) -> str:
        '''
        function call operator 
        Args:
            instance: dictionary with keys 'context' and 'question' (plus optional 'answer' and 'citation')
        Returns:
            prompt: formatted prompt
        '''
        return self.formatting_prompt_func(instance)
    
    def formatting_prompt_func(self, instance: dict) -> str:
        '''format prompt for domain-specific QA
        note: this format is for fine-tuning a pre-trained model;
        if starting from an instruct-tuned model, use `tokenizer.apply_chat_template(messages)` instead
        '''

        # the assertion message must be a string; the original passed logging.info(...), which returns None
        assert all(item in instance for item in self.inputs), f"instance must have {self.inputs}!"
        
        prompt = f"""<s> [INST] Context: {str(instance["context"])}\
        Question: {str(instance["question"])} 
        Answer: [/INST]"""

        if ('answer' in instance):
            if ('citation' in instance):
                answer = {"answer":str(instance['answer']), "citation":str(instance['citation'])}
            else:
                answer = {"answer":str(instance['answer']), "citation":""}
            prompt += json.dumps(answer, ensure_ascii=False) + self.eos_token # json format
        else:
            pass
        return prompt

formatting_func = FormatPrompt_QA_with_citation()

# pull model from huggingface
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "jingwang/mistral_qa_citation",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)


# inference
FastLanguageModel.for_inference(model)

example = {'context': 'John Gadsby Chapman , The Baptism of Pocahontas (1840). A copy is on display in the Rotunda of the United States Capitol . During her stay at Henricus, Pocahontas met John Rolfe. Rolfe\'s English-born wife Sarah Hacker and child Bermuda had died on the way to Virginia after the wreck of the ship Sea Venture on the Summer Isles, now known as Bermuda. He established the Virginia plantation Varina Farms , where he cultivated a new strain of tobacco . Rolfe was a pious man and agonized over the potential moral repercussions of marrying a heathen, though in fact Pocahontas had accepted the Christian faith and taken the baptismal name Rebecca. In a long letter to the governor requesting permission to wed her, he expressed his love for Pocahontas and his belief that he would be saving her soul. He wrote that he was: motivated not by the unbridled desire of carnal affection, but for the good of this plantation, for the honor of our country, for the Glory of God, for my own salvation... namely Pocahontas, to whom my hearty and best thoughts are, and have been a long time so entangled, and enthralled in so intricate a labyrinth that I was even a-wearied to unwind myself thereout. [41] The couple were married on April 5, 1614, by chaplain Richard Buck , probably at Jamestown. For two years they lived at Varina Farms, across the James River from Henricus. Their son Thomas was born in January 1615. [42] The marriage created a climate of peace between the Jamestown colonists and Powhatan\'s tribes; it endured for eight years as the "Peace of Pocahontas". [43] In 1615, Ralph Hamor wrote, "Since the wedding we have had friendly commerce and trade not only with Powhatan but also with his subjects round about us." [44] The marriage was controversial in the British court at the time because "a commoner" had "the audacity" to marry a "princess." [45] [46]',
  'question': 'Who did Pocahontas marry?',
  #'answer': 'Pocahontas married John Rolfe',
  #'citation': 'The couple were married on April 5, 1614, by chaplain Richard Buck , probably at Jamestown.'
}



inputs = tokenizer([formatting_func(example)],  return_tensors="pt", padding=False).to(model.device)
input_length = inputs.input_ids.shape[-1]

with torch.no_grad():
  output = model.generate(**inputs,
                          do_sample=False, # greedy decoding, so no temperature is needed
                          max_new_tokens=1024,
                          pad_token_id=tokenizer.eos_token_id,
                          use_cache=False,
                          )
  response = tokenizer.decode(
                  output[0][input_length:], # response only, remove the prompt
                  skip_special_tokens=True,
                  )
  print(response)
>> {"answer": "Pocahontas married John Rolfe", "citation": "In a long letter to the governor requesting permission to wed her, he expressed his love for Pocahontas and his belief that he would be saving her soul. He wrote that he was: motivated not by the unbridled desire of carnal affection, but for the good of this plantation, for the honor of our country, for the Glory of God, for my own salvation... namely Pocahontas"}
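
Because the response is a JSON string with `answer` and `citation` fields, it can be parsed and the citation checked against the context. The following is a minimal sketch, assuming the response is a single JSON object; `parse_response` and `citation_in_context` are hypothetical helper names and are not part of this model repository.

```python
import json
from typing import Dict


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so minor spacing differences do not matter."""
    return " ".join(text.split())


def parse_response(response: str) -> Dict[str, str]:
    """Parse the model's JSON output; fall back to the raw text if it is not a JSON object."""
    fallback = {"answer": response.strip(), "citation": ""}
    try:
        parsed = json.loads(response.strip())
    except json.JSONDecodeError:
        return fallback
    if not isinstance(parsed, dict):
        return fallback
    return {
        "answer": str(parsed.get("answer", "")),
        "citation": str(parsed.get("citation", "")),
    }


def citation_in_context(citation: str, context: str) -> bool:
    """Return True if a non-empty citation appears verbatim in the context."""
    citation = _normalize(citation)
    return bool(citation) and citation in _normalize(context)


# Illustrative values only; in practice use `response` and `example["context"]` from above.
sample_context = "The couple were married on April 5, 1614, by chaplain Richard Buck."
sample_response = '{"answer": "John Rolfe", "citation": "married on April 5, 1614"}'
result = parse_response(sample_response)
print(result["answer"], citation_in_context(result["citation"], sample_context))
```

The substring check only confirms that the citation was copied from the context; it does not verify that the cited passage actually supports the answer.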