sooolee committed on
Commit e830b01
1 Parent(s): 8d1242c

Update README.md

Files changed (1)
  1. README.md +32 -1
README.md CHANGED
@@ -37,7 +37,8 @@ The base model [braindao/flan-t5-cnn](https://huggingface.co/braindao/flan-t5-cn
  ## Model description
 
  * This model further finetuned [braindao/flan-t5-cnn](https://huggingface.co/braindao/flan-t5-cnn) on the more conversational samsum dataset.
- * Huggingface [PEFT Library](https://github.com/huggingface/peft) LoRA (r = 16) was used to further reduced the model size. Only 1.7M parameters were trained (0.71% of original flan-t5-base 250M parameters).
+ * Huggingface [PEFT Library](https://github.com/huggingface/peft) LoRA (r = 16) and bitsandbytes int-8 were used to speed up training and reduce the model size.
+ * Only 1.7M parameters were trained (0.71% of original flan-t5-base 250M parameters).
  * The model checkpoint is just 7MB.
 
  ## Intended uses & limitations
@@ -61,6 +62,36 @@ The following hyperparameters were used during training:
  - rougeL: 37.300937%
  - rougeLsum: 37.271341%
 
+ ### How to use
+
+ ```python
+ import torch
+ from peft import PeftModel, PeftConfig
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ # Load the PEFT config, which records the base model the adapter was trained on
+ peft_model_id = "sooolee/flan-t5-base-cnn-samsum-lora"
+ config = PeftConfig.from_pretrained(peft_model_id)
+
+ # Load the base model in 8-bit and its tokenizer
+ model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, load_in_8bit=True, device_map='auto')
+ tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
+
+ # Load the LoRA adapter weights on top of the base model
+ model = PeftModel.from_pretrained(model, peft_model_id, device_map='auto')
+
+ # Tokenize the text inputs
+ texts = "<e.g. Part of YouTube Transcript>"
+ inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ with torch.no_grad():
+     output = model.generate(input_ids=inputs["input_ids"].to(device), max_new_tokens=60, do_sample=True, top_p=0.9)
+     summary = tokenizer.batch_decode(output.detach().cpu().numpy(), skip_special_tokens=True)
+
+ summary
+ ```
+
  ### Framework versions
 
  - Transformers 4.27.2
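
The parameter figures quoted in the model description (1.7M trained parameters, 0.71%, a 7MB checkpoint) can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. It assumes that LoRA was applied only to the query and value projections of every attention block, and that flan-t5-base has a hidden size of 768 with 12 encoder and 12 decoder layers; none of these values are stated in the README, so treat them as assumptions. The 250M base size is the rounded figure from the README.

```python
# Assumed architecture values for flan-t5-base (not stated in the README).
D_MODEL = 768
ENCODER_LAYERS = 12
DECODER_LAYERS = 12

LORA_RANK = 16               # r = 16, from the README
BASE_PARAMS = 250_000_000    # "250M", rounded figure from the README

# Encoder layers contain self-attention only; decoder layers contain
# self-attention plus cross-attention, so they count twice.
attention_blocks = ENCODER_LAYERS + 2 * DECODER_LAYERS

# Assumption: LoRA wraps only the q and v projections in each block.
adapted_matrices = 2 * attention_blocks

# Each adapted d x d weight gains two low-rank factors: A (r x d) and B (d x r).
params_per_matrix = LORA_RANK * (D_MODEL + D_MODEL)
lora_params = adapted_matrices * params_per_matrix

trainable_pct = 100 * lora_params / BASE_PARAMS
checkpoint_mb = lora_params * 4 / 1e6   # 4 bytes per float32 weight

print(f"LoRA parameters: {lora_params:,}")
print(f"Share of base model: {trainable_pct:.2f}%")
print(f"Adapter size in float32: {checkpoint_mb:.1f} MB")
```

Under these assumptions the script reports about 1.77M adapter parameters, roughly 0.71% of the base model and about 7 MB in float32, which is in line with the numbers in the model description.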