---
license: mit
datasets:
- Anthropic/hh-rlhf
- kunishou/hh-rlhf-49k-ja
language:
- ja
library_name: transformers
pipeline_tag: text-generation
---
This repository contains the LoRA weight delta obtained by fine-tuning [cyberagent/open-calm-7b](https://huggingface.co/cyberagent/open-calm-7b) on [kunishou/hh-rlhf-49k-ja](https://huggingface.co/datasets/kunishou/hh-rlhf-49k-ja) using peft (more precisely, a modified version of [tloen/alpaca-lora](https://github.com/tloen/alpaca-lora)).
The training hyperparameters were left at alpaca-lora's defaults.
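Since the hyperparameters were left at alpaca-lora's defaults, the LoRA configuration was presumably close to the sketch below. This is an illustration, not the recorded config: `r`, `lora_alpha`, and `lora_dropout` are alpaca-lora's stock values, and `query_key_value` is assumed as the target module only because open-calm-7b is a GPT-NeoX model; the authoritative values are in this adapter's `adapter_config.json`.

```
# Presumed training-time LoRA setup; a sketch, not the recorded config.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                                 # alpaca-lora default
    lora_alpha=16,                       # alpaca-lora default
    lora_dropout=0.05,                   # alpaca-lora default
    target_modules=["query_key_value"],  # assumed: GPT-NeoX attention projection
    bias="none",
    task_type="CAUSAL_LM",
)
```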
```
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

LOAD_8BIT = False
BASE_MODEL = "cyberagent/open-calm-7b"
LORA_WEIGHTS = "nakayama/lora-hh-rlhf-49k-ja-for-open-calm-7b"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Load the base model, then attach the LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    load_in_8bit=LOAD_8BIT,  # 8-bit loading requires bitsandbytes
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(
    model,
    LORA_WEIGHTS,
    torch_dtype=torch.float16,
    adapter_name=LORA_WEIGHTS,
)

# Build an alpaca-lora style prompt. The Japanese templates below are the
# ones the adapter was trained with, so they are kept verbatim.
def generate_prompt(instruction, input=None):
    if input:
        # "Below is an instruction that describes a task, paired with an
        # input that provides further context. Write a response that
        # appropriately completes the request."
        return f"""以下は、タスクを説明する命令と、さらなるコンテキストを提供する入力の組み合わせです。要求を適切に満たすような応答を書きなさい。

### Instruction:
{instruction}

### Input:
{input}

### Response:"""
    else:
        # "Below is an instruction that describes a task. Write a response
        # that appropriately completes the request."
        return f"""以下は、ある作業を記述した指示です。依頼を適切に完了させる回答を書きなさい。

### Instruction:
{instruction}

### Response:"""

if not LOAD_8BIT:
    model.half()
model.eval()

instruction = "次の日本の観光地について説明してください。"  # "Explain the following Japanese tourist spot."
input = "富士山"  # "Mt. Fuji"

prompt = generate_prompt(instruction, input)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    generation_output = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.1,
        top_p=0.75,
        top_k=20,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=128,
        repetition_penalty=1.5,
        no_repeat_ngram_size=5,
        pad_token_id=tokenizer.pad_token_id,
    )
s = generation_output.sequences[0]
output = tokenizer.decode(s)
# Print only the model's answer, i.e. everything after "### Response:".
print(output.split("### Response:")[1].strip())
```
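If you would rather deploy a standalone model than a base-plus-adapter pair, peft can fold the delta into the base weights. A minimal sketch, not part of the original card (the output path is hypothetical):

```
# Merge the LoRA delta into the base weights and save a plain transformers
# checkpoint; merging requires full/half precision, not 8-bit loading.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "cyberagent/open-calm-7b",
    torch_dtype=torch.float16,
    device_map="auto",
)
merged = PeftModel.from_pretrained(
    base,
    "nakayama/lora-hh-rlhf-49k-ja-for-open-calm-7b",
    torch_dtype=torch.float16,
).merge_and_unload()
merged.save_pretrained("./open-calm-7b-hh-rlhf-ja-merged")  # hypothetical path
```

The merged model can then be loaded with `AutoModelForCausalLM.from_pretrained` alone, with no peft dependency at inference time.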