---
license: other
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- decapoda-research-13b-hf
- prompt answering
- peft
---
## Model Card for Sandiago21/llama-13b-hf-prompt-answering

This repository contains a LLaMA-13B model further fine-tuned on conversations and question-answering prompts.

⚠️ **I used [LLaMA-13B-hf](https://huggingface.co/decapoda-research/llama-13b-hf) as the base model, so this model is for research purposes only (see the [license](https://huggingface.co/decapoda-research/llama-13b-hf/blob/main/LICENSE)).**
## Model Details
### Model Description
The decapoda-research/llama-13b-hf model was fine-tuned on conversations and question-answering prompts.

- **Developed by:** [More Information Needed]
- **Shared by:** [More Information Needed]
- **Model type:** Causal LM
- **Language(s) (NLP):** English, multilingual
- **License:** Research use only (inherited from the base LLaMA license)
- **Finetuned from model:** decapoda-research/llama-13b-hf
## Model Sources

- **Repository:** [More Information Needed]
- **Paper:** [More Information Needed]
- **Demo:** [More Information Needed]
## Uses
The model is intended for prompt answering.

### Direct Use

The model can be used directly for prompt answering.

### Downstream Use

Generating text and answering prompts.
## Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
## Usage

### Creating the prompt

The model was trained on prompts of the following form:
```python
def generate_prompt(instruction: str, input_ctxt: str = None) -> str:
    if input_ctxt:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_ctxt}

### Response:"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""
```
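For example, calling the helper with an instruction and an optional context renders the full template (the values below are illustrative):

```python
# Hypothetical example values, just to show the rendered template.
prompt = generate_prompt(
    "Summarize the following text.",
    "LLaMA is a family of large language models released by Meta AI.",
)
print(prompt)
```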
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("Sandiago21/llama-13b-hf-prompt-answering")

# Load the model in 8-bit quantization (requires the bitsandbytes package)
# and let accelerate place the weights across the available devices.
model = LlamaForCausalLM.from_pretrained(
    "Sandiago21/llama-13b-hf-prompt-answering",
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

generation_config = GenerationConfig(
    temperature=0.2,
    top_p=0.75,
    top_k=40,
    num_beams=4,
    max_new_tokens=128,
)

model.eval()
if torch.__version__ >= "2":
    model = torch.compile(model)  # optional speed-up on PyTorch 2.x
```
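If bitsandbytes is not installed, the model can also be loaded without 8-bit quantization. A minimal sketch, assuming a GPU with roughly 26 GB of free memory for the 13B weights in half precision:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("Sandiago21/llama-13b-hf-prompt-answering")
model = LlamaForCausalLM.from_pretrained(
    "Sandiago21/llama-13b-hf-prompt-answering",
    torch_dtype=torch.float16,  # fp16 halves memory vs fp32; ~26 GB for 13B parameters
    device_map="auto",          # requires the accelerate package
)
model.eval()
```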
### Example of Usage
```python
instruction = "What is the capital city of Greece and with which countries does Greece border?"
input_ctxt = None  # For some tasks, you can provide an input context to help the model generate a better response.

prompt = generate_prompt(instruction, input_ctxt)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
input_ids = input_ids.to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
    )

response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
print(response)
# >>> The capital city of Greece is Athens and it borders Albania, Macedonia, Bulgaria and Turkey.
```
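Note that `outputs.sequences[0]` contains the prompt followed by the completion, so the decoded `response` above includes the prompt text. To keep only the newly generated part, slice off the prompt tokens first:

```python
# Decode only the tokens generated after the prompt.
generated = outputs.sequences[0][input_ids.shape[-1]:]
answer = tokenizer.decode(generated, skip_special_tokens=True)
print(answer)
```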
## Training Details
### Training Data
The decapoda-research/llama-13b-hf model was fine-tuned on conversations and question-answering data.

### Training Procedure

The decapoda-research/llama-13b-hf model was further trained and fine-tuned on question-answering and prompt data for 1 epoch (approximately 10 hours of training on a single GPU).
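The exact training setup is not documented in this card. For orientation, here is a minimal sketch of what a PEFT/LoRA fine-tuning run of this kind typically looks like; all hyperparameters and the toy dataset below are illustrative assumptions, not the actual configuration:

```python
import torch
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    DataCollatorForLanguageModeling,
    LlamaForCausalLM,
    LlamaTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-13b-hf")
tokenizer.pad_token = tokenizer.eos_token  # LLaMA ships without a pad token

base_model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-13b-hf",
    device_map="auto",  # in practice the base model is often loaded in 8-bit to fit one GPU
)

# Illustrative LoRA settings; the values actually used for this model are not documented.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Tiny stand-in dataset built with generate_prompt() from above;
# the real training data was conversations and question-answering prompts.
texts = [generate_prompt("What is the capital city of Greece?") + " Athens."]
train_dataset = Dataset.from_dict({"text": texts}).map(
    lambda example: tokenizer(example["text"], truncation=True, max_length=512)
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="./llama-13b-hf-prompt-answering",
        num_train_epochs=1,  # the card reports 1 epoch (~10 hours on a single GPU)
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        learning_rate=3e-4,
        fp16=True,
        logging_steps=10,
    ),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```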
## Model Architecture and Objective
The model is based on decapoda-research/llama-13b-hf, with PEFT adapters fine-tuned on top of the base model on conversations and question-answering data.
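If the adapter weights are available separately from the merged checkpoint (an assumption; the loading example above treats this repository as a full model), they could be attached to the base model with peft along these lines:

```python
from peft import PeftModel
from transformers import LlamaForCausalLM

# Hypothetical adapter loading; assumes this repo (or a companion repo) ships PEFT adapter files.
base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-13b-hf",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Sandiago21/llama-13b-hf-prompt-answering")
model.eval()
```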