	---
	library_name: peft
	base_model: mistralai/Mistral-7B-v0.1
	language:
	- en
	pipeline_tag: text-generation
	widget:
	- text: "How many helicopters can a human eat in one sitting?"
	tags:
	- Δ
	- LoRA
	---

	<!--
	# Model Card for Model ID
	-->

	## Model Details

	<!--![image/png](https://cdn-uploads.huggingface.co/production/uploads/648b0f4fd8fe693f51de98d2/aerBANxBtCya732NdBiw0.png)-->
	$$
	W_{mistral} + LoRA_{zephyr} = W_{zephyr} \\
	W_{zephyr} - LoRA_{zephyr} = W_{mistral}
	$$
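
The identity above can be illustrated with a toy example. This is a minimal sketch, not the procedure used to build this adapter: it assumes the usual LoRA parameterization, where the update for one weight matrix is `(alpha / r) * B @ A`, and it uses small hand-picked matrices in plain Python lists. The names `lora_delta`, `w_base`, `lora_B`, and `lora_A` are hypothetical and chosen only for illustration.

```python
# Toy illustration of W_base + delta = W_tuned and W_tuned - delta = W_base.
# Real adapters store one (A, B) pair per adapted layer as tensors.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b, sign=1.0):
    """Element-wise a + sign * b for two equally shaped matrices."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def lora_delta(lora_B, lora_A, alpha, r):
    """Low-rank update (alpha / r) * B @ A, assuming standard LoRA scaling."""
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(lora_B, lora_A)]


# Hypothetical 3x3 base weight and a rank-1 adapter (B is 3x1, A is 1x3).
w_base = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [3.0, 0.0, 1.0]]
lora_B = [[1.0], [0.0], [2.0]]
lora_A = [[0.5, 0.0, -1.0]]

delta = lora_delta(lora_B, lora_A, alpha=2, r=1)
w_tuned = add(w_base, delta)                  # base + adapter
w_recovered = add(w_tuned, delta, sign=-1.0)  # tuned - adapter

print(w_tuned)
print(w_recovered == w_base)
```

Because the adapter is low rank, the first equation holds exactly only when the difference between the two checkpoints is itself low rank; otherwise it is an approximation. In PEFT, folding an adapter into the base weights is what `merge_and_unload()` does on a loaded LoRA model.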

	<!--
	$$ W_{mistral} + LoRA_{zephyr} = W_{zephyr} $$
	```
	typeof/zephyr-7b-beta-lora + mistralai/Mistral-7B-v0.1
	= HuggingFaceH4/zephyr-7b-beta
	````

	### Model Description

	- Developed by: [More Information Needed]
	- Funded by [optional]: [More Information Needed]
	- Shared by [optional]: [More Information Needed]
	- Model type: [More Information Needed]
	- Language(s) (NLP): [More Information Needed]
	- License: [More Information Needed]
	- Finetuned from model [optional]: [More Information Needed]


	### Model Sources [optional]

	- Repository: [More Information Needed]
	- Paper [optional]: [More Information Needed]
	- Demo [optional]: [More Information Needed]

	## Uses

	### Direct Use

	[More Information Needed]

	### Downstream Use [optional]

	[More Information Needed]

	### Out-of-Scope Use

	[More Information Needed]

	## Bias, Risks, and Limitations

	[More Information Needed]

	### Recommendations

	Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
	-->

	### Model Sources
	[HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)

	## How to Get Started with the Model

	Use the code below to get started with the model.

	```python
	# pip install transformers peft

	import torch
	from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer

	model_id = "mistralai/Mistral-7B-v0.1"
	peft_model_id = "typeof/zephyr-7b-beta-lora"

	model = AutoModelForCausalLM.from_pretrained(model_id)
	model.load_adapter(peft_model_id)

	tokenizer_id = "HuggingFaceH4/zephyr-7b-beta" # for chat template etc...
	tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)

	pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

	messages = [
	    {
	        "role": "system",
	        "content": "You are a friendly chatbot who always responds in the style of a pirate",
	    },
	    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
	]
	prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
	print(outputs[0]["generated_text"])
	```
	```
	<|system|>
	You are a friendly chatbot who always responds in the style of a pirate</s>
	<|user|>
	How many helicopters can a human eat in one sitting?</s>
	<|assistant|>
	Well, me matey, that’s a good question indeed! I’ve never seen
	a human eat a helicopter, and I don’t think many others have
	either. However, I’ve heard rumors that some people have
	eaten entire airplanes, so I suppose it’s not entirely unheard
	of.

	As for the number of helicopters one could eat, that depends
	on the size and weight of the helicopter. A small, lightweight
	helicopter would be easier to eat than a large, heavy one.
	In fact, I’ve heard that some people have eaten entire helicopters
	as part of a dare or a challenge.

	So, my advice to you, me hearty, is to steer clear of helicopters
	and stick to more traditional fare. Yarr!</s>
	```
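
The prompt layout visible in the output above (a `<|role|>` line, the message content, then `</s>`) comes from the chat template shipped with the Zephyr tokenizer. The sketch below approximates that layout in plain Python, to show what `apply_chat_template` produces. The helper `format_zephyr_prompt` is hypothetical, and the exact whitespace of the real template may differ, so prefer `tokenizer.apply_chat_template` in practice.

```python
EOS_TOKEN = "</s>"  # end-of-sequence marker, as it appears in the sample output


def format_zephyr_prompt(messages, add_generation_prompt=True):
    """Approximate the Zephyr chat layout: role tag, newline, content, EOS, newline."""
    parts = [f"<|{m['role']}|>\n{m['content']}{EOS_TOKEN}\n" for m in messages]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

prompt = format_zephyr_prompt(messages)
print(prompt)
```

This is why the tokenizer is loaded from `HuggingFaceH4/zephyr-7b-beta` rather than from the base model: the base Mistral tokenizer does not carry this chat template.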
	<!--

	## Training Details

	### Training Data


	[More Information Needed]

	### Training Procedure


	#### Preprocessing [optional]

	[More Information Needed]


	#### Training Hyperparameters

	#### Speeds, Sizes, Times [optional]


	[More Information Needed]

	## Evaluation


	### Testing Data, Factors & Metrics

	#### Testing Data


	[More Information Needed]

	#### Factors


	[More Information Needed]

	#### Metrics


	[More Information Needed]

	### Results

	[More Information Needed]

	#### Summary

	## Model Examination [optional]

	[More Information Needed]

	## Technical Specifications [optional]

	### Model Architecture and Objective

	[More Information Needed]

	### Compute Infrastructure

	[More Information Needed]

	#### Hardware

	[More Information Needed]

	#### Software

	[More Information Needed]

	## Citation [optional]

	BibTeX:

	[More Information Needed]

	APA:

	[More Information Needed]

	## Glossary [optional]

	[More Information Needed]

	## More Information

	[More Information Needed]

	## Model Card Authors [optional]

	[More Information Needed]

	## Model Card Contact

	[More Information Needed]

	## Training procedure

	The following `bitsandbytes` quantization config was used during training:
	- quant_method: bitsandbytes
	- load_in_4bit: True
	- bnb_4bit_quant_type: nf4
	- bnb_4bit_use_double_quant: True

	### Framework versions

	- PEFT 0.6.3.dev0

	-->
	#### Summary

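	[Zephyr-7B-β](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), trained with [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290). See the [Zephyr-7B technical report](https://arxiv.org/abs/2310.16944) for details.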

	- [LoRA](https://arxiv.org/abs/2106.09685)
	- [QLoRA](https://arxiv.org/abs/2305.14314)