
Join the Coffee & AI Discord for AI stuff and things!

Get the base model here:

Base model quantizations by TheBloke:

  • https://huggingface.co/TheBloke/Llama-2-13B-GGML
  • https://huggingface.co/TheBloke/Llama-2-13B-GPTQ

Prompting for this model:

A brief warning: no alignment, sanitization, or other filtering has been applied to the dataset or the outputs. This is a completely raw model and may behave unpredictably or create scenarios that are unpleasant.

The base Llama2 is a text completion model. That means it will continue writing the story in whatever direction you point it. This is not an instruct-tuned model, so don't try to give it instructions.

Correct prompting:

He grabbed his sword, his gleaming armor, he readied himself. The battle was coming, he walked into the dawn light and

Incorrect prompting:

Write a story about...

This model has been trained to generate as much text as possible, so you should use some mechanism to force it to stop after N tokens. For example, with one prompt I average about 7,000 output tokens. Make sure you have a maximum sequence length set or it will just keep going forever.
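As a sketch, this is what completion-style prompting with a hard output cap looks like using the `transformers` library. The repo id below is a placeholder, since the card does not state one; substitute the actual repo id for this model:

```python
# Completion-style prompting with a hard token cap.
# NOTE: "your-repo/storywriting-llama2-13b" is a placeholder repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-repo/storywriting-llama2-13b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# A text-completion prompt: mid-sentence, no instructions.
prompt = (
    "He grabbed his sword, his gleaming armor, he readied himself. "
    "The battle was coming, he walked into the dawn light and"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# max_new_tokens is the stop mechanism -- without a cap this model tends
# to keep writing for thousands of tokens.
output = model.generate(
    **inputs, max_new_tokens=512, do_sample=True, temperature=0.8
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The key line is `max_new_tokens=512`: generation stops after at most 512 new tokens regardless of whether the story has reached a natural ending.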

Training procedure

PEFT:

The following bitsandbytes quantization config was used during training:

  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32

This ran for 3,500 steps -- 3 epochs over an in-testing storywriting dataset. Training took 14 hours on a 3090 Ti.
