---
license: llama3
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
---

# Llama-3-Alpha-Centauri-v0.1-LoRA

---

## Disclaimer

**Note:** All models and LoRAs from the **Centaurus** series were created with the sole purpose of research. The usage of this model and/or its related LoRA implies agreement with the following terms:

- The user is responsible for what they might do with it, including how the output of the model is interpreted and used;
- The user should not use the model and its outputs for any illegal purposes;
- The user is solely responsible for any misuse or negative consequences from using this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.

---

## Base

This model and its related LoRA were fine-tuned on [https://huggingface.co/failspy/Meta-Llama-3-8B-Instruct-abliterated-v3](https://huggingface.co/failspy/Meta-Llama-3-8B-Instruct-abliterated-v3).

## Datasets

- [https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)

## Fine Tuning

### - Quantization Configuration

- load_in_4bit=True
- bnb_4bit_quant_type="fp4"
- bnb_4bit_compute_dtype=compute_dtype
- bnb_4bit_use_double_quant=False
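
The values above map onto the `BitsAndBytesConfig` class from `transformers`. The following is only a minimal sketch of how they could be wired together, not the exact script used for this model. The value of `compute_dtype` is not stated in this card, so it stays a parameter; the names `QUANT_KWARGS` and `build_quant_config` are hypothetical helpers.

```python
# Values copied from the list above. `compute_dtype` is NOT specified in this
# card, so it is left as a parameter the caller has to supply (a torch dtype).
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
}


def build_quant_config(compute_dtype):
    """Build a BitsAndBytesConfig from the listed values (needs `transformers`)."""
    # Imported lazily so the plain dict above can be inspected without the library.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(bnb_4bit_compute_dtype=compute_dtype, **QUANT_KWARGS)
```

The resulting object would typically be passed as `quantization_config` when loading the base model with `AutoModelForCausalLM.from_pretrained`.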

### - PEFT Parameters

- lora_alpha=64
- lora_dropout=0.05
- r=128
- bias="none"
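
For reference, in standard LoRA (without rank-stabilized scaling) the low-rank update is multiplied by `lora_alpha / r`, so these values give a scaling factor of 0.5. The sketch below shows that arithmetic and how the parameters could be passed to `peft.LoraConfig`. The `task_type` value is an assumption, the target modules are not listed in this card and are therefore left out, and `LORA_KWARGS`, `lora_scaling` and `build_lora_config` are hypothetical names.

```python
# Values copied from the list above.
LORA_KWARGS = {"lora_alpha": 64, "lora_dropout": 0.05, "r": 128, "bias": "none"}


def lora_scaling(lora_alpha, r):
    """Scaling factor applied to the low-rank update in standard LoRA: alpha / r."""
    return lora_alpha / r


def build_lora_config(task_type="CAUSAL_LM"):
    """Build a peft LoraConfig (needs `peft`); task_type is an assumption."""
    # Imported lazily so the arithmetic above runs without the library installed.
    from peft import LoraConfig

    return LoraConfig(task_type=task_type, **LORA_KWARGS)


print(lora_scaling(LORA_KWARGS["lora_alpha"], LORA_KWARGS["r"]))  # 0.5
```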

### - Training Arguments

- num_train_epochs=1
- per_device_train_batch_size=1
- gradient_accumulation_steps=4
- optim="adamw_bnb_8bit"
- save_steps=25
- logging_steps=25
- learning_rate=2e-4
- weight_decay=0.001
- fp16=False
- bf16=False
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"
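
With a per-device batch size of 1 and 4 gradient accumulation steps, each optimizer step sees 4 sequences per device. The sketch below works that out and shows how the arguments could be handed to `transformers.TrainingArguments`. It assumes a single device (the card does not say how many were used), and `output_dir` as well as the names `TRAINING_KWARGS`, `effective_batch_size` and `build_training_args` are hypothetical.

```python
# Values copied from the list above.
TRAINING_KWARGS = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_bnb_8bit",
    "save_steps": 25,
    "logging_steps": 25,
    "learning_rate": 2e-4,
    "weight_decay": 0.001,
    "fp16": False,
    "bf16": False,
    "max_grad_norm": 0.3,
    "max_steps": -1,
    "warmup_ratio": 0.03,
    "group_by_length": True,
    "lr_scheduler_type": "constant",
}


def effective_batch_size(kwargs, num_devices=1):
    """Sequences consumed per optimizer step; num_devices=1 is an assumption."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir="./results"):
    """Build TrainingArguments (needs `transformers`); output_dir is hypothetical."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


batch = effective_batch_size(TRAINING_KWARGS)
print(batch)                                  # 4
print(batch * TRAINING_KWARGS["save_steps"])  # 100 sequences between checkpoints
```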

## Credits

- Meta ([https://huggingface.co/meta-llama](https://huggingface.co/meta-llama)): for the original Llama-3;
- HuggingFace: for hosting this model and for creating the fine-tuning tools;
- failspy ([https://huggingface.co/failspy](https://huggingface.co/failspy)): for the base model and the orthogonalization implementation;
- NobodyExistsOnTheInternet ([https://huggingface.co/NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet)): for the incredible dataset;
- Undi95 ([https://huggingface.co/Undi95](https://huggingface.co/Undi95)) and Sao10k ([https://huggingface.co/Sao10K](https://huggingface.co/Sao10K)): my main inspirations for doing these models =]

A huge thank you to all of them ☺️