Commit c8a1f76 by fearlessdots (parent: b090cc3): Update README.md

---
license: mit
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
---

# Alpha-Ophiuchi-mini-128k-v0.1

---

## Disclaimer

**Note:** All models and LoRAs in the **Ophiuchus** series were created solely for research purposes. Using this model and/or its related LoRA implies agreement with the following terms:

- The user is responsible for what they do with the model, including how its output is interpreted and used;
- The user must not use the model or its outputs for any illegal purposes;
- The user is solely responsible for any misuse of, or negative consequences arising from, this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.

---

## Ophiuchus Series

This series aims to develop highly uncensored Large Language Models (LLMs) with a focus on:

- Science, Technology, Engineering, and Mathematics (STEM)
- Computer Science (including programming)
- Social Sciences

It also targets several key cognitive skills, including but not limited to:

- Reasoning and logical deduction
- Critical thinking
- Analysis

While maintaining strong overall knowledge and expertise, the models are refined through:

- Fine-tuning
- Model merging techniques, including Mixture of Experts (MoE)

Please note that these models are experimental and may vary in effectiveness. Feedback, critiques, and questions aimed at improving them are most welcome.

## Base

This model and its related LoRA were fine-tuned from [https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3](https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3).

## LoRA

The LoRA merged with the base model is available at [https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA](https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA). A usage sketch follows.

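Below is a minimal sketch of one way to use these two artifacts together: load the base model with `transformers` and attach the LoRA adapter with `peft`. The repository IDs come from this card; the dtype, device placement, prompt, and generation settings are illustrative assumptions rather than documented settings.

```python
# Hedged sketch: attach the published LoRA adapter to the base model listed above.
# Repo IDs are taken from this card; everything else is an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "failspy/Phi-3-mini-128k-instruct-abliterated-v3"
lora_id = "fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,  # assumption: half precision for inference
    device_map="auto",
    trust_remote_code=True,     # Phi-3 128k variants may ship custom modeling code
)
model = PeftModel.from_pretrained(base, lora_id)

# Build a chat-style prompt with the tokenizer's chat template and generate.
messages = [{"role": "user", "content": "Explain LoRA fine-tuning in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

If this repository hosts the already-merged weights, loading it directly with `AutoModelForCausalLM.from_pretrained` (without `peft`) should also work.
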
## Datasets

- [https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)

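As a small aside, the dataset above can be pulled with the `datasets` library. The split and column names in this sketch are assumptions and should be checked against the dataset page.

```python
# Minimal sketch: download and inspect the training dataset.
from datasets import load_dataset

ds = load_dataset("NobodyExistsOnTheInternet/ToxicQAFinal")
print(ds)              # available splits and row counts
print(ds["train"][0])  # first row; assumes a "train" split exists
```
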
## Fine Tuning

### Quantization Configuration

- load_in_4bit=True
- bnb_4bit_quant_type="fp4"
- bnb_4bit_compute_dtype=compute_dtype
- bnb_4bit_use_double_quant=False

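These values map directly onto the `bitsandbytes` quantization config that `transformers` uses for 4-bit loading. The sketch below is an interpretation of the list above; `compute_dtype` is not specified on this card, so the value chosen here is an assumption.

```python
# Hedged sketch of the 4-bit quantization settings listed above as a
# transformers/bitsandbytes BitsAndBytesConfig.
import torch
from transformers import BitsAndBytesConfig

compute_dtype = torch.float16  # assumption: the card does not state the compute dtype used

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)
# bnb_config would then be passed as quantization_config= when loading the base model.
```
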
### PEFT Parameters

- lora_alpha=64
- lora_dropout=0.05
- r=128
- bias="none"

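The list above corresponds to a `peft` `LoraConfig`. The sketch below fills in the remaining fields with labeled assumptions; the card does not state `task_type` or `target_modules`.

```python
# Hedged sketch of the listed PEFT hyperparameters as a peft LoraConfig.
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=64,
    lora_dropout=0.05,
    r=128,
    bias="none",
    task_type="CAUSAL_LM",  # assumption: causal language modeling fine-tuning
    # target_modules is not listed on this card and is left at the library default here.
)
```
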
### Training Arguments

- num_train_epochs=1
- per_device_train_batch_size=1
- gradient_accumulation_steps=4
- optim="adamw_bnb_8bit"
- save_steps=25
- logging_steps=25
- learning_rate=2e-4
- weight_decay=0.001
- fp16=False
- bf16=False
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"

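For reference, the values above correspond to `transformers` `TrainingArguments`, roughly as they would be passed to an SFT-style trainer (for example `trl`'s `SFTTrainer`). Only `output_dir` below is an assumption; every other value is copied from the list.

```python
# Hedged sketch of the listed values as transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",  # assumption: not stated on the card
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    optim="adamw_bnb_8bit",
    save_steps=25,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
)
```
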
## Credits

- Microsoft ([https://huggingface.co/microsoft](https://huggingface.co/microsoft)): for the original Phi-3;
- Hugging Face: for hosting this model and for creating the fine-tuning tools used;
- failspy ([https://huggingface.co/failspy](https://huggingface.co/failspy)): for the base model and the orthogonalization implementation;
- NobodyExistsOnTheInternet ([https://huggingface.co/NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet)): for the incredible dataset.

A huge thank you to all of them ☺️