---
license: mit
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
---

# Alpha-Ophiuchi-mini-128k-v0.1

---

## Disclaimer

**Note:** All models and LoRAs in the **Ophiuchus** series were created solely for research purposes. Use of this model and/or its related LoRA implies agreement with the following terms:

- The user is responsible for what they do with the model, including how its output is interpreted and used;
- The user must not use the model or its outputs for any illegal purposes;
- The user is solely responsible for any misuse of, or negative consequences arising from, this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.

---

## Ophiuchus Series

This series aims to develop highly uncensored Large Language Models (LLMs) with the following focuses:

- Science, Technology, Engineering, and Mathematics (STEM)
- Computer Science (including programming)
- Social Sciences

And several key cognitive skills, including but not limited to:

- Reasoning and logical deduction
- Critical thinking
- Analysis

While maintaining strong overall knowledge and expertise, the models will undergo refinement through:

- Fine-tuning processes
- Model merging techniques, including Mixture of Experts (MoE)

Please note that these models are experimental and may demonstrate varied levels of effectiveness. Feedback, critiques, and questions are most welcome and will help improve them.

## Base

This model and its related LoRA were fine-tuned on [https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3](https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3).

## LoRA

The LoRA merged with the base model is available at [https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA](https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA).
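
For convenience, here is a minimal sketch of how the standalone LoRA could be applied to the base model with `transformers` and `peft`. This is an illustration under assumptions, not an official loading recipe: `trust_remote_code=True` and `device_map="auto"` (which requires `accelerate`) are assumptions, not something the card prescribes.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the abliterated Phi-3 base model the LoRA was trained on.
base = AutoModelForCausalLM.from_pretrained(
    "failspy/Phi-3-mini-128k-instruct-abliterated-v3",
    device_map="auto",       # assumption: requires the accelerate package
    trust_remote_code=True,  # assumption: Phi-3 repos may ship custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(
    "failspy/Phi-3-mini-128k-instruct-abliterated-v3"
)

# Apply the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(
    base, "fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA"
)
```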

## Datasets

- [https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)

## Fine-Tuning

### Quantization Configuration

- load_in_4bit=True
- bnb_4bit_quant_type="fp4"
- bnb_4bit_compute_dtype=compute_dtype
- bnb_4bit_use_double_quant=False
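
For reference, the values above map onto a `BitsAndBytesConfig` roughly as follows. This is a sketch rather than the exact training script; in particular, `compute_dtype` is not defined in the card, so the `torch.float16` below is only an assumption.

```python
import torch
from transformers import BitsAndBytesConfig

compute_dtype = torch.float16  # assumption: the card does not specify compute_dtype

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4-bit on load
    bnb_4bit_quant_type="fp4",             # 4-bit float quantization
    bnb_4bit_compute_dtype=compute_dtype,  # dtype used for matmul during forward passes
    bnb_4bit_use_double_quant=False,       # no nested (double) quantization
)
```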

### PEFT Parameters

- lora_alpha=64
- lora_dropout=0.05
- r=128
- bias="none"
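
The same parameters as a `peft` `LoraConfig`, again as a sketch: `task_type` and the target modules are not listed in the card, so those values are assumptions.

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=128,                  # LoRA rank
    lora_alpha=64,          # scaling factor
    lora_dropout=0.05,
    bias="none",            # do not train bias terms
    task_type="CAUSAL_LM",  # assumption: not stated in the card
)
```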

### Training Arguments

- num_train_epochs=1
- per_device_train_batch_size=1
- gradient_accumulation_steps=4
- optim="adamw_bnb_8bit"
- save_steps=25
- logging_steps=25
- learning_rate=2e-4
- weight_decay=0.001
- fp16=False
- bf16=False
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"
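
And the corresponding `TrainingArguments`, transcribed from the list above; `output_dir` is a placeholder, since the card does not give one.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # placeholder: not specified in the card
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,   # effective batch size of 4
    optim="adamw_bnb_8bit",          # 8-bit AdamW from bitsandbytes
    save_steps=25,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,                      # no fp16 mixed precision
    bf16=False,                      # no bf16 mixed precision
    max_grad_norm=0.3,               # gradient clipping
    max_steps=-1,                    # -1 means: train for num_train_epochs
    warmup_ratio=0.03,
    group_by_length=True,            # bucket samples of similar length
    lr_scheduler_type="constant",
)
```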

## Credits

- Microsoft ([https://huggingface.co/microsoft](https://huggingface.co/microsoft)): for the original Phi-3;
- HuggingFace: for hosting this model and for creating the fine-tuning tools used;
- failspy ([https://huggingface.co/failspy](https://huggingface.co/failspy)): for the base model and the orthogonalization implementation;
- NobodyExistsOnTheInternet ([https://huggingface.co/NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet)): for the incredible dataset;

A huge thank you to all of them ☺️