|
--- |
|
license: mit |
|
datasets: |
|
- NobodyExistsOnTheInternet/ToxicQAFinal |
|
--- |
|
|
|
# Alpha-Ophiuchi-mini-128k-v0.1 |
|
|
|
--- |
|
|
|
## Disclaimer |
|
|
|
**Note:** All models and LoRAs from the **Ophiuchus** series were created solely for research purposes. Use of this model and/or its related LoRA implies agreement with the following terms:
|
|
|
- The user is responsible for anything they do with it, including how the model's output is interpreted and used;
|
- The user should not use the model and its outputs for any illegal purposes; |
|
- The user is solely responsible for any misuse of, or negative consequences arising from, this model and/or its related LoRA.
|
|
|
I do not endorse any particular perspectives presented in the training data. |
|
|
|
--- |
|
|
|
## Ophiuchus Series |
|
|
|
This series aims to develop highly uncensored Large Language Models (LLMs) with the following focuses: |
|
|
|
- Science, Technology, Engineering, and Mathematics (STEM) |
|
- Computer Science (including programming) |
|
- Social Sciences |
|
|
|
And several key cognitive skills, including but not limited to: |
|
|
|
- Reasoning and logical deduction |
|
- Critical thinking |
|
- Analysis |
|
|
|
While maintaining strong overall knowledge and expertise, the models will undergo refinement through: |
|
|
|
- Fine-tuning processes |
|
- Model merging techniques including Mixture of Experts (MoE) |
|
|
|
Please note that these models are experimental and may demonstrate varying levels of effectiveness. Feedback, critiques, and questions are most welcome and will help improve them.
|
|
|
## Base |
|
|
|
This model and its related LoRA were fine-tuned on [https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3](https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3).
|
|
|
## LoRA |
|
|
|
The LoRA merged with the base model is available at [https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA](https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA). |
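
Below is a minimal sketch of how the published LoRA could be applied to the abliterated base model with the `peft` library. This is an assumed usage pattern, not the exact script used to produce the merged weights.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "failspy/Phi-3-mini-128k-instruct-abliterated-v3"
lora_id = "fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA"

# Load the base model and tokenizer (Phi-3 checkpoints ship custom code).
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, lora_id)
# model = model.merge_and_unload()  # optionally bake the adapter into the base weights
```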
|
|
|
## Datasets |
|
|
|
- [https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal) |
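
A quick sketch of loading the dataset with the `datasets` library; the split name and column layout are assumptions, so inspect them before training.

```python
from datasets import load_dataset

# Assumption: the dataset exposes a "train" split.
dataset = load_dataset("NobodyExistsOnTheInternet/ToxicQAFinal", split="train")
print(dataset.column_names)
```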
|
|
|
## Fine Tuning |
|
|
|
### - Quantization Configuration |
|
|
|
- load_in_4bit=True |
|
- bnb_4bit_quant_type="fp4" |
|
- bnb_4bit_compute_dtype=compute_dtype |
|
- bnb_4bit_use_double_quant=False |
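
The parameters above map onto the `bitsandbytes` integration in `transformers` as sketched below. The card does not state what `compute_dtype` was set to, so `torch.float32` here is an assumption (consistent with `fp16=False` and `bf16=False` in the training arguments).

```python
import torch
from transformers import BitsAndBytesConfig

compute_dtype = torch.float32  # assumption: the compute dtype is not stated on this card

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)
```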
|
|
|
### - PEFT Parameters |
|
|
|
- lora_alpha=64 |
|
- lora_dropout=0.05 |
|
- r=128 |
|
- bias="none" |
|
|
|
### - Training Arguments |
|
|
|
- num_train_epochs=1 |
|
- per_device_train_batch_size=1 |
|
- gradient_accumulation_steps=4 |
|
- optim="adamw_bnb_8bit" |
|
- save_steps=25 |
|
- logging_steps=25 |
|
- learning_rate=2e-4 |
|
- weight_decay=0.001 |
|
- fp16=False |
|
- bf16=False |
|
- max_grad_norm=0.3 |
|
- max_steps=-1 |
|
- warmup_ratio=0.03 |
|
- group_by_length=True |
|
- lr_scheduler_type="constant" |
|
|
|
## Credits |
|
|
|
- Microsoft ([https://huggingface.co/microsoft](https://huggingface.co/microsoft)): for the original Phi-3; |
|
- HuggingFace: for hosting this model and for creating the fine-tuning tools used; |
|
- failspy ([https://huggingface.co/failspy](https://huggingface.co/failspy)): for the base model and the orthogonalization implementation; |
|
- NobodyExistsOnTheInternet ([https://huggingface.co/NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet)): for the incredible dataset; |
|
|
|
A huge thank you to all of them ☺️ |