Commit c8a1f76 by fearlessdots (parent: b090cc3): Update README.md

---
license: mit
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
---

# Alpha-Ophiuchi-mini-128k-v0.1

---

## Disclaimer

**Note:** All models and LoRAs in the **Ophiuchus** series were created solely for research purposes. Using this model and/or its related LoRA implies agreement with the following terms:

- The user is responsible for what they do with the model, including how its output is interpreted and used;
- The user must not use the model or its outputs for any illegal purposes;
- The user is solely responsible for any misuse of, or negative consequences arising from, this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.

---

## Ophiuchus Series

This series aims to develop highly uncensored Large Language Models (LLMs) with a focus on:

- Science, Technology, Engineering, and Mathematics (STEM)
- Computer Science (including programming)
- Social Sciences

It also targets several key cognitive skills, including but not limited to:

- Reasoning and logical deduction
- Critical thinking
- Analysis

While maintaining strong overall knowledge and expertise, the models are refined through:

- Fine-tuning
- Model merging techniques, including Mixture of Experts (MoE)

Please note that these models are experimental and may vary in effectiveness. Feedback, critiques, and questions aimed at improving them are most welcome.

## Base

This model and its related LoRA were fine-tuned from [https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3](https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3).

## LoRA

The LoRA merged with the base model is available at [https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA](https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA). A usage sketch follows.

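Below is a minimal sketch of one way to use these two artifacts together: load the base model with `transformers` and attach the LoRA adapter with `peft`. The repository IDs come from this card; the dtype, device placement, prompt, and generation settings are illustrative assumptions rather than documented settings.

```python
# Hedged sketch: attach the published LoRA adapter to the base model listed above.
# Repo IDs are taken from this card; everything else is an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "failspy/Phi-3-mini-128k-instruct-abliterated-v3"
lora_id = "fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,  # assumption: half precision for inference
    device_map="auto",
    trust_remote_code=True,     # Phi-3 128k variants may ship custom modeling code
)
model = PeftModel.from_pretrained(base, lora_id)

# Build a chat-style prompt with the tokenizer's chat template and generate.
messages = [{"role": "user", "content": "Explain LoRA fine-tuning in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

If this repository hosts the already-merged weights, loading it directly with `AutoModelForCausalLM.from_pretrained` (without `peft`) should also work.
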
## Datasets

- [https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)

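As a small aside, the dataset above can be pulled with the `datasets` library. The split and column names in this sketch are assumptions and should be checked against the dataset page.

```python
# Minimal sketch: download and inspect the training dataset.
from datasets import load_dataset

ds = load_dataset("NobodyExistsOnTheInternet/ToxicQAFinal")
print(ds)              # available splits and row counts
print(ds["train"][0])  # first row; assumes a "train" split exists
```
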
## Fine Tuning

### Quantization Configuration

- load_in_4bit=True
- bnb_4bit_quant_type="fp4"
- bnb_4bit_compute_dtype=compute_dtype
- bnb_4bit_use_double_quant=False

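These values map directly onto the `bitsandbytes` quantization config that `transformers` uses for 4-bit loading. The sketch below is an interpretation of the list above; `compute_dtype` is not specified on this card, so the value chosen here is an assumption.

```python
# Hedged sketch of the 4-bit quantization settings listed above as a
# transformers/bitsandbytes BitsAndBytesConfig.
import torch
from transformers import BitsAndBytesConfig

compute_dtype = torch.float16  # assumption: the card does not state the compute dtype used

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)
# bnb_config would then be passed as quantization_config= when loading the base model.
```
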
### PEFT Parameters

- lora_alpha=64
- lora_dropout=0.05
- r=128
- bias="none"

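The list above corresponds to a `peft` `LoraConfig`. The sketch below fills in the remaining fields with labeled assumptions; the card does not state `task_type` or `target_modules`.

```python
# Hedged sketch of the listed PEFT hyperparameters as a peft LoraConfig.
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=64,
    lora_dropout=0.05,
    r=128,
    bias="none",
    task_type="CAUSAL_LM",  # assumption: causal language modeling fine-tuning
    # target_modules is not listed on this card and is left at the library default here.
)
```
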
### Training Arguments

- num_train_epochs=1
- per_device_train_batch_size=1
- gradient_accumulation_steps=4
- optim="adamw_bnb_8bit"
- save_steps=25
- logging_steps=25
- learning_rate=2e-4
- weight_decay=0.001
- fp16=False
- bf16=False
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"

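For reference, the values above correspond to `transformers` `TrainingArguments`, roughly as they would be passed to an SFT-style trainer (for example `trl`'s `SFTTrainer`). Only `output_dir` below is an assumption; every other value is copied from the list.

```python
# Hedged sketch of the listed values as transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",  # assumption: not stated on the card
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    optim="adamw_bnb_8bit",
    save_steps=25,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
)
```
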
## Credits

- Microsoft ([https://huggingface.co/microsoft](https://huggingface.co/microsoft)): for the original Phi-3;
- Hugging Face: for hosting this model and for creating the fine-tuning tools used;
- failspy ([https://huggingface.co/failspy](https://huggingface.co/failspy)): for the base model and the orthogonalization implementation;
- NobodyExistsOnTheInternet ([https://huggingface.co/NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet)): for the incredible dataset.

A huge thank you to all of them ☺️