---
license: mit
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
---
# Alpha-Ophiuchi-mini-128k-v0.1
---
## Disclaimer
**Note:** All models and LoRAs from the **Ophiuchus** series were created solely for research purposes. Use of this model and/or its related LoRA implies agreement with the following terms:
- The user is responsible for what they might do with it, including how the output of the model is interpreted and used;
- The user should not use the model and its outputs for any illegal purposes;
- The user is solely responsible for any misuse or negative consequences from using this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.
---
## Ophiuchus Series
This series aims to develop highly uncensored Large Language Models (LLMs) with the following focuses:
- Science, Technology, Engineering, and Mathematics (STEM)
- Computer Science (including programming)
- Social Sciences

It also targets several key cognitive skills, including but not limited to:
- Reasoning and logical deduction
- Critical thinking
- Analysis

While maintaining strong overall knowledge and expertise, the models will undergo refinement through:
- Fine-tuning processes
- Model merging techniques including Mixture of Experts (MoE)

Please note that these models are experimental and may vary in effectiveness. Feedback, critique, and questions are welcome and help improve the series.
## Base
This model and its related LoRA were fine-tuned on [https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3](https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3).
## LoRA
The LoRA merged with the base model is available at [https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA](https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA).
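
For readers who want to reproduce the merge themselves, the following is a minimal sketch, assuming the `transformers` and `peft` libraries are installed. The two repository IDs come from the links above; the helper name `load_base_with_lora` is hypothetical, and the exact loading options used by the author (dtype, device placement, `trust_remote_code`) are not stated in this card, so adjust them to your environment.

```python
BASE_MODEL_ID = "failspy/Phi-3-mini-128k-instruct-abliterated-v3"
LORA_ID = "fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA"


def load_base_with_lora(base_id: str = BASE_MODEL_ID, lora_id: str = LORA_ID):
    """Load the base model, attach the LoRA adapter, and merge the weights.

    Hypothetical helper: the card does not document the author's merge script.
    """
    # Imported lazily so the constants can be used without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights from the LoRA repository.
    model = PeftModel.from_pretrained(base, lora_id)
    # merge_and_unload() folds the adapter into the base weights and
    # returns a plain transformers model.
    return model.merge_and_unload(), tokenizer
```

Calling `load_base_with_lora()` downloads both repositories, so it needs network access and enough memory for the full-precision base model.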
## Datasets
- [https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)
## Fine Tuning
### - Quantization Configuration
- load_in_4bit=True
- bnb_4bit_quant_type="fp4"
- bnb_4bit_compute_dtype=compute_dtype
- bnb_4bit_use_double_quant=False
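
The settings above map onto the `BitsAndBytesConfig` class from `transformers`. The sketch below shows one way they could be assembled. Note that the card does not say what `compute_dtype` was set to, so the `"float16"` default here is purely an assumption, and `build_quant_config` is a hypothetical helper name.

```python
# Values copied from the list above.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
}


def build_quant_config(compute_dtype_name: str = "float16"):
    """Build a 4-bit quantization config.

    ASSUMPTION: the compute dtype is not given in the card; "float16" is a
    placeholder default, not the author's documented choice.
    """
    # Imported lazily; both packages are required only when the config is built.
    import torch
    from transformers import BitsAndBytesConfig

    compute_dtype = getattr(torch, compute_dtype_name)
    return BitsAndBytesConfig(bnb_4bit_compute_dtype=compute_dtype, **QUANT_KWARGS)
```

The resulting object would typically be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained`.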
### - PEFT Parameters
- lora_alpha=64
- lora_dropout=0.05
- r=128
- bias="none"
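
As a rough illustration, these parameters correspond to the `LoraConfig` class in `peft`. With standard LoRA scaling, the low-rank update is multiplied by `lora_alpha / r`, which works out to 0.5 for the values above. The `task_type` below is an assumption, and the target modules are left at the library default because the card does not list them. The helper names are hypothetical.

```python
# Values copied from the list above.
LORA_KWARGS = {"lora_alpha": 64, "lora_dropout": 0.05, "r": 128, "bias": "none"}


def lora_scaling(lora_alpha: int, r: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update: alpha / r."""
    return lora_alpha / r


def build_lora_config():
    """Build a PEFT LoRA config from the values above."""
    # Imported lazily so the scaling helper works without peft installed.
    from peft import LoraConfig

    # ASSUMPTION: task_type is not stated in the card; target_modules is
    # deliberately omitted because the card does not list them.
    return LoraConfig(task_type="CAUSAL_LM", **LORA_KWARGS)


print(lora_scaling(LORA_KWARGS["lora_alpha"], LORA_KWARGS["r"]))  # 0.5
```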
### - Training Arguments
- num_train_epochs=1
- per_device_train_batch_size=1
- gradient_accumulation_steps=4
- optim="adamw_bnb_8bit"
- save_steps=25
- logging_steps=25
- learning_rate=2e-4
- weight_decay=0.001
- fp16=False
- bf16=False
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"
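
The arguments above match the field names of `transformers.TrainingArguments`, so a configuration along these lines is plausible. This is only a sketch: the card does not say which trainer class was used, the `output_dir` value is a placeholder, and argument availability may differ between library versions. With a batch size of 1 and 4 gradient accumulation steps, each optimizer step sees 4 samples per device.

```python
# Values copied from the list above.
TRAINING_KWARGS = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_bnb_8bit",
    "save_steps": 25,
    "logging_steps": 25,
    "learning_rate": 2e-4,
    "weight_decay": 0.001,
    "fp16": False,
    "bf16": False,
    "max_grad_norm": 0.3,
    "max_steps": -1,
    "warmup_ratio": 0.03,
    "group_by_length": True,
    "lr_scheduler_type": "constant",
}


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer step.

    ASSUMPTION: num_devices defaults to 1; the card does not state the hardware.
    """
    return per_device * accumulation * num_devices


def build_training_args(output_dir: str = "./results"):
    """Build TrainingArguments; output_dir is a hypothetical placeholder path."""
    # Imported lazily so the batch-size helper works without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(effective_batch_size(
    TRAINING_KWARGS["per_device_train_batch_size"],
    TRAINING_KWARGS["gradient_accumulation_steps"],
))  # 4
```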
## Credits
- Microsoft ([https://huggingface.co/microsoft](https://huggingface.co/microsoft)): for the original Phi-3;
- Hugging Face: for hosting this model and for creating the fine-tuning tools used;
- failspy ([https://huggingface.co/failspy](https://huggingface.co/failspy)): for the base model and the orthogonalization implementation;
- NobodyExistsOnTheInternet ([https://huggingface.co/NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet)): for the incredible dataset.

A huge thank you to all of them ☺️