|
--- |
|
license: mit |
|
datasets: |
|
- NobodyExistsOnTheInternet/ToxicQAFinal |
|
--- |
|
|
|
# Alpha-Ophiuchi-mini-128k-v0.1 |
|
|
|
--- |
|
|
|
## Disclaimer |
|
|
|
**Note:** All models and LoRAs from the **Ophiuchus** series were created solely for research purposes. Use of this model and/or its related LoRA implies agreement with the following terms:
|
|
|
- The user is responsible for anything they do with it, including how the model's output is interpreted and used;
|
- The user should not use the model and its outputs for any illegal purposes; |
|
- The user is solely responsible for any misuse of, or negative consequences arising from, this model and/or its related LoRA.
|
|
|
I do not endorse any particular perspectives presented in the training data. |
|
|
|
--- |
|
|
|
## Ophiuchus Series |
|
|
|
This series aims to develop highly uncensored Large Language Models (LLMs) with the following focuses: |
|
|
|
- Science, Technology, Engineering, and Mathematics (STEM) |
|
- Computer Science (including programming) |
|
- Social Sciences |
|
|
|
And several key cognitive skills, including but not limited to: |
|
|
|
- Reasoning and logical deduction |
|
- Critical thinking |
|
- Analysis |
|
|
|
While maintaining strong overall knowledge and expertise, the models will undergo refinement through: |
|
|
|
- Fine-tuning processes |
|
- Model merging techniques including Mixture of Experts (MoE) |
|
|
|
Please note that these models are experimental and may demonstrate varying levels of effectiveness. Feedback, critiques, and questions are most welcome and will help improve them.
|
|
|
## Base |
|
|
|
This model and its related LoRA were fine-tuned on [https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3](https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3).
|
|
|
## LoRA |
|
|
|
The LoRA merged with the base model is available at [https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA](https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA). |
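
Below is a minimal sketch of how the published LoRA could be applied to the abliterated base model with the `peft` library. This is an assumed usage pattern, not the exact script used to produce the merged weights.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "failspy/Phi-3-mini-128k-instruct-abliterated-v3"
lora_id = "fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA"

# Load the base model and tokenizer (Phi-3 checkpoints ship custom code).
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, lora_id)
# model = model.merge_and_unload()  # optionally bake the adapter into the base weights
```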
|
|
|
## Datasets |
|
|
|
- [https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal) |
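
A quick sketch of loading the dataset with the `datasets` library; the split name and column layout are assumptions, so inspect them before training.

```python
from datasets import load_dataset

# Assumption: the dataset exposes a "train" split.
dataset = load_dataset("NobodyExistsOnTheInternet/ToxicQAFinal", split="train")
print(dataset.column_names)
```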
|
|
|
## Fine Tuning |
|
|
|
### - Quantization Configuration |
|
|
|
- load_in_4bit=True |
|
- bnb_4bit_quant_type="fp4" |
|
- bnb_4bit_compute_dtype=compute_dtype |
|
- bnb_4bit_use_double_quant=False |
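
The parameters above map onto the `bitsandbytes` integration in `transformers` as sketched below. The card does not state what `compute_dtype` was set to, so `torch.float32` here is an assumption (consistent with `fp16=False` and `bf16=False` in the training arguments).

```python
import torch
from transformers import BitsAndBytesConfig

compute_dtype = torch.float32  # assumption: the compute dtype is not stated on this card

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)
```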
|
|
|
### - PEFT Parameters |
|
|
|
- lora_alpha=64 |
|
- lora_dropout=0.05 |
|
- r=128 |
|
- bias="none" |
|
|
|
### - Training Arguments |
|
|
|
- num_train_epochs=1 |
|
- per_device_train_batch_size=1 |
|
- gradient_accumulation_steps=4 |
|
- optim="adamw_bnb_8bit" |
|
- save_steps=25 |
|
- logging_steps=25 |
|
- learning_rate=2e-4 |
|
- weight_decay=0.001 |
|
- fp16=False |
|
- bf16=False |
|
- max_grad_norm=0.3 |
|
- max_steps=-1 |
|
- warmup_ratio=0.03 |
|
- group_by_length=True |
|
- lr_scheduler_type="constant" |
|
|
|
## Credits |
|
|
|
- Microsoft ([https://huggingface.co/microsoft](https://huggingface.co/microsoft)): for the original Phi-3; |
|
- HuggingFace: for hosting this model and for creating the fine-tuning tools used; |
|
- failspy ([https://huggingface.co/failspy](https://huggingface.co/failspy)): for the base model and the orthogonalization implementation; |
|
- NobodyExistsOnTheInternet ([https://huggingface.co/NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet)): for the incredible dataset; |
|
|
|
A huge thank you to all of them ☺️ |