license: mit
datasets:
  - NobodyExistsOnTheInternet/ToxicQAFinal

Alpha-Ophiuchi-mini-128k-v0.1


Disclaimer

Note: All models and LoRAs from the Ophiuchus series were created solely for research purposes. Use of this model and/or its related LoRA implies agreement with the following terms:

  • The user is responsible for what they might do with it, including how the output of the model is interpreted and used;
  • The user should not use the model and its outputs for any illegal purposes;
  • The user is solely responsible for any misuse or negative consequences from using this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.


Ophiuchus Series

This series aims to develop highly uncensored Large Language Models (LLMs) with the following focuses:

  • Science, Technology, Engineering, and Mathematics (STEM)
  • Computer Science (including programming)
  • Social Sciences

It also targets several key cognitive skills, including but not limited to:

  • Reasoning and logical deduction
  • Critical thinking
  • Analysis

While maintaining strong overall knowledge and expertise, the models will undergo refinement through:

  • Fine-tuning processes
  • Model merging techniques including Mixture of Experts (MoE)

Please note that these models are experimental and their effectiveness may vary. Feedback, critiques, and questions are welcome and help improve them.

Base

This model and its related LoRA were fine-tuned on https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3.

LoRA

The LoRA that was merged with the base model to produce this model is available at https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA.
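As a rough sketch of how the adapter could be applied on top of the base model: this assumes the LoRA repository is a standard PEFT adapter and that the `transformers` and `peft` libraries are installed. The helper name `load_with_adapter` is hypothetical and not part of this card.

```python
# Repository IDs taken from the URLs in this card.
BASE_MODEL_ID = "failspy/Phi-3-mini-128k-instruct-abliterated-v3"
LORA_ID = "fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA"


def load_with_adapter(base_id=BASE_MODEL_ID, lora_id=LORA_ID):
    """Load the base model and attach the LoRA adapter (hypothetical helper)."""
    # Imported lazily so the constants above can be used without the heavy
    # dependencies being installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Assumption: the LoRA repository is stored in PEFT adapter format.
    model = PeftModel.from_pretrained(base_model, lora_id)
    return tokenizer, model
```

Calling `load_with_adapter()` downloads both repositories. Depending on the installed `transformers` version, Phi-3 checkpoints may also need `trust_remote_code=True`.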

Datasets

  • NobodyExistsOnTheInternet/ToxicQAFinal

Fine Tuning

- Quantization Configuration

  • load_in_4bit=True
  • bnb_4bit_quant_type="fp4"
  • bnb_4bit_compute_dtype=compute_dtype
  • bnb_4bit_use_double_quant=False
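The settings above map onto `BitsAndBytesConfig` from `transformers`. A minimal sketch, assuming `compute_dtype` was `torch.float16` (the card does not state its value) and using a hypothetical helper named `build_quant_config`:

```python
# Quantization settings exactly as listed in this card.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
}


def build_quant_config(compute_dtype_name="float16"):
    """Build a BitsAndBytesConfig from the listed settings (hypothetical helper)."""
    # Imported lazily so the settings dict is usable without torch installed.
    import torch
    from transformers import BitsAndBytesConfig

    # Assumption: the card only says `compute_dtype`; float16 is a guess.
    compute_dtype = getattr(torch, compute_dtype_name)
    return BitsAndBytesConfig(bnb_4bit_compute_dtype=compute_dtype, **QUANT_KWARGS)
```

The resulting object would typically be passed as `quantization_config` when loading the base model with `from_pretrained`.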

- PEFT Parameters

  • lora_alpha=64
  • lora_dropout=0.05
  • r=128
  • bias="none"
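These values could be passed to `peft.LoraConfig` as sketched below. The `task_type` value and the helper names are assumptions, and `target_modules` is left unset because the card does not list it. With standard LoRA scaling (`lora_alpha / r`), the adapter update is scaled by 64 / 128 = 0.5.

```python
# PEFT parameters exactly as listed in this card.
LORA_KWARGS = {
    "lora_alpha": 64,
    "lora_dropout": 0.05,
    "r": 128,
    "bias": "none",
}


def lora_scaling(lora_alpha, r):
    """Standard LoRA scaling factor applied to the low-rank update."""
    return lora_alpha / r


def build_lora_config(task_type="CAUSAL_LM"):
    """Build a LoraConfig from the listed parameters (hypothetical helper)."""
    # Imported lazily so the parameter dict is usable without peft installed.
    from peft import LoraConfig

    # Assumption: task_type is not stated in the card; CAUSAL_LM is a guess.
    return LoraConfig(task_type=task_type, **LORA_KWARGS)


print(lora_scaling(LORA_KWARGS["lora_alpha"], LORA_KWARGS["r"]))  # 0.5
```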

- Training Arguments

  • num_train_epochs=1
  • per_device_train_batch_size=1
  • gradient_accumulation_steps=4
  • optim="adamw_bnb_8bit"
  • save_steps=25
  • logging_steps=25
  • learning_rate=2e-4
  • weight_decay=0.001
  • fp16=False
  • bf16=False
  • max_grad_norm=0.3
  • max_steps=-1
  • warmup_ratio=0.03
  • group_by_length=True
  • lr_scheduler_type="constant"
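The arguments above correspond to fields of `transformers.TrainingArguments`. A sketch, assuming a hypothetical `output_dir` and a single device; argument availability may differ between `transformers` versions. With a per-device batch size of 1 and 4 gradient accumulation steps, each optimizer step covers 4 sequences per device.

```python
# Training arguments exactly as listed in this card.
TRAINING_KWARGS = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_bnb_8bit",
    "save_steps": 25,
    "logging_steps": 25,
    "learning_rate": 2e-4,
    "weight_decay": 0.001,
    "fp16": False,
    "bf16": False,
    "max_grad_norm": 0.3,
    "max_steps": -1,
    "warmup_ratio": 0.03,
    "group_by_length": True,
    "lr_scheduler_type": "constant",
}


def effective_batch_size(kwargs, num_devices=1):
    """Sequences contributing to one optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir="./results"):
    """Build TrainingArguments from the listed values (hypothetical helper)."""
    # Imported lazily so the dict is usable without transformers installed.
    from transformers import TrainingArguments

    # Assumption: output_dir is not given in the card.
    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(effective_batch_size(TRAINING_KWARGS))  # 4
```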

Credits

A huge thank you to all of them ☺️