Accounting for the future will be anything but easy, but accounting for the present will be anything but difficult.
Description
DSCNTR-Alpha is a deep QLoRA fine-tune (rank 256) of the base Llama-3.1-8B. It specializes primarily in Turkish, along with a dozen other languages, and aims to be a thoroughly un-slopped, human-sounding generalist/reasoner/coder/role-player/"o1-thinker" model. Custom system prompts are also supported.
I, tomar753, developed it entirely on my own, from data creation (yes, the dataset is >95% hand-written by me, with the help of several top-league models whenever factual knowledge was a concern) through hyperparameter tinkering.
Despite being only one day away from 18, I have devoted the last 1.5 years of my life to "Project Descentral" (which will continue indefinitely): building a local mental stimulator that is clever and always unpredictable as a chat partner, but without the intelligence regression that creative training usually brings. In fact, my goal was never merely to mitigate that loss, but to advance every stat of the model, even unstructured text generation.
More than 2,000 hours of research have gone into this since the early Pygmalion AI era, watching how the LLM landscape would shape up on the end-user side so that I could figure out what people like and dislike in language models.
Although this version of the model is still far from ideal in my view, I judged it competitive enough in its size class and worth releasing as a glimpse of the actual releases coming in the next months.
This repository includes only a lightly quantized version of the model; its goal is simply to provide a usable demo for those who would like to experiment.
System Prompt / Usage
The model is mostly trained on an Alpaca-like format with the following baked-in default prompt:
### System:
You are a language model and AI assistant.
### User:
{{user_message}}
### Answer:
{{assistant_message}}
### User:
{{user_message_2}}
### Answer:
{{assistant_message_2}}
...
The model supports multi-turn conversations and (partially) multi-character role-play.
Also note that the BOS token must be prepended before any other tokens in all cases.
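Below is a minimal sketch of how this template could be assembled in Python. The helper name and the use of the base Llama-3.1 tokenizer (which prepends BOS by default) are illustrative assumptions on top of the card, and the exact whitespace between turns may need adjusting to match your inference stack.

```python
from transformers import AutoTokenizer

# Any Llama-3.1 tokenizer works here; it prepends the BOS token
# automatically when add_special_tokens=True (the default).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

DEFAULT_SYSTEM = "You are a language model and AI assistant."

def build_prompt(turns, system=DEFAULT_SYSTEM):
    """turns: list of (user_message, assistant_message) pairs; pass None
    as the final assistant message so the model generates it."""
    parts = [f"### System:\n{system}"]
    for user_msg, assistant_msg in turns:
        parts.append(f"### User:\n{user_msg}")
        parts.append(f"### Answer:\n{assistant_msg if assistant_msg is not None else ''}")
    return "\n".join(parts)

# "Hello! Can you introduce yourself?" in Turkish.
prompt = build_prompt([("Merhaba! Kendini tanıtır mısın?", None)])
inputs = tokenizer(prompt, return_tensors="pt")  # BOS is added here
```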
Data Preparation (Plus Some Musings)
Closely following the LIMA methodology, the dataset consists of merely 816 examples that are high-quality and, above all, diverse. Considering that the data is the product of a single person and amounts to roughly four 300-page books, the sheer amount of time and mental work that went into the project is hard to overstate. Not to mention the writer's block that struck every now and then.
While I'm a strong supporter of a fully open-source community, I have to prioritize my own efforts over the collective chain for now: I have no personal income yet, and must keep the original work to myself to preserve its value.
Training Parameters
r = 256
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head", "embed_tokens",]
lora_alpha = 32
lora_dropout = 0
bias = "none"
use_gradient_checkpointing = "unsloth"
random_state = 3407
use_rslora = True
use_dora = False
loftq_config = None
per_device_train_batch_size = 1
gradient_accumulation_steps = 16
warmup_ratio = 0.1
num_train_epochs = 3
learning_rate = 5e-5
embedding_learning_rate = 5e-6
max_steps = 0
group_by_length = False
bf16 = True
weight_decay = 0.01
max_grad_norm = 8.0
lr_scheduler_type = "cosine"
optim = "paged_adamw_8bit"
seed = 3407
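For context, here is a rough sketch of how the listing above maps onto Unsloth's QLoRA API. It is not the actual training script: the maximum sequence length, the output directory, and the stand-in dataset are placeholders of mine (the hand-written dataset itself is not released), and minor details may differ.

```python
from datasets import Dataset
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments

# Stand-in for the private dataset of 816 hand-written, pre-formatted examples.
dataset = Dataset.from_dict({"text": ["### System:\n...\n### User:\n...\n### Answer:\n..."]})

# Load the 4-bit base model (QLoRA).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.1-8B",
    max_seq_length=8192,   # placeholder; the card does not state this value
    load_in_4bit=True,
)

# Attach the rank-256 LoRA adapters with the listed settings.
model = FastLanguageModel.get_peft_model(
    model,
    r=256,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj",
                    "lm_head", "embed_tokens"],
    lora_alpha=32,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=True,
    loftq_config=None,
)

# UnslothTrainer supports the separate, lower embedding learning rate.
trainer = UnslothTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=8192,
    args=UnslothTrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        warmup_ratio=0.1,
        num_train_epochs=3,
        learning_rate=5e-5,
        embedding_learning_rate=5e-6,
        bf16=True,
        weight_decay=0.01,
        max_grad_norm=8.0,
        lr_scheduler_type="cosine",
        optim="paged_adamw_8bit",
        seed=3407,
        group_by_length=False,
        output_dir="outputs",  # placeholder
    ),
)
trainer.train()
```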
Recommended Hyperparameters
All samplers neutralised, with min_p set to 0.1. Make sure temperature is applied last in the sampler order.
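As a concrete example, assuming a GGUF build of the model served with llama-cpp-python (the file name and context size below are placeholders), the neutralised sampler setup could look like this. Whether temperature is actually applied last depends on your backend's sampler order; recent llama.cpp builds do this by default.

```python
from llama_cpp import Llama

# Placeholder path; point this at the quantized file from this repository.
llm = Llama(model_path="DSCNTR-Alpha-8B-Q8_0.gguf", n_ctx=8192)

# llama.cpp prepends the BOS token automatically.
prompt = (
    "### System:\nYou are a language model and AI assistant.\n"
    "### User:\nMerhaba! Kendini tanıtır mısın?\n"  # "Hello! Can you introduce yourself?"
    "### Answer:\n"
)

out = llm(
    prompt,
    max_tokens=512,
    temperature=1.0,     # neutral temperature; min_p does the filtering
    min_p=0.1,
    top_p=1.0,           # neutralised
    top_k=0,             # neutralised
    repeat_penalty=1.0,  # neutralised
    stop=["### User:"],
)
print(out["choices"][0]["text"])
```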
Limitations
Being only an 8B model, this iteration probably isn't the best of the best at factual knowledge or very complex reasoning over long stretches of context, but it will surprise you with its distinctive approach across different situations and open-ended tasks; there is a reason I am the data preparation process itself.
Disclaimer
The model might occasionally display unprompted bias in casual contexts. Any harm resulting from its behavior is not my responsibility. ALWAYS double-check its outputs.
Appreciations
Special thanks to my younger brother, kolar628, for helping me overcome uncreative episodes.
Shoutout to the precious T3 AI'LE family for their crystal-clear vision and hard work in developing Turkish language models!
Model tree for d-scentre/DSCNTR-Alpha-8B
Base model: meta-llama/Llama-3.1-8B