---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- sft
- axium
base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
---
# About the uploaded model
- **Developed by:** prithivMLmods
- **License:** apache-2.0
- **Finetuned from model:** unsloth/meta-llama-3.1-8b-bnb-4bit

**This model is still in the training phase. It is not the final version and may contain artifacts or perform poorly in some cases.**
## Trainer Configuration
| **Parameter** | **Value** |
|------------------------------|------------------------------------------|
| **Model** | `model` |
| **Tokenizer** | `tokenizer` |
| **Train Dataset** | `dataset` |
| **Dataset Text Field** | `text` |
| **Max Sequence Length** | `max_seq_length` |
| **Dataset Number of Processes** | `2` |
| **Packing** | `False` (setting this to `True` can make training up to 5x faster for short sequences) |
| **Training Arguments** | |
| - **Per Device Train Batch Size** | `2` |
| - **Gradient Accumulation Steps** | `4` |
| - **Warmup Steps** | `5` |
| - **Number of Train Epochs** | `1` (set this for one full training run) |
| - **Max Steps** | `60` |
| - **Learning Rate** | `2e-4` |
| - **FP16** | `not is_bfloat16_supported()` |
| - **BF16** | `is_bfloat16_supported()` |
| - **Logging Steps** | `1` |
| - **Optimizer** | `adamw_8bit` |
| - **Weight Decay** | `0.01` |
| - **LR Scheduler Type** | `linear` |
| - **Seed** | `3407` |
| - **Output Directory** | `outputs` |
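
Below is a minimal sketch of how the configuration above maps onto the Unsloth + TRL `SFTTrainer` workflow. The dataset name and `max_seq_length` value are placeholders (the card only references `dataset` and `max_seq_length` variables), and the actual training script may differ, e.g. the Unsloth notebooks typically add LoRA adapters with `FastLanguageModel.get_peft_model(...)` before this step, which is not recorded in the table above.

```python
from unsloth import FastLanguageModel, is_bfloat16_supported
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

max_seq_length = 2048  # assumption; the card does not state the value used

# Load the 4-bit base model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Hypothetical dataset with a "text" column, matching the dataset_text_field above
dataset = load_dataset("your_dataset_here", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,  # packing=True can make training ~5x faster for short sequences
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        num_train_epochs=1,  # one full training run
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)

trainer.train()
```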
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.