---
license: mit
base_model: TheBloke/zephyr-7B-alpha-GPTQ
tags:
- generated_from_trainer
- gptq
- peft
model-index:
- name: thesa
results: []
datasets:
- loaiabdalslam/counselchat
language:
- en
pipeline_tag: text-generation
---
# Thesa: A Therapy Chatbot 👩🏻⚕️
Thesa is an experimental therapy chatbot trained on mental health data, fine-tuned from the Zephyr GPTQ model, which uses quantization to reduce the high computational and storage costs of large language models.
## Model description
- Model type: fine-tuned from [TheBloke/zephyr-7B-alpha-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-alpha-GPTQ) on various mental health datasets
- Language(s): English
- License: MIT
## Intended uses & limitations
This model is purely experimental and should not be used as a substitute for a mental health professional.
## Training evaluation
Training loss:
<img src="imgs/loss_27.2.24.png" alt="loss" width="550"/>
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- gradient_accumulation_steps: 1
- seed: 35
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
- mixed_precision_training: Native AMP
- fp16: True
Learning rate over time (a warmup ratio was used during training):
<img src="imgs/lr_27.2.24.png" alt="lr" width="550"/>
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1
- Accelerate 0.27.2
- PEFT 0.8.2
- Auto-GPTQ 0.6.0
- TRL 0.7.11
- Optimum 1.17.1
- Bitsandbytes 0.42.0