Example Pipeline
from transformers import pipeline
predict_task = pipeline(model="mrjunos/depression-reddit-distilroberta-base", task="text-classification")
predict_task("Stop listing your issues here, use forum instead or open ticket.")
[{'label': 'not_depression', 'score': 0.9813856482505798}]
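As the output above shows, the pipeline returns one dictionary per input, each with a `label` and a `score`. The following is a minimal sketch of one way to post-process such results, for example to set aside low-confidence predictions for manual review. The helper name `split_by_confidence`, the 0.9 threshold, and the second sample result are illustrative assumptions, not part of the model or of the `transformers` API.

```python
from typing import Dict, List, Tuple

Prediction = Dict[str, object]


def split_by_confidence(
    predictions: List[Prediction], threshold: float = 0.9
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split pipeline-style results into (confident, uncertain) lists by score."""
    confident = [p for p in predictions if float(p["score"]) >= threshold]
    uncertain = [p for p in predictions if float(p["score"]) < threshold]
    return confident, uncertain


results = [
    # Result copied from the example output above.
    {"label": "not_depression", "score": 0.9813856482505798},
    # Hypothetical low-confidence result, made up for illustration only.
    {"label": "not_depression", "score": 0.62},
]

confident, uncertain = split_by_confidence(results)
print(len(confident), len(uncertain))  # 1 1
```

The threshold is arbitrary here; a sensible value would have to be chosen by inspecting scores on held-out data.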
Disclaimer: This machine learning model classifies texts related to depression, but I am not an expert or a mental health professional. I do not intend to diagnose or offer medical advice. The information provided should not replace consultation with a qualified professional. The results may not be accurate. Use this model at your own risk and seek professional advice if needed.
This model is a fine-tuned version of distilroberta-base on the mrjunos/depression-reddit-cleaned dataset. It achieves the following results on the evaluation set:
- Loss: 0.0821
- Accuracy: 0.9716
Model description
This is a distilroberta-base transformer model fine-tuned on a dataset of Reddit posts related to depression. It classifies a post as either depression or not depression.
Intended uses & limitations
This model is intended for research purposes and is not yet ready for production use. It was trained on English-language posts, so it may not be accurate for other languages.
Training and evaluation data
The model was trained on the mrjunos/depression-reddit-cleaned dataset, which contains approximately 7,000 labeled instances. The data was split into train (80%) and test (20%) sets using:
ds = ds['train'].train_test_split(test_size=0.2, seed=42)
The dataset consists of two main features: 'text' and 'label'. The 'text' feature contains the text data from Reddit posts related to depression, while the 'label' feature indicates whether a post is classified as depression or not.
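To illustrate what a seeded hold-out split with `test_size=0.2` and `seed=42` does, here is a plain-Python sketch. It is not how the `datasets` library implements `train_test_split`, and it will not select the same rows. The function `seeded_split` and the toy rows are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def seeded_split(
    rows: Sequence[dict], test_size: float = 0.2, seed: int = 42
) -> Tuple[List[dict], List[dict]]:
    """Shuffle row indices with a fixed seed, then cut off a test fraction."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # same seed -> same shuffle order
    n_test = round(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


# Toy rows with the same two features as the dataset ('text' and 'label').
toy_rows = [{"text": f"post {i}", "label": i % 2} for i in range(10)]

train, test = seeded_split(toy_rows)
print(len(train), len(test))  # 8 2
```

Fixing the seed makes the split reproducible, so evaluation numbers from different runs refer to the same held-out rows.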
Training procedure
The steps I followed to train this model are documented in this notebook: https://github.com/mrjunos/machine_learning/blob/main/NLP-fine_tunning-hugging_face_model.ipynb
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
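As a rough illustration of what `lr_scheduler_type: linear` means together with `learning_rate: 5e-05`: the learning rate decays linearly from its peak to zero over the course of training. The sketch below assumes no warmup steps (none are listed above) and uses a hypothetical total of 2,400 optimizer steps. The function `linear_lr` is illustrative and is not a `transformers` API.

```python
def linear_lr(
    step: int, total_steps: int, base_lr: float = 5e-5, warmup_steps: int = 0
) -> float:
    """Learning rate at a given step under linear decay with optional warmup."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 2400  # hypothetical value, not taken from the training run

for step in (0, 600, 1200, 1800, 2400):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With these assumptions the rate is at its peak at step 0, half the peak at the midpoint, and zero at the final step.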
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 0.1711 | 0.65 | 500 | 0.0821 | 0.9716 |
| 0.1022 | 1.29 | 1000 | 0.1148 | 0.9709 |
| 0.0595 | 1.94 | 1500 | 0.1178 | 0.9787 |
| 0.0348 | 2.59 | 2000 | 0.0951 | 0.9851 |
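The table can be read in more than one way, so here is a small sketch that works through it. It picks the logged evaluation with the lowest validation loss and the one with the highest accuracy, then estimates the number of optimizer steps per epoch from the Epoch and Step columns. The estimate assumes a single device and no gradient accumulation, neither of which is stated above, and the Epoch values are rounded, so the result is approximate.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, accuracy)
rows = [
    (0.1711, 0.65, 500, 0.0821, 0.9716),
    (0.1022, 1.29, 1000, 0.1148, 0.9709),
    (0.0595, 1.94, 1500, 0.1178, 0.9787),
    (0.0348, 2.59, 2000, 0.0951, 0.9851),
]

best_by_loss = min(rows, key=lambda r: r[3])
best_by_accuracy = max(rows, key=lambda r: r[4])
print("lowest validation loss at step", best_by_loss[2])  # 500
print("highest accuracy at step", best_by_accuracy[2])    # 2000

# Assumption: one optimizer step consumes exactly train_batch_size examples.
train_batch_size = 8
steps_per_epoch = [step / epoch for _, epoch, step, _, _ in rows]
approx_steps_per_epoch = sum(steps_per_epoch) / len(steps_per_epoch)
approx_train_examples = approx_steps_per_epoch * train_batch_size
print(round(approx_steps_per_epoch), round(approx_train_examples))
```

The headline results reported earlier in this card (loss 0.0821, accuracy 0.9716) match the step 500 row, which has the lowest validation loss, while the highest accuracy in the table is at step 2000. Under the stated assumptions, the estimate comes out at roughly 770 steps per epoch, or on the order of 6,200 training examples.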
Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 2.13.0
- Tokenizers 0.13.3
Evaluation results
- Accuracy on mrjunos/depression-reddit-cleaned (self-reported): 0.972