# Emotion Classification Model

## Model Description

This model is a DistilBERT checkpoint fine-tuned for emotion classification. It assigns each input text exactly one of six emotions: sadness, joy, love, anger, fear, or surprise. It is intended for natural language processing applications where understanding emotions in text is valuable, such as social media analysis, customer feedback analysis, and mental health monitoring.
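The six classes can be written down as an explicit label map. The sketch below assumes the integer ids follow the order in which the emotions are listed above; the `id2label` entry in the model's own `config.json` is the authoritative source, and the names `EMOTIONS`, `ID2LABEL`, and `LABEL2ID` are illustrative.

```python
# Assumed label order: the order in which this card lists the emotions.
# Check the model's config.json (id2label) before relying on these ids.
EMOTIONS = ["sadness", "joy", "love", "anger", "fear", "surprise"]

ID2LABEL = dict(enumerate(EMOTIONS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

print(ID2LABEL[1])           # joy
print(LABEL2ID["surprise"])  # 5
```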

## Training and Evaluation

- Training Dataset: dair-ai/emotion (16,000 examples)
- Validation Accuracy: 94.5%
- Test Accuracy: 93.1%
- Training Time: 169.2 seconds (~2 minutes 49 seconds)
- Hyperparameters:
  - Learning Rate: 5e-5
  - Batch Size (Train): 32
  - Batch Size (Validation): 64
  - Epochs: 3
  - Weight Decay: 0.01
  - Optimizer: AdamW
  - Evaluation Strategy: Epoch-based
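As a rough illustration, the hyperparameters above can be collected under the argument names used by `transformers.TrainingArguments`, and the step count and throughput implied by the reported training time can be worked out from them. This is a sketch under assumptions: the card does not say which training API was used, and the arithmetic assumes a single device with no gradient accumulation. `TRAINING_KWARGS` and `training_budget` are hypothetical names.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments
# argument names (assumption: the card does not name the training API).
# AdamW is the Trainer default optimizer, so it needs no explicit entry.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "num_train_epochs": 3,
    "weight_decay": 0.01,
    "eval_strategy": "epoch",  # "evaluation_strategy" in older releases
}

TRAIN_EXAMPLES = 16_000
TRAINING_SECONDS = 169.2


def training_budget(n_examples, batch_size, epochs, seconds):
    """Return (steps_per_epoch, total_steps, examples_per_second)."""
    steps_per_epoch = -(-n_examples // batch_size)  # ceiling division
    total_steps = steps_per_epoch * epochs
    throughput = n_examples * epochs / seconds
    return steps_per_epoch, total_steps, throughput


steps_per_epoch, total_steps, throughput = training_budget(
    TRAIN_EXAMPLES,
    TRAINING_KWARGS["per_device_train_batch_size"],
    TRAINING_KWARGS["num_train_epochs"],
    TRAINING_SECONDS,
)
print(steps_per_epoch, total_steps, round(throughput, 1))  # 500 1500 283.7
```

Under these assumptions the run amounts to 1,500 optimizer steps at roughly 284 training examples per second, with the per-epoch evaluation time included in the total.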

## Usage

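```python
from transformers import pipeline

# Load the model from the Hugging Face Hub.
# The repository is gated: accept its access conditions on the Hub and
# authenticate before loading.
classifier = pipeline("text-classification", model="uboza10300/emotion-classification-model")

# Example usage: returns the top label and its score
text = "I’m so happy today!"
result = classifier(text)
print(result)
```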
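By default the pipeline returns only the highest-scoring label. Passing `top_k=None` when building a text-classification pipeline asks for the scores of all labels instead. The helper below is a minimal sketch of how such output could be ranked; `rank_emotions` is a hypothetical name, and the scores are made up for illustration rather than real model output.

```python
def rank_emotions(scores):
    """Sort a list of {"label", "score"} dicts from most to least likely."""
    return sorted(scores, key=lambda item: item["score"], reverse=True)


# With access to the model, the scores would come from the pipeline, e.g.:
#   classifier = pipeline("text-classification", model=..., top_k=None)
#   scores = classifier("I'm so happy today!")
# Depending on the transformers version, the result for a single string
# may be wrapped in one extra list; unwrap it before ranking if so.
example_scores = [  # made-up values, for illustration only
    {"label": "sadness", "score": 0.01},
    {"label": "joy", "score": 0.90},
    {"label": "love", "score": 0.05},
    {"label": "anger", "score": 0.01},
    {"label": "fear", "score": 0.01},
    {"label": "surprise", "score": 0.02},
]

ranked = rank_emotions(example_scores)
print(ranked[0]["label"], ranked[1]["label"])  # joy love
```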


## Limitations
**Biases in Dataset**
The model was trained on the dair-ai/emotion dataset, which may not represent the full diversity of language use across demographics, regions, or cultures. As a result, it might underperform on texts containing:

- Slang or Informal Language
  For example, "I'm shook!" may not be classified as surprise.

- Non-Standard Grammar or Dialects
  Varieties such as African American Vernacular English (AAVE) or regional dialects may be misclassified more often.

- Domain-Specific Language
  The model may underperform on text from specialized domains (e.g., legal, medical, or technical writing) and on code-mixed or multilingual inputs.
  For example, "¡Estoy feliz!" (Spanish for "I'm happy!") might not be recognized as expressing joy without multilingual training.

**Limited Contextual Understanding**
The model processes each input as an isolated piece of text, without awareness of surrounding context. For instance:

- Sarcasm
  "Oh great, another rainy day!" expresses frustration, but the model may read the wording literally and assign a positive label such as joy. Frustration is also not one of the six labels, so anger is the closest class the model could return.

- Complex or Mixed Emotions
  Texts expressing multiple emotions (e.g., "I’m angry but also relieved") are reduced to a single label.

- Short Texts and Ambiguity
  Performance can degrade for very short texts (e.g., one or two words) because they carry too little context.
  For example:
  - "Wow!" might be classified as joy or surprise, since the cues that distinguish the two are absent from such a brief input.
  - Ambiguous inputs like "Okay" or "Fine" are hard to classify without additional context.
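One way to act on these limitations at inference time is to flag predictions the model is unsure about instead of trusting the top label. The sketch below is an assumption, not part of the released model: it takes the per-label scores (for example from a pipeline built with `top_k=None`) and marks a prediction as ambiguous when the top score is low or the runner-up is close. `flag_ambiguous` and both thresholds are hypothetical and untuned, and the scores are made up.

```python
def flag_ambiguous(scores, min_confidence=0.6, min_margin=0.2):
    """Return (top_label, is_ambiguous) for a list of {"label", "score"} dicts.

    A prediction is flagged when the top score is below min_confidence or
    when it leads the runner-up by less than min_margin. Both thresholds
    are illustrative values, not tuned ones.
    """
    ranked = sorted(scores, key=lambda item: item["score"], reverse=True)
    top, runner_up = ranked[0], ranked[1]
    low_confidence = top["score"] < min_confidence
    small_margin = top["score"] - runner_up["score"] < min_margin
    return top["label"], low_confidence or small_margin


# Made-up scores for a short input such as "Wow!" (not real model output).
short_text_scores = [
    {"label": "joy", "score": 0.48},
    {"label": "surprise", "score": 0.45},
    {"label": "love", "score": 0.03},
    {"label": "sadness", "score": 0.02},
    {"label": "anger", "score": 0.01},
    {"label": "fear", "score": 0.01},
]
# Made-up scores for an unambiguous input.
clear_scores = [
    {"label": "joy", "score": 0.97},
    {"label": "love", "score": 0.02},
    {"label": "surprise", "score": 0.01},
]

print(flag_ambiguous(short_text_scores))  # ('joy', True)
print(flag_ambiguous(clear_scores))       # ('joy', False)
```

Flagged inputs could be routed to human review or reported as "uncertain". This does not help with sarcasm, where the model may be confidently wrong.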


## Potential Improvements
- Data Augmentation
Including additional datasets or generating synthetic data can improve generalization.

- Longer Training
Training for more epochs could marginally increase accuracy, although diminishing returns are likely.

- Larger Models
Fine-tuning larger models like BERT or RoBERTa may yield better results for nuanced understanding.

- Bias Mitigation
Incorporating fairness-aware training methods or balanced datasets could reduce biases.
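
As one illustration of the "balanced datasets" idea, classes that are rare in the training data can be given more weight in the loss. The sketch below computes inverse-frequency class weights from a list of labels. It is an assumed technique, not something this model was trained with; `inverse_frequency_weights` is a hypothetical helper, and the toy labels do not reflect the real dataset distribution.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Toy labels, made up for illustration (not the real label distribution).
toy_labels = ["joy"] * 6 + ["sadness"] * 3 + ["surprise"] * 1
weights = inverse_frequency_weights(toy_labels)

for label, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{label}: {weight:.2f}")
```

The rarest class receives the largest weight. Weights of this kind could then be supplied to a weighted cross-entropy loss during fine-tuning.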