metadata

license: mit
datasets:
  - stereoset
  - crows_pairs
  - wu981526092/MGSD
language:
  - en
metrics:
  - f1
  - recall
  - precision
  - accuracy

Token-Level Stereotype Classifier

The Token-Level Stereotype Classifier is a transformer-based model that detects and classifies stereotypes in text at the token level. It recognizes both stereotypes and anti-stereotypes relating to gender, race, profession, and religion. The model can support applications that aim to reduce stereotypical language and promote fairness and inclusivity in natural language processing tasks.

Model Architecture

The model is built on the pretrained DistilBERT model and fine-tuned on the MGSD dataset (wu981526092/MGSD) for token-level classification.

Classes

The model assigns each token one of nine classes:

unrelated: The token does not indicate any stereotype.
stereotype_gender: The token indicates a gender stereotype.
anti-stereotype_gender: The token indicates a gender anti-stereotype.
stereotype_race: The token indicates a racial stereotype.
anti-stereotype_race: The token indicates a racial anti-stereotype.
stereotype_profession: The token indicates a professional stereotype.
anti-stereotype_profession: The token indicates a professional anti-stereotype.
stereotype_religion: The token indicates a religious stereotype.
anti-stereotype_religion: The token indicates a religious anti-stereotype.
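
Apart from unrelated, the class names listed here follow a polarity_group pattern, so downstream code can split a label into its two parts. The following is a minimal sketch of that idea. The LABELS list and the parse_label helper are hypothetical names introduced for illustration, and the list order is an assumption that need not match the label ids in the model's configuration.

```python
from typing import Optional, Tuple

# Class names as listed in this model card. The order here is illustrative
# and is NOT guaranteed to match the model's internal id-to-label mapping.
LABELS = [
    "unrelated",
    "stereotype_gender",
    "anti-stereotype_gender",
    "stereotype_race",
    "anti-stereotype_race",
    "stereotype_profession",
    "anti-stereotype_profession",
    "stereotype_religion",
    "anti-stereotype_religion",
]


def parse_label(label: str) -> Tuple[str, Optional[str]]:
    """Split a class name into (polarity, target group).

    "unrelated" has no target group, so its group is None.
    """
    if label == "unrelated":
        return ("unrelated", None)
    # Split on the first underscore: "anti-stereotype_race" -> polarity, group.
    polarity, _, group = label.partition("_")
    return (polarity, group)


for name in LABELS:
    print(name, "->", parse_label(name))
```

Splitting labels this way makes it straightforward to aggregate results per group (for example, all gender-related tokens) regardless of polarity.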

Usage

The model can be used through the Hugging Face Transformers pipeline API with the "ner" (token classification) task:

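from transformers import pipeline

# Load the fine-tuned model and tokenizer from the Hugging Face Hub.
nlp = pipeline("ner", model="wu981526092/Token-Level-Stereotype-Detector", tokenizer="wu981526092/Token-Level-Stereotype-Detector")
# Returns one prediction (label and score) per token.
result = nlp("Text containing potential stereotype...")

print(result)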
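
The pipeline returns a list of per-token dictionaries, so a common next step is to drop tokens labeled unrelated and keep only confident predictions. The sketch below shows one way to do that on a hand-written stand-in for the pipeline output, so it runs without downloading the model. The flag_tokens helper, the 0.5 threshold, and the example values are hypothetical. It also assumes the "entity" field carries the class names listed above, which should be checked against the model's actual output.

```python
from typing import Dict, List


def flag_tokens(predictions: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep predictions that are not "unrelated" and meet the score threshold."""
    return [
        {"word": p["word"], "label": p["entity"], "score": float(p["score"])}
        for p in predictions
        if p["entity"] != "unrelated" and p["score"] >= min_score
    ]


# Hand-written stand-in for the pipeline output; words and scores are
# placeholders, not real model predictions.
example = [
    {"entity": "unrelated", "score": 0.99, "index": 1, "word": "tok_a", "start": 0, "end": 5},
    {"entity": "stereotype_gender", "score": 0.91, "index": 2, "word": "tok_b", "start": 6, "end": 11},
    {"entity": "stereotype_gender", "score": 0.30, "index": 3, "word": "tok_c", "start": 12, "end": 17},
]

flagged = flag_tokens(example)
print(flagged)
```

With a real model, the same function could be applied directly to result from the usage snippet. The threshold is a tunable trade-off between precision and recall rather than a recommended value.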