Model description

This model is a fine-tuned version of bert-base-uncased for classifying toxic comments.

How to use

You can use the model with the following code.

from transformers import BertForSequenceClassification, BertTokenizer, TextClassificationPipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_path = "Kwaai/bert-toxic-classification"
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertForSequenceClassification.from_pretrained(model_path, num_labels=2)

# Wrap the model in a text-classification pipeline and score an example comment
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("You're a fucking nerd."))
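The pipeline returns a list of dictionaries, each containing a predicted label and its score; the exact label names depend on the id2label mapping in the model's configuration.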

Training data

The training data comes from a Kaggle toxic comment classification competition. We use 90% of the train.csv data to train the model.
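As a rough illustration, a 90/10 split like this can be reproduced with pandas and scikit-learn; the file path, the column names ("comment_text", "toxic"), and the random seed below are assumptions for the sketch, not details confirmed by this model card.

import pandas as pd
from sklearn.model_selection import train_test_split

# Load the competition's train.csv (path and column names are assumptions)
df = pd.read_csv("train.csv")

# 90/10 split, stratified on the binary toxicity label
train_df, holdout_df = train_test_split(
    df, test_size=0.1, stratify=df["toxic"], random_state=42
)

print(len(train_df), len(holdout_df))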

Evaluation results

The model achieves an AUC of 0.95 on a held-out test set of 1,500 rows.
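A minimal sketch of how such an AUC could be computed with scikit-learn, reusing the pipeline from "How to use" and the holdout_df from the split sketch above; the assumption that "LABEL_1" is the toxic class is not confirmed by this model card.

from sklearn.metrics import roc_auc_score

# Score the held-out comments; top_k=None returns scores for both classes
outputs = pipeline(holdout_df["comment_text"].tolist(), top_k=None, truncation=True)

# Extract the probability assigned to the (assumed) toxic class
toxic_scores = [
    next(s["score"] for s in scores if s["label"] == "LABEL_1") for scores in outputs
]

print(roc_auc_score(holdout_df["toxic"], toxic_scores))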
