oranne55/qualifier-model3-finetune-pretrained-transformer

Text Classification

prompt-injection

Jailbreak Classifier

Classifies prompts as either jailbreak attempts or benign. This is a fine-tuned checkpoint of bert-base-uncased, trained on the jailbreak-classification dataset.
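A minimal inference sketch, assuming the checkpoint loads through the standard `transformers` text-classification pipeline under the repository id shown on this page. The label names the model returns are not documented here, so the helper only picks the highest-scoring entry and makes no assumption about what the labels are called. `top_label` and `classify` are illustrative names, not part of the model card.

```python
from typing import Dict, List

MODEL_ID = "oranne55/qualifier-model3-finetune-pretrained-transformer"


def top_label(scores: List[Dict]) -> Dict:
    """Return the entry with the highest score from a list of {"label", "score"} dicts."""
    return max(scores, key=lambda s: s["score"])


def classify(prompt: str) -> Dict:
    """Score one prompt with the fine-tuned checkpoint (downloads the model on first use)."""
    # Imported lazily so top_label() stays usable without transformers installed.
    from transformers import pipeline

    clf = pipeline("text-classification", model=MODEL_ID)
    # top_k=None requests a score for every label; passing a one-element list
    # yields one list of label scores per input, so we take the first.
    scores = clf([prompt], top_k=None)[0]
    return top_label(scores)
```

Calling `classify("some prompt text")` should return a single dict with a `label` and a `score`. Which label corresponds to "jailbreak" depends on the checkpoint's configuration and should be checked before relying on the output.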

Training Details

Training Data

Fine-tuned on the jailbreak-classification dataset.

Training Procedure

Training Hyperparameters

First fine-tuning hyperparameters (on train (0.8) and val (0.2) splits)

learning_rate = 5e-5
train_batch_size = 8
eval_batch_size = 8
lr_scheduler_type = linear
num_train_epochs = 5.0
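To illustrate what `lr_scheduler_type = linear` means for the learning rate above, here is a small sketch of linear decay to zero. It assumes no warmup steps (the card does not list any) and uses a hypothetical total of 1000 optimizer steps. The real step count depends on the dataset size, which is not stated here. `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length; the actual number of steps is not given in the card.
TOTAL_STEPS = 1000

for step in (0, 250, 500, 750, 1000):
    # Starts at the configured learning rate and falls linearly to zero.
    print(step, linear_lr(step, TOTAL_STEPS, base_lr=5e-5))
```

Under these assumptions the rate starts at the configured value, reaches half of it at the midpoint of training, and is zero at the final step.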

Second fine-tuning hyperparameters (on train and test)

learning_rate = 1e-5
train_batch_size = 8
eval_batch_size = 8
lr_scheduler_type = linear
num_train_epochs = 3.0
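The two hyperparameter lists above could be expressed as keyword arguments for `transformers.TrainingArguments`, roughly as sketched below. This is an assumption about how the runs were configured, not the author's script: it treats `train_batch_size` and `eval_batch_size` as per-device values on a single device, and `STAGE_ONE`, `STAGE_TWO`, `make_training_args`, and the output directory are hypothetical names.

```python
from typing import Any, Dict

# Values copied from the train/val run listed above.
STAGE_ONE: Dict[str, Any] = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5.0,
}

# Values copied from the train-and-test run listed above:
# a lower learning rate and fewer epochs, everything else unchanged.
STAGE_TWO: Dict[str, Any] = {
    **STAGE_ONE,
    "learning_rate": 1e-5,
    "num_train_epochs": 3.0,
}


def make_training_args(stage: Dict[str, Any], output_dir: str):
    """Build a TrainingArguments object for one stage (requires transformers)."""
    # Imported lazily so the plain dicts can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **stage)
```

For example, `make_training_args(STAGE_TWO, "out/stage2")` would produce the arguments for the later run, which would then be passed to a `Trainer` together with the model and datasets.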

Safetensors

Model size: 109M params
Tensor type: F32