meta-llama/Prompt-Guard-86M · Open Sourcing Dataset?

Aug 2

I wanted to compare this performance with https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2 from LLM Guard or my own naive prompt engineered solution and the dataset would be very helpful for this use case.

cynikolai

Meta Llama org Aug 2

Hm, we can't expose the training dataset at this time, sorry.

However, I might recommend using an adversarial jailbreak dataset that wasn't used in training either model as an eval (you can usually tell if a model has been trained on a given dataset because it will have 99%+ accuracy, even on the test set). https://huggingface.co/datasets/synapsecai/synthetic-prompt-injections comes to mind as one that I don't believe was either used by this model or https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2.

johnnydevriese

Aug 2

Thank you for your helpful response. I appreciate the suggestion and will definitely look into that dataset. It sounds like a useful resource for evaluation purposes.

johnnydevriese changed discussion status to closed Aug 2