Daniel De Leon

daniel-de-leon

Recent Activity

posted an update about 1 month ago

As the rapid adoption of chatbots and Q&A models continues, so do concerns about their reliability and safety. In response, many state-of-the-art models are being tuned to act as safety guardrails that protect against malicious usage and avoid undesired, harmful output. I published a Hugging Face blog introducing a simple, proof-of-concept, RoBERTa-based LLM that my team and I fine-tuned to detect toxic prompt inputs into chat-style LLMs. The article explores some of the tradeoffs of fine-tuning larger decoder vs. smaller encoder models and asks whether "simpler is better" in the arena of toxic prompt detection.

🔗 to blog: https://huggingface.co/blog/daniel-de-leon/toxic-prompt-roberta
🔗 to model: https://huggingface.co/Intel/toxic-prompt-roberta
🔗 to OPEA microservice: https://github.com/opea-project/GenAIComps/tree/main/comps/guardrails/toxicity_detection

A huge thank you to my colleagues who helped contribute: @qgao007, @mitalipo, @ashahba and Fahim Mohammad
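For readers who want a feel for how a prompt classifier can sit in front of a chat model, here is a minimal sketch of the gating idea. It is not the code from the blog or the OPEA microservice. The function names, the refusal message, the 0.5 threshold, and the "toxic" label string are all assumptions made for illustration, and the keyword-based stub classifier only stands in for a real model so the example runs offline.

```python
from typing import Callable, Dict, List

# A classifier takes a prompt and returns results shaped like the output of a
# Hugging Face text-classification pipeline: [{"label": ..., "score": ...}].
Classifier = Callable[[str], List[Dict[str, object]]]

REFUSAL = "Sorry, I can't help with that request."  # hypothetical message


def is_toxic(prompt: str, classifier: Classifier,
             toxic_label: str = "toxic", threshold: float = 0.5) -> bool:
    """Return True when the top prediction is the toxic label above threshold.

    The label name and threshold are assumptions; check the model card of
    whichever classifier you use for its actual label names.
    """
    top = classifier(prompt)[0]
    return top["label"] == toxic_label and float(top["score"]) >= threshold


def guarded_chat(prompt: str, classifier: Classifier,
                 llm: Callable[[str], str]) -> str:
    """Forward the prompt to the chat model only if the guardrail passes it."""
    if is_toxic(prompt, classifier):
        return REFUSAL
    return llm(prompt)


def stub_classifier(text: str) -> List[Dict[str, object]]:
    """Keyword-based stand-in for a real model, used only for this demo."""
    flagged = any(word in text.lower() for word in ("idiot", "hate"))
    return [{"label": "toxic" if flagged else "not_toxic", "score": 0.9}]


def echo_llm(prompt: str) -> str:
    """Stand-in for a chat model."""
    return f"echo: {prompt}"


# To try a real classifier instead of the stub (requires the transformers
# package and a model download), something like this should work:
#   from transformers import pipeline
#   classifier = pipeline("text-classification",
#                         model="Intel/toxic-prompt-roberta")
if __name__ == "__main__":
    print(guarded_chat("hello there", stub_classifier, echo_llm))
    print(guarded_chat("I hate you", stub_classifier, echo_llm))
```

Because the guardrail is just a callable, a small encoder classifier and a larger decoder-based judge can be swapped behind the same interface, which makes it easy to compare the two approaches the article discusses.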

upvoted an article about 1 month ago

¡Lanzamiento de la Comunidad Latinoamericana de NLP en Hugging Face! 🌟

published an article about 1 month ago

Occam’s Sheath: A Simpler Approach to AI Safety Guardrails

Articles

Occam’s Sheath: A Simpler Approach to AI Safety Guardrails

Oct 18

models 1

daniel-de-leon/output

Updated Jul 27, 2023

datasets

None public yet

spaces 5

Streamlit Docker Template

Streamlit Docker Template

Test Docker

St Shap Text Classification

Glue Suite V2
