Report for cardiffnlp/twitter-roberta-base-offensive
#81, opened by inoki-giskard
Hi Team,
This is a report from Giskard Bot Scan 🐢.
We have identified 13 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset tweet_eval (subset offensive, split validation).
👉Performance issues (8)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | text contains "maga" | Recall = 0.400 | — | -43.68% than global |
🔍✨Examples
For records in the dataset where `text` contains "maga", the Recall is 43.68% lower than the global Recall.

Index | text | label | Predicted label |
---|---|---|---|
143 | The #Child-#Rape Assembly Line via @user We cannot #trust a #Catholic #priest. We cannot trust a #Jewish #rabbi - Why do we let these #people have #secrets? #GreatAwakening #MAGA @user | offensive | non-offensive (p = 0.72) |
180 | 😀😆😂 @user rocks #MAGA 👍 | offensive | non-offensive (p = 0.90) |
205 | A 5th columnist always imagines himself as a patriot. That’s how traitors rationalize perfidy. . . Clearly the deep state, the steady state, the swamp, or whatever you wish to call it sees itself as above such nuisances & trivialities as elections."" #MAGA"" | offensive | non-offensive (p = 0.51) |
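
The Deviation column appears to be the relative difference between the slice metric and the global metric. The following is a minimal sketch under that assumption, using made-up records rather than tweet_eval data; `recall`, `relative_deviation`, and the toy records are illustrative names, not part of Giskard's API.

```python
def recall(labels, preds, positive="offensive"):
    """Fraction of truly positive records that were predicted positive."""
    positives = [(y, p) for y, p in zip(labels, preds) if y == positive]
    if not positives:
        raise ValueError("no positive records in this slice")
    hits = sum(1 for y, p in positives if p == positive)
    return hits / len(positives)


def relative_deviation(slice_metric, global_metric):
    """Relative change of the slice metric with respect to the global metric."""
    return (slice_metric - global_metric) / global_metric


# Toy records (text, true label, predicted label); hypothetical data.
records = [
    ("you are a clown #maga", "offensive", "non-offensive"),
    ("#maga rally tonight", "offensive", "offensive"),
    ("what an idiot", "offensive", "offensive"),
    ("shut up already", "offensive", "offensive"),
    ("nice weather today", "non-offensive", "non-offensive"),
]

# Data slice: records whose text contains "maga" (case-insensitive).
in_slice = [r for r in records if "maga" in r[0].lower()]

global_recall = recall([r[1] for r in records], [r[2] for r in records])
slice_recall = recall([r[1] for r in in_slice], [r[2] for r in in_slice])
deviation = relative_deviation(slice_recall, global_recall)

print(f"global={global_recall:.3f} slice={slice_recall:.3f} deviation={deviation:+.2%}")
```

Under this reading, a slice Recall of 0.400 with a deviation of -43.68% would correspond to a global Recall of roughly 0.71 (0.400 / (1 - 0.4368)).
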
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | text contains "antifa" | Precision = 0.529 | — | -25.14% than global |
🔍✨Examples
For records in the dataset where `text` contains "antifa", the Precision is 25.14% lower than the global Precision.

Index | text | label | Predicted label |
---|---|---|---|
63 | @user condemn Antifa for attacking peaceful protesters..there was once a time that Joe Biden was viewed as someone who could be a bridge builder but you succumbed and caved to the evil ways of the Demon-crats and went over to the Dark Side..what would your son think. | non-offensive | offensive (p = 0.55) |
117 | @user @user I guess I don't know when I'm talking to a proud boy and when I'm talkin to an antifa. I am pro antifa. Which are you? | offensive | non-offensive (p = 0.78) |
151 | @user @user @user That's why what she did was an ANTIFA style attack. Abusive method meant to bully opposition into silence for control | offensive | non-offensive (p = 0.70) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | text contains "control" | Recall = 0.537 | — | -24.39% than global |
🔍✨Examples
For records in the dataset where `text` contains "control", the Recall is 24.39% lower than the global Recall.

Index | text | label | Predicted label |
---|---|---|---|
8 | @user @user @user @user You've got nerve pointing the finger at other states with the murder rate you have. How's that gun control working for you? Own it | non-offensive | offensive (p = 0.71) |
14 | @user @user Will @user or @user ask for some sort of gun control or once again do NOTHING? They seem to be really good at doing NOTHING! | offensive | non-offensive (p = 0.58) |
39 | @user This is why we need gun control | offensive | non-offensive (p = 0.66) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | text contains "gun" | Recall = 0.558 | — | -21.48% than global |
🔍✨Examples
For records in the dataset where `text` contains "gun", the Recall is 21.48% lower than the global Recall.

Index | text | label | Predicted label |
---|---|---|---|
8 | @user @user @user @user You've got nerve pointing the finger at other states with the murder rate you have. How's that gun control working for you? Own it | non-offensive | offensive (p = 0.71) |
14 | @user @user Will @user or @user ask for some sort of gun control or once again do NOTHING? They seem to be really good at doing NOTHING! | offensive | non-offensive (p = 0.58) |
39 | @user This is why we need gun control | offensive | non-offensive (p = 0.66) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | text contains "people" | Recall = 0.622 | — | -12.48% than global |
🔍✨Examples
For records in the dataset where `text` contains "people", the Recall is 12.48% lower than the global Recall.

Index | text | label | Predicted label |
---|---|---|---|
143 | The #Child-#Rape Assembly Line via @user We cannot #trust a #Catholic #priest. We cannot trust a #Jewish #rabbi - Why do we let these #people have #secrets? #GreatAwakening #MAGA @user | offensive | non-offensive (p = 0.72) |
156 | @user @user Irony alert. Didn’t the @user under Thatcher sell of most of the council house stock and now they are trying to replace it. What a joke these people are | non-offensive | offensive (p = 0.53) |
196 | @user @user @user @user Or go to a baseball game with a terrorist? Who was that? Liberals seem to be the most perfect people but suck at everything. | non-offensive | offensive (p = 0.82) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | text contains "liberals" | Precision = 0.623 | — | -11.91% than global |
🔍✨Examples
For records in the dataset where `text` contains "liberals", the Precision is 11.91% lower than the global Precision.

Index | text | label | Predicted label |
---|---|---|---|
41 | @user We need to stop expecting liberals to act reasonably...they murder babies...they are completely unhinged! So long as the crazies keep voting for the crazy party...you will get crazy. TDS is real!!! | non-offensive | offensive (p = 0.83) |
101 | @user @user I am upset. You know why because I remember following you based on the content of your post. I followed you around the 2016 election. You and many others lkke me were fighting for Hillary against real sexism and stupidity. All I asked was why do liberals attack other liberals | non-offensive | offensive (p = 0.58) |
135 | @user @user Liberals should just be banished from the United States & dropped in the middle east. | non-offensive | offensive (p = 0.56) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | text contains "conservatives" | Recall = 0.667 | — | -6.13% than global |
🔍✨Examples
For records in the dataset where `text` contains "conservatives", the Recall is 6.13% lower than the global Recall.

Index | text | label | Predicted label |
---|---|---|---|
238 | #Muslims and #Islam finally got their #WhiteSupremacist #Terrorists they have wanted for so long they can harp on about and say were worse than them .. Two 15 year old boys from #Ramsgate! Jesus 🤔 #MAGA #MEGA #MCGA #MBGA #Conservatives #Patriots | non-offensive | offensive (p = 0.65) |
290 | @user @user @user @user Red roses swarm. Conservatives pounce. We need our own verb. | offensive | non-offensive (p = 0.92) |
359 | @user conservatives believe all persons of color are a priori guilty of crimes therefore when PoC are denied their rights in extrajudicial killings no conservative EVER questions the killing. | offensive | non-offensive (p = 0.51) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | text_length(text) >= 107.500 | Balanced Accuracy = 0.736 | — | -5.31% than global |
🔍✨Examples
For records in the dataset where `text_length(text)` >= 107.500, the Balanced Accuracy is 5.31% lower than the global Balanced Accuracy.

Index | text | text_length(text) | label | Predicted label |
---|---|---|---|---|
8 | @user @user @user @user You've got nerve pointing the finger at other states with the murder rate you have. How's that gun control working for you? Own it | 154 | non-offensive | offensive (p = 0.71) |
14 | @user @user Will @user or @user ask for some sort of gun control or once again do NOTHING? They seem to be really good at doing NOTHING! | 136 | offensive | non-offensive (p = 0.58) |
34 | @user Contribute to their Maximum Capacity"" - Translation: ""Vote for guys with a criminal record like me who occasionally wear dresses and support those who disrespect the sacrifices represented by the flag"""" | 212 | non-offensive | offensive (p = 0.50) |
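
Balanced accuracy is the mean of the per-class recalls, so it is not dominated by the majority class. The sketch below computes it globally and on a length-based slice. It assumes that `text_length(text)` is a character count (consistent with the values in the table, but not stated in the report) and uses hypothetical toy records.

```python
def balanced_accuracy(labels, preds):
    """Mean of per-class recall over the classes present in `labels`."""
    recalls = []
    for cls in sorted(set(labels)):
        total = sum(1 for y in labels if y == cls)
        correct = sum(1 for y, p in zip(labels, preds) if y == cls and p == cls)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Toy records (text, true label, predicted label); hypothetical data.
records = [
    ("short insult", "offensive", "offensive"),
    ("short and friendly", "non-offensive", "non-offensive"),
    ("long rant " * 12, "offensive", "non-offensive"),  # 120 characters
    ("long rant " * 12, "offensive", "offensive"),
    ("long chat " * 12, "non-offensive", "non-offensive"),
]

THRESHOLD = 107.5  # slice boundary taken from the report

# Assumption: text_length(text) is the number of characters in the text.
long_records = [r for r in records if len(r[0]) >= THRESHOLD]

global_ba = balanced_accuracy([r[1] for r in records], [r[2] for r in records])
slice_ba = balanced_accuracy([r[1] for r in long_records], [r[2] for r in long_records])

print(f"global={global_ba:.3f} slice={slice_ba:.3f}")
```
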
👉Ethical issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Ethical | major 🔴 | — | Fail rate = 0.107 | Switch Religion | 3/28 tested samples (10.71%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 10.71% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Switch Religion(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
115 | @user You bet it has. England is doomed. Liberals everywhere are destroying our lives by collaborating with Muslims. The Royal Family bloodline mixes with Muslims going way back. Do you think this is an accident? Read this: | @user You bet it has. England is doomed. Liberals everywhere are destroying our lives by collaborating with hindus. The Royal Family bloodline mixes with hindus going way back. Do you think this is an accident? Read this: | offensive (p = 0.53) | non-offensive (p = 0.52) |
710 | @user God help us. the woman cant even get coherent sentence out of her mouth. | @user allah help us. the woman cant even get coherent sentence out of her mouth. | offensive (p = 0.54) | non-offensive (p = 0.56) |
1052 | #SJW crap and/or #Paedophilia .. #Hollywood #TV #Movies #Gaming #Politics #Comics #Police #PublicAuthorities like #Councils .. #DWP #DWPCrimes #CatholicChurch or #Church #Vatican #Islam and #Muslims used as a weapon .. Missed anything?! #NHS maybe? #MBGA #Conservatives | #SJW crap and/or #Paedophilia .. #Hollywood #TV #Movies #Gaming #Politics #Comics #Police #PublicAuthorities like #Councils .. #DWP #DWPCrimes #CatholicChurch or #mosque #kumbh mela #hinduism and #buddhists used as a weapon .. Missed anything?! #NHS maybe? #MBGA #Conservatives | non-offensive (p = 0.53) | offensive (p = 0.52) |
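
The fail rate above is the share of perturbed samples whose predicted label changed (3 of 28, or 10.71%). A minimal sketch of such an invariance check follows. The substitution map, `switch_terms`, `fail_rate`, and the deliberately biased `toy_predict` stand-in are all hypothetical; the scan's actual word list and implementation are not shown in the report.

```python
import re

# Hypothetical substitution map; not the word list used by the scan.
SWAPS = {"muslims": "hindus", "god": "allah", "church": "mosque"}


def switch_terms(text, swaps=SWAPS):
    """Replace whole words found in `swaps`, matching case-insensitively."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, swaps)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: swaps[m.group(0).lower()], text)


def fail_rate(texts, predict, transform):
    """Count perturbed texts whose predicted label changes.

    Texts that the transformation leaves unchanged are not counted as tested.
    Returns (changed, tested).
    """
    tested = changed = 0
    for text in texts:
        perturbed = transform(text)
        if perturbed == text:
            continue
        tested += 1
        if predict(text) != predict(perturbed):
            changed += 1
    return changed, tested


def toy_predict(text):
    """Stand-in classifier, deliberately keyed on one term to trigger a failure."""
    return "offensive" if "church" in text.lower() else "non-offensive"


texts = [
    "the church bells rang",
    "God help us",
    "no religious terms here",
    "Muslims and neighbours met",
]

changed, tested = fail_rate(texts, toy_predict, switch_terms)
print(f"{changed}/{tested} tested samples changed prediction ({changed / tested:.2%})")
```
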
👉Underconfidence issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Underconfidence | medium 🟡 | avg_word_length(text) >= 4.156 | Underconfidence rate = 0.024 | — | +17.22% than global |
🔍✨Examples
For records in your dataset where `avg_word_length(text)` >= 4.156, we found a significantly higher number of underconfident predictions (24 samples, corresponding to 2.39% of the predictions in the data slice).

Index | text | avg_word_length(text) | label | Predicted label |
---|---|---|---|---|
850 | @user @user @user . #Hypocrisy to see so called conservatives call out supposed sexual deviancy when just about every sexual political scandal in recent memory involves Republicans and it's really #homophobia #RoyMoore #Kavanaugh #JimJordan #MarkFoley #BobPackwood #ClarenceThomas #DonaldTrump | 6.73684 | offensive | offensive (p = 0.50); non-offensive (p = 0.50) |
622 | @user @user @user @user @user @user That’s right...he lies all day long and he is still terrible at it...anyone else would have mastered it by now...he’s definitely got 10000hr | 5.10345 | non-offensive | non-offensive (p = 0.50); offensive (p = 0.50) |
262 | @user you never were a slave. Spartacus was a slave and a heroic figure. You are neither. | 4.29412 | offensive | offensive (p = 0.50); non-offensive (p = 0.50) |
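
In the examples above, the two class probabilities are nearly tied. One way to flag such predictions is to compare the runner-up probability with the top probability. The sketch below assumes that definition and a ratio threshold of 0.9; both are assumptions for illustration, since the report does not state how the scan defines underconfidence.

```python
def is_underconfident(probs, ratio_threshold=0.9):
    """True when the runner-up probability is close to the top probability.

    `ratio_threshold` is a hypothetical cut-off, not a value from the report.
    """
    top, second = sorted(probs, reverse=True)[:2]
    return second / top >= ratio_threshold


def underconfidence_rate(prob_rows, ratio_threshold=0.9):
    """Share of predictions flagged as underconfident."""
    flagged = sum(is_underconfident(p, ratio_threshold) for p in prob_rows)
    return flagged / len(prob_rows)


# Toy (p_offensive, p_non_offensive) pairs; hypothetical data.
rows = [(0.50, 0.50), (0.51, 0.49), (0.90, 0.10), (0.72, 0.28)]

rate = underconfidence_rate(rows)
print(f"underconfidence rate = {rate:.3f}")
```
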
👉Robustness issues (3)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.123 | Transform to uppercase | 123/1000 tested samples (12.3%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 12.3% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
730 | @user GOP, Conservatives, Evangelicals, Traditionalists"" Catholics are all hypocrites."" | @USER GOP, CONSERVATIVES, EVANGELICALS, TRADITIONALISTS"" CATHOLICS ARE ALL HYPOCRITES."" | offensive (p = 0.70) | non-offensive (p = 0.75) |
1215 | @user @user Look at us we are just so full of ourselves an so narcissistic we can't realize our time is just about up . | @USER @USER LOOK AT US WE ARE JUST SO FULL OF OURSELVES AN SO NARCISSISTIC WE CAN'T REALIZE OUR TIME IS JUST ABOUT UP . | offensive (p = 0.67) | non-offensive (p = 0.63) |
783 | @user Oh cool. Alana found the one person dumber than she is and retweeted him. Nice move. | @USER OH COOL. ALANA FOUND THE ONE PERSON DUMBER THAN SHE IS AND RETWEETED HIM. NICE MOVE. | offensive (p = 0.65) | non-offensive (p = 0.59) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.086 | Add typos | 86/1000 tested samples (8.6%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 8.6% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
312 | @user @user @user I'll never understand the anti-ANTIFA sentiment. There's only alternative to being opposed to fascism. Fighting against fascists isn't akin to the racist violence of the KKK or to making death threats against a woman who's come forward regarding her abuse. | @user @user @user I'll nevet understand the angi-ANTIFA sentikent. There's only alternative to bejing opposed to fascim. Fighting against fascists isn't akin to the racist violence lf the KKK or to making dath threats against a woman who's come foerward regarding her abuse. | offensive (p = 0.52) | non-offensive (p = 0.51) |
991 | @user The liberals have sunk to an all new low....how tragic!! | @user The liberalsh ave sunk to an all new low....how tratic!@ | offensive (p = 0.53) | non-offensive (p = 0.56) |
562 | @user @user @user Never going to happen. Not in a million years. Not unless Soros hires a bunch of Antifa punks to pretend to be vets. | @user @user @user Never going to happen. Not in a million years. Not unless Soros hires a bunch of Antifa punks to pretend to be vets. | offensive (p = 0.53) | non-offensive (p = 0.55) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.070 | Transform to title case | 70/1000 tested samples (7.0%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 7.0% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
1215 | @user @user Look at us we are just so full of ourselves an so narcissistic we can't realize our time is just about up . | @User @User Look At Us We Are Just So Full Of Ourselves An So Narcissistic We Can'T Realize Our Time Is Just About Up . | offensive (p = 0.67) | non-offensive (p = 0.52) |
476 | @user @user Whatever happened to innocent until proven guilty... liberals just want Kavanaugh to be guilty because Trump picked him. Get over yourselves | @User @User Whatever Happened To Innocent Until Proven Guilty... Liberals Just Want Kavanaugh To Be Guilty Because Trump Picked Him. Get Over Yourselves | offensive (p = 0.51) | non-offensive (p = 0.56) |
1285 | @user @user @user @user @user @user @user @user @user You mean like all the Liberals screaming at a SC nomination hearing? That kind of clown show? So tell me since when do accusers get to make demands? This isn't a university misandry board of injustice. | @User @User @User @User @User @User @User @User @User You Mean Like All The Liberals Screaming At A Sc Nomination Hearing? That Kind Of Clown Show? So Tell Me Since When Do Accusers Get To Make Demands? This Isn'T A University Misandry Board Of Injustice. | non-offensive (p = 0.55) | offensive (p = 0.62) |
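
The case-based robustness checks can be approximated locally by applying a transformation to each text and counting label changes. This is a rough sketch, not the scan's implementation: `changed_predictions` is an illustrative helper, and `toy_predict` is a hypothetical case-sensitive stand-in for the real model.

```python
# Case transformations similar to those named in the report.
TRANSFORMS = {
    "uppercase": str.upper,
    "title case": str.title,
}


def toy_predict(text):
    """Case-sensitive stand-in classifier; hypothetical, for illustration only."""
    return "offensive" if "dumb" in text else "non-offensive"


def changed_predictions(texts, predict, transform):
    """Return the texts whose predicted label differs after `transform`."""
    return [t for t in texts if predict(t) != predict(transform(t))]


texts = ["that was a dumb move", "have a nice day", "so dumb", "hello there"]

results = {}
for name, transform in TRANSFORMS.items():
    changed = changed_predictions(texts, toy_predict, transform)
    results[name] = len(changed)
    print(f"{name}: {len(changed)}/{len(texts)} changed ({len(changed) / len(texts):.1%})")
```

Python's `str.title()` also capitalizes the letter after an apostrophe ("can't" becomes "Can'T"), which matches the title-case examples above, although the report does not say how the scan implements this transformation.
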
Check out the Giskard Space and test your model.
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.