ibm/BoolQ_robustness
Datasets from "A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios" (https://arxiv.org/abs/2408.01963)