CommonSenseQA 2
A harder, bias-reduced multiple-choice benchmark that probes everyday commonsense beyond lexical shortcuts.
About CommonSenseQA 2
CommonSenseQA 2 extends classic commonsense QA with more diverse relations and more carefully constructed distractors, making superficial pattern matching less effective. Questions target practical knowledge about objects, social situations, and cause-and-effect, emphasizing the difference between plausible and correct answers.
The benchmark is designed to better reflect real-world ambiguity and to reduce annotation artifacts that can inflate scores. As a result, high performance is more likely to reflect genuine conceptual understanding than exploitation of dataset biases.
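To make the idea of a lexical shortcut concrete, the sketch below scores a word-overlap baseline that simply picks the answer choice sharing the most words with the question. It is an illustration only: the item layout (question, choices, answer index), the function names, and both example items are assumptions invented here, not the benchmark's actual schema or data. On a bias-reduced benchmark, a baseline like this would be expected to score near chance.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_baseline(question, choices):
    """Pick the index of the choice sharing the most tokens with the question.

    Ties are broken in favor of the earliest choice.
    """
    q = tokens(question)
    scores = [len(q & tokens(choice)) for choice in choices]
    return scores.index(max(scores))


def accuracy(items, predict):
    """Fraction of items for which predict(question, choices) matches the answer index."""
    correct = sum(
        predict(item["question"], item["choices"]) == item["answer"]
        for item in items
    )
    return correct / len(items)


# Hypothetical items written for this sketch; they are NOT drawn from the benchmark.
items = [
    {
        # The correct answer repeats a question word, so the shortcut succeeds.
        "question": "Where would you put a wet umbrella after coming inside?",
        "choices": ["umbrella stand", "bookshelf", "freezer"],
        "answer": 0,
    },
    {
        # The distractors repeat question words and the correct answer does not,
        # so the shortcut fails.
        "question": "What happens to ice left in the sun?",
        "choices": ["the ice grows", "it melts", "the sun cools"],
        "answer": 1,
    },
]

if __name__ == "__main__":
    print(f"word-overlap baseline accuracy: {accuracy(items, overlap_baseline):.2f}")
```

The second item shows the kind of construction the description refers to: when distractors echo the question's wording and the correct answer does not, surface matching stops helping and the baseline picks a wrong answer.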