PIQA
A physical commonsense benchmark where models choose the more feasible solution to everyday problems.
About PIQA
PIQA (Physical Interaction: Question Answering) presents short scenarios, each paired with two candidate solutions that differ subtly in practicality. Choosing the correct answer depends on intuitive physics, object affordances, and procedural knowledge (e.g., how to use household tools or perform simple tasks safely).
Because the two options are intentionally close in plausibility, PIQA is designed to resist surface heuristics and to serve as a focused probe of grounded commonsense.
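To make the two-choice format concrete, the sketch below shows one way an item could be represented and scored. It is an illustration under stated assumptions, not an official evaluation harness: the names TwoChoiceItem, pick, accuracy, and ITEMS are hypothetical, the two example items are invented for this sketch rather than taken from the dataset, and the scoring function stands in for whatever plausibility score a model would assign to a (goal, solution) pair.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical scorer type: maps (goal, solution) to a plausibility score.
# In practice this might be a language model's log-likelihood.
Scorer = Callable[[str, str], float]


@dataclass(frozen=True)
class TwoChoiceItem:
    """One two-choice item (field names are assumptions for this sketch)."""
    goal: str
    solutions: Tuple[str, str]
    label: int  # index (0 or 1) of the more feasible solution


def pick(item: TwoChoiceItem, score: Scorer) -> int:
    """Return the index of the higher-scoring solution (ties go to index 0)."""
    first, second = (score(item.goal, s) for s in item.solutions)
    return 0 if first >= second else 1


def accuracy(items: Sequence[TwoChoiceItem], score: Scorer) -> float:
    """Fraction of items where the picked solution matches the label."""
    if not items:
        raise ValueError("accuracy is undefined for an empty item list")
    correct = sum(pick(item, score) == item.label for item in items)
    return correct / len(items)


# Invented examples for illustration only; not drawn from the PIQA data.
ITEMS = [
    TwoChoiceItem(
        goal="Keep a door from slamming shut.",
        solutions=(
            "Wedge a rubber doorstop under the door.",
            "Wedge a wet paper towel on top of the door.",
        ),
        label=0,
    ),
    TwoChoiceItem(
        goal="Dry wet shoes overnight.",
        solutions=(
            "Stuff them with ice cubes.",
            "Stuff them with crumpled newspaper.",
        ),
        label=1,
    ),
]
```

With two options per item, a scorer that cannot tell the solutions apart lands at chance level on a balanced set, so accuracy well above 0.5 is the signal of interest. Calling accuracy(ITEMS, score) with any function of the assumed Scorer shape returns a value between 0.0 and 1.0.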