PIQA (Physical Interaction: Question Answering) presents short scenarios, each with two candidate solutions that differ subtly in practicality. Choosing the correct answer depends on intuitive physics, object affordances, and procedural knowledge (e.g., how to use household tools or perform simple tasks safely).
Because the two options are intentionally close in plausibility, PIQA is designed to resist surface heuristics and provides a focused probe of grounded commonsense.
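The two-choice format described above can be scored with plain accuracy: a model assigns a score to each candidate solution and picks the higher one. The following is a minimal sketch, not an official evaluation harness. The field names (`goal`, `sol1`, `sol2`, `label`), the `score_fn` callable, and the two example items are assumptions made for illustration; the examples are invented and are not taken from the benchmark.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class TwoChoiceItem:
    """One two-choice item. Field names are assumed for illustration."""
    goal: str   # short scenario or task description
    sol1: str   # first candidate solution
    sol2: str   # second candidate solution
    label: int  # index of the more feasible solution: 0 -> sol1, 1 -> sol2


# Hypothetical scorer: maps (goal, solution) to a number, higher = more feasible.
# In practice this could be a language model's log-likelihood of the solution.
ScoreFn = Callable[[str, str], float]


def predict(item: TwoChoiceItem, score_fn: ScoreFn) -> int:
    """Return the index (0 or 1) of the higher-scoring solution; ties go to 0."""
    s1 = score_fn(item.goal, item.sol1)
    s2 = score_fn(item.goal, item.sol2)
    return 0 if s1 >= s2 else 1


def accuracy(items: Sequence[TwoChoiceItem], score_fn: ScoreFn) -> float:
    """Fraction of items where the predicted index matches the label."""
    if not items:
        raise ValueError("no items to evaluate")
    correct = sum(predict(item, score_fn) == item.label for item in items)
    return correct / len(items)


# Invented items for illustration only (not from the benchmark).
ITEMS = [
    TwoChoiceItem(
        goal="Keep a door from slamming shut.",
        sol1="Wedge a rubber doorstop under the door.",
        sol2="Wedge a sheet of paper under the door.",
        label=0,
    ),
    TwoChoiceItem(
        goal="Dry wet shoes overnight.",
        sol1="Put the shoes in the freezer.",
        sol2="Stuff the shoes with newspaper.",
        label=1,
    ),
]


def always_first(goal: str, solution: str) -> float:
    """Trivial baseline: scores every solution equally, so index 0 always wins."""
    return 0.0


if __name__ == "__main__":
    print(f"always-first baseline accuracy: {accuracy(ITEMS, always_first):.2f}")
```

With balanced labels, a scorer that ignores its input lands at chance (50%), which gives a reference point for judging whether a real scorer has picked up anything beyond surface cues.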
A physical commonsense benchmark where models choose the more feasible solution to everyday problems.