MMLU
A multi-task, exam-style benchmark covering 57 academic and professional subjects to test breadth of knowledge and problem solving.
About MMLU
MMLU consists of four-choice questions spanning humanities, STEM, social sciences, and professional domains. Many questions require recalling domain facts, applying definitions, or light reasoning, much like the items on a standardized test.
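To make the four-choice format concrete, here is a minimal sketch of how one question might be rendered as a prompt and how a model's letter answer might be checked. The `Question` class, its field names, the prompt layout, and the sample question are all assumptions made for illustration; they are not the benchmark's official schema or prompt template.

```python
from dataclasses import dataclass

LETTERS = "ABCD"  # the four option labels


@dataclass
class Question:
    # Hypothetical record layout, not the official dataset schema.
    subject: str
    stem: str
    choices: list  # exactly four option strings
    answer: int    # index of the correct option, 0..3


def format_prompt(q: Question) -> str:
    """Render a question as the stem, four lettered options, and an answer cue."""
    if len(q.choices) != 4:
        raise ValueError("expected exactly four choices")
    lines = [q.stem]
    lines += [f"{LETTERS[i]}. {choice}" for i, choice in enumerate(q.choices)]
    lines.append("Answer:")
    return "\n".join(lines)


def is_correct(q: Question, predicted_letter: str) -> bool:
    """Compare a predicted letter (case and whitespace ignored) to the gold answer."""
    return predicted_letter.strip().upper() == LETTERS[q.answer]


# Illustrative item written for this sketch, not taken from the benchmark.
example = Question(
    subject="astronomy",
    stem="Which planet is closest to the Sun?",
    choices=["Venus", "Mercury", "Earth", "Mars"],
    answer=1,
)
prompt = format_prompt(example)
```

Reported scores can shift with the prompt template and the number of in-context examples, so any comparison should hold the formatting fixed across models.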
Due to its breadth and stability, MMLU is frequently used as a headline indicator of general knowledge in model reports. Sub-scores by discipline can reveal strengths and weaknesses across subject areas.
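As a rough sketch of how discipline-level sub-scores and a headline number might be derived from graded answers, the snippet below computes per-subject accuracy plus two common aggregates. The function names and the `(subject, is_correct)` input format are assumptions for illustration, and the sample results are made up to show the arithmetic.

```python
from collections import defaultdict


def subject_accuracies(results):
    """Map each subject to its accuracy, given (subject, is_correct) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, ok in results:
        total[subject] += 1
        correct[subject] += int(ok)
    return {subject: correct[subject] / total[subject] for subject in total}


def micro_average(results):
    """Accuracy over all questions pooled together (larger subjects weigh more)."""
    results = list(results)
    return sum(int(ok) for _, ok in results) / len(results)


def macro_average(results):
    """Unweighted mean of the per-subject accuracies (every subject weighs the same)."""
    per_subject = subject_accuracies(list(results))
    return sum(per_subject.values()) / len(per_subject)


# Made-up graded answers, only to demonstrate the calculation.
sample = [
    ("law", True), ("law", False), ("law", False), ("law", False),
    ("physics", True),
]
per_subject = subject_accuracies(sample)
micro = micro_average(sample)
macro = macro_average(sample)
```

The two aggregates can differ noticeably when subjects have unequal question counts, so a report should state which one its headline figure uses.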