CL-bench Life

CL-bench Life, from the Tencent Hunyuan team and Fudan University’s NLP group, is a companion to CL-bench that shifts from clean professional sources to messy, real-life context. Instead of well-structured reference material, models are given everyday communication, fragmented notes, and behavioral traces that are socially grounded and temporally dispersed, and must learn from this context to complete tasks.

Tasks span three categories: communication & social interactions, fragmented information & revisions, and behavioral records & activity trails.

Methodology

We source results from the public CL-bench Life leaderboard, where the “Life” results appear as a tab alongside the main CL-bench leaderboard.

CL-bench Life contains 405 expert-curated context–task pairs with 5,348 verification rubrics across its three categories. As in CL-bench, scoring is rubric-based and the headline metric is the solving rate: the percentage of tasks solved against their rubrics. Our chart also shows the solving rate for each category.
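To make the metric concrete, here is a minimal sketch of how an overall and per-category solving rate could be computed from rubric verdicts. It is an illustration, not the benchmark's official scoring code. The `solving_rates` function, the record layout, and the toy data are hypothetical, and the sketch assumes a task counts as solved only when every one of its rubrics passes.

```python
from collections import defaultdict


def solving_rates(results):
    """Return (overall_rate, per_category_rates) as percentages.

    `results` is a hypothetical list of dicts, one per task, with:
      - "category": the task's category name
      - "rubrics": a list of booleans, one verdict per rubric
    """
    solved = defaultdict(int)
    total = defaultdict(int)
    for task in results:
        category = task["category"]
        total[category] += 1
        # Assumption: a task is solved only if it has rubrics and all pass.
        if task["rubrics"] and all(task["rubrics"]):
            solved[category] += 1

    per_category = {c: 100.0 * solved[c] / total[c] for c in total}
    overall = 100.0 * sum(solved.values()) / sum(total.values())
    return overall, per_category


# Toy data for illustration only; these are not real benchmark results.
example_results = [
    {"category": "communication & social interactions", "rubrics": [True, True, True]},
    {"category": "communication & social interactions", "rubrics": [True, False]},
    {"category": "fragmented information & revisions", "rubrics": [False, True, True]},
    {"category": "behavioral records & activity trails", "rubrics": [True, True]},
]

overall, per_category = solving_rates(example_results)
print(f"Overall solving rate: {overall:.1f}%")
for name, rate in per_category.items():
    print(f"  {name}: {rate:.1f}%")
```

Under the all-rubrics-pass assumption, one failed rubric makes the whole task unsolved, so the solving rate is stricter than the share of individual rubrics passed.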

For full details, see the CL-bench Life paper and dataset.
