Data Insight
Aug. 15, 2025

Frontier AI performance becomes accessible on consumer hardware within a year

Using a single top-of-the-line gaming GPU like NVIDIA’s RTX 5090 (under $2,500), anyone can locally run models matching the absolute frontier of LLM performance from just 6 to 12 months ago. This lag is consistent with our previous estimate of a 5 to 22 month gap for open-weight models of any size. However, small open models are more likely to be optimized for specific benchmarks, so the “real-world” lag may be somewhat longer.


Several factors drive this democratizing trend: open-weight models are scaling at a rate comparable to the closed-source frontier, techniques like model distillation have proven successful, and continual progress in GPUs allows larger models to be run at home.

Epoch's work is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons BY license.

Learn more about this graph

We find that leading open models runnable on a single consumer GPU typically match the capabilities of frontier models with a lag of 6 to 12 months. This relatively short and consistent lag means that the most advanced AI capabilities become widely accessible for local development and experimentation in under a year.
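One way to make the notion of "lag" concrete is sketched below. This is an illustrative assumption about how such a lag could be measured, not a description of the analysis behind the graph: for a given open model, find the earliest date on which any frontier model reached the same benchmark score, then take the difference between the two release dates. The function names and all data values are hypothetical placeholders, not real benchmark results.

```python
from datetime import date

AVERAGE_DAYS_PER_MONTH = 30.44  # approximation used to convert days to months


def first_date_reaching(records, score):
    """Return the earliest release date among (date, score) records
    whose score is at least `score`, or None if none qualifies."""
    dates = [released for released, s in records if s >= score]
    return min(dates) if dates else None


def lag_in_months(frontier, open_model_date, open_model_score):
    """Months between the first frontier model reaching `open_model_score`
    and the release of the open model. Returns None when no frontier
    record has reached that score."""
    reached = first_date_reaching(frontier, open_model_score)
    if reached is None:
        return None
    return (open_model_date - reached).days / AVERAGE_DAYS_PER_MONTH


# Hypothetical placeholder data: (release date, benchmark score).
frontier = [
    (date(2024, 1, 1), 50.0),
    (date(2024, 7, 1), 60.0),
    (date(2025, 1, 1), 70.0),
]

# A hypothetical consumer-GPU model released on 2025-04-01 scoring 60.0.
lag = lag_in_months(frontier, date(2025, 4, 1), 60.0)
print(round(lag, 1))  # about 9 months for these placeholder values
```

Under this definition, a lag of 6 to 12 months means the open model's score was first reached by a frontier model roughly half a year to a year before the open model's release.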
