Our GPU Clusters Dataset is a collection of compute clusters built with chips that can be used for AI training and inference as well as other high-performance computing tasks. For each cluster, the dataset tracks key details such as performance, hardware type, location, and estimated cost and power draw.
This documentation describes which clusters are contained within the dataset, the information in its records (including data fields and definitions), and processes for adding new entries and auditing accuracy. It also includes a changelog and acknowledgements.
The dataset is accessible on our website as a visualization or table, and is available for download as a CSV file, refreshed daily. For a quick-start example of loading the data and working with it in your research, see this Google Colab demo notebook.
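As a rough illustration of working with the CSV download, the sketch below parses the file with Python's standard library and tallies rows by one column. The filename `gpu_clusters.csv` and the column name `Country` are placeholders assumed for this example, not confirmed names from the dataset; check the header row of the file you download and substitute the real ones.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import Iterable


def load_rows(lines: Iterable[str]) -> list[dict[str, str]]:
    """Parse CSV text (an open file or any iterable of lines) into dict rows."""
    return list(csv.DictReader(lines))


def count_by(rows: list[dict[str, str]], column: str) -> Counter:
    """Count rows per distinct value of `column`, skipping blank or missing cells."""
    return Counter(row[column] for row in rows if row.get(column))


if __name__ == "__main__":
    # Hypothetical local path: wherever you saved the downloaded CSV.
    csv_path = Path("gpu_clusters.csv")
    if csv_path.exists():
        with csv_path.open(newline="", encoding="utf-8") as f:
            rows = load_rows(f)
        print(f"{len(rows)} records")
        # Print the real column names before relying on any of them.
        print(list(rows[0].keys()) if rows else "no rows")
        # "Country" is an assumed column name used only for illustration.
        print(count_by(rows, "Country").most_common(5))
```

Every value comes back as a string with this approach, so numeric fields need explicit conversion before any arithmetic. The Colab demo notebook mentioned above remains the better reference for the dataset's actual fields.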
If you have questions about the dataset, or would like to suggest systems that should be added or edited, contact us at data@epoch.ai.
If this dataset is useful to you, please cite it.
Epoch’s data is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.