Machine Learning benchmarking at NERSC¶
NERSC uses both standard framework-oriented benchmarks and scientific benchmarks from research projects to characterize our systems for scientific Deep Learning.
Framework benchmarks¶
TensorFlow¶
We ran a version of the tf_cnn_benchmarks repository as well as a DCGAN model on Cori.
Training results
PyTorch¶
We have a repository of benchmarks with standard computer vision models, LSTM, and 3D convolutional models here: https://github.com/sparticlesteve/pytorch-benchmarks
We compare PyTorch software installations and hardware, and we analyze scaling performance using the PyTorch distributed library with MPI. See the notebooks in the links below for numbers and plots.
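Training throughput is typically reported as samples processed per second. As a minimal sketch of how such a number can be measured (not the code used in the repository above), the hypothetical helper `measure_throughput` below times a caller-supplied training step after a few warm-up iterations. The function name, its parameters, and the default warm-up count are assumptions made for illustration.

```python
import time


def measure_throughput(step_fn, batch_size, n_steps, n_warmup=2,
                       timer=time.perf_counter):
    """Return throughput in samples/s for a training step function.

    step_fn is a hypothetical zero-argument callable that runs one
    training step on a batch of `batch_size` samples.
    """
    # Warm-up steps are excluded so one-time setup costs do not skew the rate.
    for _ in range(n_warmup):
        step_fn()

    # Time only the measured steps.
    start = timer()
    for _ in range(n_steps):
        step_fn()
    elapsed = timer() - start

    return batch_size * n_steps / elapsed


if __name__ == "__main__":
    # Stand-in for a real training step: sleep for 10 ms per batch.
    rate = measure_throughput(lambda: time.sleep(0.01), batch_size=64, n_steps=20)
    print(f"{rate:.1f} samples/s")
```

In a multi-process run, one plausible approach is for each rank to time its own steps and for the per-rank rates to be summed into a global throughput.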
Software versions¶
Results for a handful of software versions that were available on the Cori system are in this notebook:
https://github.com/sparticlesteve/pytorch-benchmarks/blob/master/notebooks/SoftwareAnalysis.ipynb
Training throughput results:
Hardware comparisons¶
Results comparing training throughput on Cori Haswell, KNL, and GPU are here:
https://github.com/sparticlesteve/pytorch-benchmarks/blob/master/notebooks/HardwareAnalysis.ipynb
Scaling analysis¶
Throughput scaling results on Cori Haswell with Intel PyTorch v1.0.0 are available here:
https://github.com/sparticlesteve/pytorch-benchmarks/blob/master/notebooks/ScalingAnalysis.ipynb
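A common way to summarize throughput scaling is the scaling efficiency: the per-node throughput at N nodes divided by the per-node throughput of the smallest run. The sketch below shows that calculation under stated assumptions. The function name `scaling_efficiency` is hypothetical, and the throughput values are made-up placeholders, not results from the notebook.

```python
def scaling_efficiency(throughputs):
    """Return scaling efficiency per node count.

    throughputs maps node count -> total throughput in samples/s.
    Efficiency is measured relative to the smallest node count,
    so a value of 1.0 means ideal (linear) scaling.
    """
    base_nodes = min(throughputs)
    base_rate_per_node = throughputs[base_nodes] / base_nodes
    return {
        nodes: (rate / nodes) / base_rate_per_node
        for nodes, rate in sorted(throughputs.items())
    }


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not measured on Cori).
    example = {1: 100.0, 2: 190.0, 4: 360.0}
    for nodes, eff in scaling_efficiency(example).items():
        print(f"{nodes} nodes: {eff:.0%} efficiency")
```

Normalizing by the smallest run rather than a fixed single-node baseline keeps the calculation usable when no single-node measurement exists.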
Scientific Deep Learning Benchmarks¶
HEP-CNN¶
The HEP-CNN benchmark trains a simple Convolutional Neural Network to classify LHC collision detector images as signal or background.
- Framework: TensorFlow
- Multi-node library: Horovod or Cray PE ML Plugin
- Papers: https://arxiv.org/abs/1711.03573, https://arxiv.org/abs/1708.05256
- Code: https://github.com/sparticlesteve/hep_cnn_benchmark/tree/benchmark-dev
CosmoFlow¶
The CosmoFlow benchmark trains a 3D Convolutional Neural Network to predict cosmological parameters from simulated universe volumes.
- Framework: TensorFlow
- Multi-node library: Cray PE ML Plugin
- Paper: https://arxiv.org/abs/1808.04728
- Code: https://github.com/sparticlesteve/cosmoflow-benchmark
CosmoGAN¶
- Framework: TensorFlow
- Paper: https://arxiv.org/abs/1706.02390
- Code: https://github.com/MustafaMustafa/cosmoGAN
Deep Learning Climate Segmentation¶
- Framework: TensorFlow
- Multi-node library: Horovod
- Paper: https://arxiv.org/abs/1810.01993
- Code: https://github.com/sparticlesteve/climate-seg-benchmark