# Hyperparameter optimization
Hyperparameter optimization (HPO) is the process of tuning the hyperparameters of your machine learning model, e.g. the learning rate or filter sizes. There are several popular algorithms used for HPO, including grid search, random search, Bayesian optimization, and genetic optimization. Similarly, there are several libraries and tools implementing these algorithms, each with its own tradeoffs in usability, flexibility, and feature support.
On this page we will collect recommendations and examples for running distributed HPO tasks on our HPC systems.
## Weights and Biases
W&B is a great tool for experiment logging and visualization, in addition to HPO. The W&B webpage has documentation and examples: https://wandb.ai/
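As a minimal sketch of how a W&B sweep drives HPO (the sweep configuration, project name, and toy objective below are illustrative assumptions, not taken from this page):

```python
import wandb

# Hypothetical sweep configuration: random search over the learning rate.
sweep_config = {
    "method": "random",
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {
        "lr": {"min": 1e-4, "max": 1e-1},
    },
}

def train():
    # Each agent invocation starts a run; W&B injects the sampled config.
    wandb.init()
    lr = wandb.config.lr
    # ... train your model with this learning rate ...
    wandb.log({"loss": (lr - 0.01) ** 2})  # placeholder objective

# Register the sweep, then launch an agent that runs 10 trials.
sweep_id = wandb.sweep(sweep_config, project="my-hpo-project")
wandb.agent(sweep_id, function=train, count=10)
```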
## KerasTuner
An easy-to-use tool if you're using Keras: https://keras.io/keras_tuner/
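For illustration, a minimal random-search sketch with KerasTuner might look like the following (the model architecture, search space, and data placeholders are assumptions, not from this page):

```python
import keras
import keras_tuner

# The tuner passes an `hp` object used to declare the search space inline.
def build_model(hp):
    model = keras.Sequential([
        keras.layers.Dense(
            hp.Int("units", min_value=32, max_value=256, step=32),
            activation="relu",
        ),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Choice("lr", [1e-2, 1e-3, 1e-4])),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = keras_tuner.RandomSearch(build_model, objective="val_accuracy", max_trials=10)
# tuner.search(x_train, y_train, epochs=5, validation_data=(x_val, y_val))
```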
## RayTune
Tune is an open-source Python library for experiment execution and hyperparameter tuning at any scale. RayTune:

- supports any ML framework
- implements state-of-the-art HPO strategies
- natively integrates with optimization libraries (HyperOpt, BayesianOpt, and Facebook Ax)
- integrates well with Slurm and handles micro-scheduling of trials on multi-GPU node resources (no GPU binding boilerplate needed)
We provide RayTune in all of our GPU TensorFlow and PyTorch modules and Shifter images. You can also use our slurm-ray-cluster scripts for running multi-GPU-node HPO campaigns; the repo also includes a hello-world MNIST example.
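As a minimal single-node sketch (the quadratic objective and search space are placeholders, and the exact API varies between Ray versions):

```python
from ray import tune

def objective(config):
    # Placeholder "training" objective: pretend the best lr is 0.1.
    return {"score": (config["lr"] - 0.1) ** 2}

tuner = tune.Tuner(
    objective,
    param_space={"lr": tune.loguniform(1e-4, 1e-1)},
    tune_config=tune.TuneConfig(metric="score", mode="min", num_samples=20),
)
results = tuner.fit()
print(results.get_best_result().config)
```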
## HYPPO
A tool developed at LBNL and tested on NERSC systems: https://hpo-uq.gitlab.io/
## Cray HPO
Cray provides an HPO library which integrates naturally with Cray systems. It can use Slurm to request and manage an allocation, and provides genetic search, random search, grid search, and population-based training.
The official Cray HPO documentation can be found here:
https://cray.github.io/crayai/hpo/hpo.html
You can load the latest version on Cori with:

```
module load cray-hpo
```
You can find an example Jupyter notebook for genetic search here:
https://github.com/sparticlesteve/cori-intml-examples/blob/master/CrayHPO_rpv.ipynb
## DeepHyper
DeepHyper is a Python package for distributed Hyperparameter Optimization, Neural Architecture Search and Uncertainty Quantification. It can interface with different backends to distribute computation such as threads, processes, Ray and MPI.
In case of issues, contact Prasanna Balaprakash (pbalapra[at]anl[dot]gov) or open an issue directly on the DeepHyper GitHub.
A quick example of the DeepHyper API:
```python
def run(config: dict):
    return -config["x"]**2

# The if statement is necessary, otherwise it will enter an infinite loop
# when loading the 'run' function from a subprocess
if __name__ == "__main__":
    from deephyper.problem import HpProblem
    from deephyper.search.hps import CBO
    from deephyper.evaluator import Evaluator

    # define the variable you want to optimize
    problem = HpProblem()
    problem.add_hyperparameter((-10.0, 10.0), "x")

    # define the evaluator to distribute the computation
    evaluator = Evaluator.create(
        run,
        method="process",
        method_kwargs={
            "num_workers": 2,
        },
    )

    # define your search and execute it
    search = CBO(problem, evaluator)
    results = search.search(max_evals=100)
```
which outputs a Pandas DataFrame where the best `x` is clearly near `0`:
```
          p:x  job_id     objective  timestamp_submit  timestamp_gather
0   -7.744105       1 -5.997117e+01          0.011047          0.037649
1   -9.058254       2 -8.205196e+01          0.011054          0.056398
2   -1.959750       3 -3.840621e+00          0.049750          0.073166
3   -5.150553       4 -2.652819e+01          0.065681          0.089355
4   -6.697095       5 -4.485108e+01          0.082465          0.158050
..        ...     ...           ...               ...               ...
95  -0.034096      96 -1.162566e-03         26.479630         26.795639
96  -0.034204      97 -1.169901e-03         26.789255         27.155481
97  -0.037873      98 -1.434366e-03         27.148506         27.466934
98  -0.000073      99 -5.387088e-09         27.460253         27.774704
99   0.697162     100 -4.860350e-01         27.768153         28.142431
```
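To scale beyond a single node, the same script can select a different backend through the `method` argument of `Evaluator.create` (threads, processes, Ray, and MPI are the backends named above); check the DeepHyper documentation for the exact method names.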