Machine Learning at NERSC

NERSC supports a variety of software for Machine Learning and Deep Learning on our systems.

These docs cover our system-optimized frameworks, deploying large language models, multi-node training libraries, and performance guidelines.

Classical Machine Learning

scikit-learn and other non-deep-learning libraries are supported through our standard installations and environments for Python and R; see the Analytics pages for more details.

Deep Learning Frameworks

We have prioritized support for the most widely used Deep Learning frameworks; see the framework-specific pages for details on our optimized installations.

Deploying with Jupyter

Users can deploy distributed deep learning workloads from Jupyter notebooks using parallel execution libraries such as IPyParallel. Jupyter notebooks can be used to submit workloads to the batch system and also provide powerful interactive capabilities for monitoring and controlling those workloads.
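The sketch below illustrates the general IPyParallel pattern of starting engines from a notebook and running a function on each of them. It is a minimal, generic example, not a NERSC-specific recipe: the engine count and the placeholder training function are illustrative assumptions.

```python
# Minimal sketch: launch an IPyParallel cluster from a notebook cell
# and run a function on every engine (ipyparallel >= 7 API).
import ipyparallel as ipp

cluster = ipp.Cluster(n=4)              # request 4 engines (illustrative)
rc = cluster.start_and_connect_sync()   # start engines and return a Client

def train_shard(rank):
    # placeholder for per-worker training logic
    return f"worker {rank} finished"

# Map the function over all engines and gather the results.
results = rc[:].map_sync(train_shard, range(len(rc)))
print(results)
```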

Science use-cases

Machine Learning and Deep Learning are increasingly used to analyze scientific data across diverse fields. The science use-cases page gathers examples of ongoing work at NERSC, including code, datasets, and instructions for running them on our systems.

Deploying Large Language Models on NERSC

The NERSC Chatbot Deployment package enables users to deploy Hugging Face large language models (LLMs) on NERSC supercomputers using Slurm and the vLLM serving framework. It supports both command-line and Python library interfaces, with utilities for seamless Gradio integration on NERSC JupyterHub. The deployed models expose an OpenAI-compatible API endpoint, facilitating easy integration with existing tools while enforcing secure API key access. For detailed usage and best practices, see the documentation.
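Because the deployed model exposes an OpenAI-compatible endpoint, it can be queried with the standard openai Python client. The sketch below shows the general pattern; the server URL, API key, and model name are illustrative assumptions, not actual NERSC values.

```python
# Minimal sketch: querying a vLLM-served model through its
# OpenAI-compatible API endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://hostname:8000/v1",  # hypothetical server address
    api_key="YOUR_API_KEY",              # key issued by the deployment
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example Hugging Face model
    messages=[{"role": "user", "content": "Summarize what NERSC is."}],
)
print(response.choices[0].message.content)
```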