
Using Perlmutter

Perlmutter is not a production resource

Perlmutter is not a production resource and usage is not charged against your allocation of time. While we will attempt to make the system available to users as much as possible, it is subject to unannounced and unexpected outages, reconfigurations, and periods of restricted access. Please visit the timeline page for more information about changes we've made in our recent upgrades.

Current Known Issues

Known Issues on Perlmutter

For updates on past issues, see the timeline page.

Access

Perlmutter is now available to general users. All users with an active NERSC account have been given access to Perlmutter. Please follow the steps below to log in to the system. If you wish to obtain a NERSC account, please visit our accounts page for an overview of the kinds of allocations and user accounts available.

Connecting to Perlmutter

You can connect directly to Perlmutter with

ssh perlmutter-p1.nersc.gov

or

ssh saul-p1.nersc.gov

You can also log in to Cori or a DTN first and then connect to Perlmutter from there with ssh perlmutter.

Connecting to Perlmutter with sshproxy

If you have an ssh key generated by sshproxy, you can configure your local computer's ~/.ssh/config file as suggested in the SSH Configuration File Options section.
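For example, a minimal ~/.ssh/config entry might look like the sketch below; the username elvis and the key path ~/.ssh/nersc are placeholders for the values produced by your own sshproxy run:

Host perlmutter
    HostName perlmutter-p1.nersc.gov
    User elvis
    IdentityFile ~/.ssh/nersc

With this entry in place, ssh perlmutter from your local machine will use the sshproxy key until it expires.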

Connecting to Perlmutter with a Collaboration Account

Collabsu is not available on Perlmutter. Please create a direct login with sshproxy to log in to Perlmutter, or switch to the collaboration account on Cori or the DTNs and then log in to Perlmutter from there.
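For example, assuming you have obtained an sshproxy key that is valid for the collaboration account (see the sshproxy documentation for how to request one), the direct login is a plain ssh as that account; the key path and the account name c_collab below are placeholders:

ssh -i ~/.ssh/nersc c_collab@perlmutter-p1.nersc.gov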

Transferring Data to / from Perlmutter Scratch

Perlmutter scratch is only accessible from Perlmutter login or compute nodes.

NERSC has set up a dedicated Globus Endpoint on Perlmutter that has access to Perlmutter Scratch as well as the Community and Homes File Systems at NERSC. This is the recommended way to transfer large volumes of data to/from Perlmutter scratch.
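If you prefer the command line to the Globus web interface, a transfer submitted with the Globus CLI could look like the sketch below; the endpoint UUIDs and paths are placeholders, so look up the actual Perlmutter endpoint in the Globus web app:

globus transfer --recursive --label "to Perlmutter scratch" \
    <source_endpoint_uuid>:/path/to/dataset \
    <perlmutter_endpoint_uuid>:/pscratch/sd/e/elvis/dataset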

Alternatively, for small transfers you can use scp on a Perlmutter login node.
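For example, a small file can be copied from your local machine into your Perlmutter scratch directory with something along these lines; the username elvis and the scratch path are placeholders:

scp input.dat elvis@perlmutter-p1.nersc.gov:/pscratch/sd/e/elvis/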

Larger datasets can also be staged on the Community File System (which is mounted on Perlmutter) with Globus, cp, or rsync on a Data Transfer Node. Once the data is on the Community File System, you can use cp or rsync from a Perlmutter login node to copy it to Perlmutter scratch, as sketched below.
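As a sketch of that last step, assuming your project directory on the Community File System is /global/cfs/cdirs/m9999 (a placeholder project name), the copy from a Perlmutter login node could look like:

rsync -av /global/cfs/cdirs/m9999/dataset/ $SCRATCH/dataset/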

Caveats on the system

Last Updated: Dec 10th, 2021.

  • Static compilation isn't officially supported by NERSC, but we have outlined some instructions under the static compilation section in the compiler wrappers documentation page.
  • MPI/mpi4py users may notice an mlx5 error that stems from spawning forks within an MPI rank, which is considered undefined/unsupported behavior.
  • PrgEnv-gnu users building GPU-enabled code (gcc and nvcc) may need to load a gcc version compatible with the cudatoolkit installation in use. Please see our gcc compatibility section for additional details and the sketch after this list.
  • Users may notice that MKL-based CPU code runs more slowly. Please try module load fast-mkl-amd.
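As a sketch of the gcc compatibility workaround mentioned above, the module commands might look like the following; the gcc version is a placeholder, so check module avail gcc and the gcc compatibility section for one that matches your cudatoolkit:

module load PrgEnv-gnu
module load cudatoolkit
module load gcc/11.2.0    # placeholder version; pick one compatible with the loaded cudatoolkit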

Preparing for Perlmutter

Please check the Transitioning Applications to Perlmutter webpage for a wealth of useful information on how to transition your applications for Perlmutter.

Compiling/Building Software

Running Jobs

Perlmutter uses Slurm for batch job scheduling.

Tip

To run a job on Perlmutter GPU nodes, you must submit the job using a project GPU allocation account name, which ends in _g (e.g., m9999_g). To run a job on Perlmutter or Cori CPU nodes, use an account name without the trailing _g.
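For instance, a minimal GPU batch script following this tip might look like the sketch below; the account m9999_g, the resource requests, and the executable my_gpu_app are placeholders to adapt to your own project and code:

#!/bin/bash
#SBATCH --account=m9999_g      # GPU allocation account (note the trailing _g)
#SBATCH --constraint=gpu       # request Perlmutter GPU nodes
#SBATCH --qos=regular
#SBATCH --nodes=1
#SBATCH --gpus-per-node=4
#SBATCH --time=00:30:00

srun --ntasks-per-node=4 --gpus-per-node=4 ./my_gpu_app    # placeholder executable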

Below you can find general information on how to submit jobs using Slurm, monitor running jobs, and more.

Known issues

GPU Binding with many MPI processes

Bug in Cray MPICH may require GPU binding for jobs with many MPI ranks

Due to an outstanding bug with our vendor, jobs with many MPI ranks may require explicit GPU binding. The MPI ranks incorrectly allocate GPU memory, and too many ranks allocating this memory will cause the program to segfault (the segfault may happen during execution or before the first statement is executed, and may appear only when multiple nodes are used). One workaround is to use GPU binding to spread the allocated memory evenly across the GPUs. Here is an example of using GPU binding in a 4-node job:

srun --ntasks=32 --ntasks-per-node=8 -G 4 --gpu-bind=single:2 python -m mpi4py.bench helloworld

Even with GPU binding, users may find that the number of MPI ranks they can use within a job is limited. Note that this also impacts CPU-only code that uses CUDA-aware MPI. We expect a fix for this problem soon.

Profiling with hardware counters

NVIDIA Data Center GPU Manager (DCGM) is a lightweight tool for measuring and monitoring GPU utilization and for running comprehensive diagnostics of GPU nodes on a cluster. NERSC will be using this tool to measure application utilization and to monitor the status of the machine. Due to current hardware limitations, performance tools such as Nsight Compute, TAU, and HPCToolkit that require access to hardware counters will conflict with the DCGM instance running on the system when collecting profiling metrics.

To collect performance data with ncu, you must wrap the profiling run with dcgmi profile --pause and dcgmi profile --resume in your scripts (this pattern works for both single-node and multi-node runs):

srun --ntasks-per-node 1 dcgmi profile --pause     # pause the DCGM instance on every node (one task per node)
srun <Slurm flags> ncu -o <filename> <other Nsight Compute flags> <program> <program arguments>
srun --ntasks-per-node 1 dcgmi profile --resume    # resume DCGM monitoring once profiling is done

Running profiler on multiple nodes

The DCGM instance on each node must be paused before running the profiler. Please note that you should use only one task per node to pause the DCGM instance, as shown above.
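Putting the pieces together, a multi-node profiling job could be structured like the sketch below; the account, resource requests, output name, and executable are placeholders:

#!/bin/bash
#SBATCH --account=m9999_g
#SBATCH --constraint=gpu
#SBATCH --qos=regular
#SBATCH --nodes=2
#SBATCH --gpus-per-node=4
#SBATCH --time=00:30:00

srun --ntasks-per-node=1 dcgmi profile --pause     # pause DCGM on every node first
srun --ntasks-per-node=4 --gpus-per-node=4 ncu -o my_profile.%q{SLURM_PROCID} ./my_gpu_app    # one report per rank; placeholder names
srun --ntasks-per-node=1 dcgmi profile --resume    # resume DCGM when profiling is done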