Podman at NERSC
Podman (Pod Manager) is an open-source, OCI-compliant container framework that is under active development by Red Hat. In many ways Podman can be treated as a drop-in replacement for Docker.
Since "out of the box" Podman currently lacks several key capabilities for HPC users, NERSC has been working with with Red Hat to adapt Podman for HPC use-cases and has developed an add-on called podman-hpc. podman-hpc
is now available to all users on Perlmutter. podman-hpc
enables improved performance, especially at large scale, and makes using common HPC tools like Cray MPI and NVIDIA CUDA capabilities easy.
podman-hpc
at NERSC is experimental
podman-hpc
has been recently deployed at NERSC and should not be considered stable or suitable for production. If you encounter what you think could be a problem/bug, please report it to us via filing a NERSC ticket.
Users may be interested in using Podman Desktop on their local machines. It is a free alternative to Docker Desktop.
Why podman-hpc?
Users who are comfortable with Shifter, the current NERSC production container runtime, may wonder what advantages Podman offers over Shifter. Here are a few:
- podman-hpc doesn't impose many of the restrictions that Shifter does:
  - No container modules will be loaded by default.
  - Most environment variables will not be automatically propagated into the container.
  - Applications that require root permission inside the container will be allowed to run. This is securely enabled via Podman's rootless mode.
- Users can build images directly on Perlmutter.
- Users can choose to run these images directly via podman-hpc without uploading to an intermediate repository.
- Podman is an OCI-compliant framework (like Docker). Users who are familiar with Docker will find that Podman has very similar syntax and can often be used as a drop-in replacement for Docker. Users may also find that this makes their workflow more portable.
- podman-hpc is a transparent wrapper around Podman. Users will find that they can pass standard unprivileged Podman commands to podman-hpc.
- Podman is a widely used tool that is not specific to NERSC.
How to use podman-hpc
To see all available commands, users can issue the podman-hpc --help command:
elvis@nid001036:~> podman-hpc --help
Manage pods, containers and images ... on HPC!
Description:
The podman-hpc utility is a wrapper script around the podman container
engine. It provides additional subcommands for ease of use and
configuration of podman in a multi-node, multi-user high performance
computing environment.
Usage: podman-hpc [options] COMMAND [ARGS]...
Options:
--additional-stores TEXT Specify other storage locations
--squash-dir TEXT Specify alternate squash directory location
--help Show this message and exit.
Commands:
infohpc Dump configuration information for podman_hpc.
migrate Migrate an image to squashed.
pull Pulls an image to a local repository and makes a squashed...
rmsqi Removes a squashed image.
shared-run Launch a single container and exec many threads in it This is...
...
podman-hpc is available on Perlmutter. Once users ssh to Perlmutter, they can issue the podman-hpc images command. This should show there are no images yet:
elvis@nid001036:~> podman-hpc images
REPOSITORY TAG IMAGE ID CREATED SIZE R/O
elvis@nid001036:~>
Building images
Users should generate a Containerfile or Dockerfile. (A Containerfile is a more general form of a Dockerfile: they follow the same syntax and usually can be used interchangeably.) Users can build and tag the image in the same directory via a command like:
podman-hpc build -t elvis:test .
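For illustration, a minimal Containerfile could be created and built as follows. This is only a sketch: the base image, package choices, and tag below are arbitrary examples, not NERSC requirements.
# Write a small example Containerfile in the current directory
cat > Containerfile <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 && \
    rm -rf /var/lib/apt/lists/*
CMD ["python3", "--version"]
EOF

# Build and tag the image
podman-hpc build -t elvis:test .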
Images that a user builds with podman-hpc will be automatically converted into a suitable squashfile format for podman-hpc. These images can be directly accessed and used in a job.
podman-hpc images and caches are stored in local storage
podman-hpc build artifacts and cache files will be stored on the login node where the user performed the build. If a user logs onto a different login node, they will not have access to these cached files and will need to build from scratch. At the moment we have no purge policy for the local image build storage, although users can likely expect one in the future.
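If you want an image to be usable regardless of which login node you land on, one option is to push the built image to a registry and pull it back where needed. This is a suggestion based on standard Podman behavior (podman-hpc passes through standard Podman commands); the registry path below is just a placeholder.
# Log in to a registry you have access to
podman-hpc login registry.nersc.gov

# Tag the locally built image for that registry and push it
podman-hpc tag elvis:test registry.nersc.gov/elvis/test:latest
podman-hpc push registry.nersc.gov/elvis/test:latest

# On another login node, pull it back
podman-hpc pull registry.nersc.gov/elvis/test:latest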
Pulling images
If a user just needs to pull an existing image, they must first log in to their chosen registry (in this case, Dockerhub).
elvis@nid001036:~> podman-hpc login docker.io
Username: elvis
Password:
Login Succeeded!
The user can then pull the image:
elvis@nid001036:~> podman-hpc pull elvis/hello-world:1.0
Trying to pull docker.io/elvis/hello-world:1.0...
Getting image source signatures
Copying blob sha256:7b1a6ab2e44dbac178598dabe7cff59bd67233dba0b27e4fbd1f9d4b3c877a54
Copying config sha256:0849b79544d682e6149e46977033706b17075be384215ef8a69b5a37037c7231
Writing manifest to image destination
Storing signatures
0849b79544d682e6149e46977033706b17075be384215ef8a69b5a37037c7231
elvis@nid001036:~> podman-hpc images
REPOSITORY TAG IMAGE ID CREATED SIZE R/O
docker.io/elvis/hello-world 1.0 0849b79544d6 16 months ago 75.2 MB true
Images that a user pulls from a registry will be automatically converted into a suitable squashfile format for podman-hpc. These images can be directly accessed and used in a job.
Using podman-hpc as a container runtime
Users can use podman-hpc as a container runtime. Early benchmarking has shown that in many cases, performance is comparable to Shifter and bare metal.
Our goal has been to design podman-hpc so that standard Podman commands still work. Please check out this page for a full list of podman run capabilities.
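For example, everyday Podman housekeeping commands should pass straight through unchanged. The lines below are a hedged illustration using standard Podman subcommands and an image name used elsewhere on this page.
# List containers and inspect an image, just as with plain podman
podman-hpc ps -a
podman-hpc inspect registry.nersc.gov/library/nersc/mpi4py:3.1.3

# Check the version of the wrapped podman engine
podman-hpc version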
Users can use podman-hpc in both interactive and batch jobs without requesting any special resources. They only need to have previously built or pulled an image via podman-hpc. Users may choose to run a container in interactive mode, as in this example:
elvis@nid001036:~> podman-hpc run --rm -it registry.nersc.gov/library/nersc/mpi4py:3.1.3 /bin/bash
root@d23b3ea141ed:/opt# cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.5 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.5 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
root@d23b3ea141ed:/opt# exit
exit
elvis@nid001036:~>
Here we see that the container is running Ubuntu 20.04 (Focal Fossa).
Users may also choose to run a container in standard run mode:
elvis@nid001036:~> podman-hpc run --rm registry.nersc.gov/library/nersc/mpi4py:3.1.3 echo $SLURM_JOB_ID
198507
elvis@nid001036:~>
Here we print the Slurm job ID from inside the container. Note that $SLURM_JOB_ID is expanded by the host shell before the container starts, since environment variables are not automatically propagated into the container.
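For a batch job, the same podman-hpc run invocation simply goes into the job script. A minimal sketch might look like the following; the account, constraint, queue, and image name are placeholders to adjust for your own allocation.
#!/bin/bash
#SBATCH -N 1
#SBATCH -C cpu
#SBATCH -q regular
#SBATCH -t 00:10:00
#SBATCH -A <account>

# Run the containerized command on each task; the image must already
# have been built or pulled with podman-hpc
srun -n 1 podman-hpc run --rm registry.nersc.gov/library/nersc/mpi4py:3.1.3 hostname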
Unlike Shifter, podman-hpc does not enable any MPI or GPU capability by default. Users must request the additional utilities they need.
Module Name | Function |
---|---|
--mpi | Uses current optimized Cray MPI |
--gpu | Enables NVIDIA GPU support (CUDA user drivers and utilities) |
More modules will be added soon.
Unlike Shifter, no capabilities are loaded by default
Shifter users may be aware that MPICH and GPU capabilities are loaded by default. In podman-hpc, we take the opposite (and more OCI-compliant) approach, in which users must explicitly request all the capabilities they need.
Using Cray MPICH in podman-hpc
Using Cray MPICH in podman-hpc is very similar to what we describe in our MPI in Shifter documentation. To be able to use Cray MPICH at runtime, users must first include a standard implementation of MPICH in their image. If users add the podman-hpc --mpi flag, the current optimized Cray MPICH will be injected at runtime to replace the MPICH in their container.
Here is an example of running an MPI-enabled task in podman-hpc in an interactive job:
elvis@nid001037:~> srun -n 2 podman-hpc run --rm --mpi registry.nersc.gov/library/nersc/mpi4py:3.1.3 python3 -m mpi4py.bench helloworld
Hello, World! I am process 0 of 2 on nid001037.
Hello, World! I am process 1 of 2 on nid001041.
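To satisfy the requirement that a standard MPICH be present in the image, a Containerfile along these lines can work. This is a hedged sketch: the base image and package names are assumptions, not the recipe for the mpi4py image used in the example above.
cat > Containerfile <<'EOF'
FROM ubuntu:22.04
# Install a standard MPICH plus build tools, then build mpi4py against it,
# so the Cray MPICH injected by --mpi can replace it at runtime
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential mpich libmpich-dev \
        python3 python3-dev python3-pip && \
    rm -rf /var/lib/apt/lists/*
RUN pip3 install --no-cache-dir mpi4py
EOF

podman-hpc build -t elvis/mpi4py:test .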
Using NVIDIA GPUs in podman-hpc
Accessing NVIDIA GPUs in a container requires that the NVIDIA CUDA user drivers and other utilities are present in the container at runtime. If users add the podman-hpc --gpu flag, this will ensure all required utilities are enabled at runtime.
Here is an example of running a GPU-enabled task in podman-hpc in an interactive job:
elvis@nid001037:~> srun -n 2 -G 2 podman-hpc run --rm --gpu registry.nersc.gov/library/nersc/mpi4py:3.1.3 nvidia-smi
Sat Jan 14 01:16:06 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01 Driver Version: 515.65.01 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
Sat Jan 14 01:16:06 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01 Driver Version: 515.65.01 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A100-SXM... Off | 00000000:03:00.0 Off | 0 |
| N/A 27C P0 52W / 400W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
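The module flags can also be combined in a single invocation. For instance, a GPU-aware MPI test might look like the command below; this is an illustrative assumption that merges the two earlier examples rather than a documented recipe.
# Request two tasks and two GPUs, enabling both Cray MPICH and CUDA support
srun -n 2 -G 2 podman-hpc run --rm --mpi --gpu registry.nersc.gov/library/nersc/mpi4py:3.1.3 python3 -m mpi4py.bench helloworld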