CUDA

Warning

This page is currently under active development. Check back soon for more content.

CUDA is a general-purpose parallel computing platform and programming model that uses the parallel compute engine in NVIDIA GPUs to solve many computationally intensive problems more efficiently than is possible on a CPU.

For full documentation, see the CUDA documentation maintained by NVIDIA.

CUDA C

A vector addition (SAXPY) example written in CUDA C is provided in this NVIDIA blog. It can be compiled with the nvcc compiler provided in the PrgEnv-nvidia environment on Perlmutter:

nvcc -o saxpy.ex saxpy.cu
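For orientation, a minimal SAXPY sketch in CUDA C++ might look like the following. It is not the code from the blog: the helper `max_abs_error`, the problem size, and the launch configuration of 256 threads per block are assumptions chosen for illustration.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Device kernel: each thread updates one element, y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Host helper (illustrative): largest absolute deviation from an expected value.
float max_abs_error(const std::vector<float> &y, float expected)
{
    float err = 0.0f;
    for (float v : y) err = std::fmax(err, std::fabs(v - expected));
    return err;
}

int main()
{
    const int n = 1 << 20;                 // assumed problem size
    const size_t bytes = n * sizeof(float);
    std::vector<float> x(n, 1.0f), y(n, 2.0f);

    float *d_x = nullptr, *d_y = nullptr;
    if (cudaMalloc(&d_x, bytes) != cudaSuccess ||
        cudaMalloc(&d_y, bytes) != cudaSuccess) {
        std::fprintf(stderr, "device allocation failed\n");
        return 1;
    }
    cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice);

    // Round the grid size up so that every element is covered.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, d_x, d_y);

    cudaError_t status = cudaGetLastError();
    if (status != cudaSuccess) {
        std::fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(status));
        return 1;
    }

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost);

    // With x = 1, y = 2 and a = 2, every element should now be 4.
    std::printf("Max error: %f\n", max_abs_error(y, 4.0f));

    cudaFree(d_x);
    cudaFree(d_y);
    return 0;
}
```

Saved as saxpy.cu, a file like this should build with the nvcc command shown above.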

CUDA Fortran

A vector addition (SAXPY) example written in CUDA Fortran is provided in this NVIDIA blog. It can be compiled with the nvfortran compiler provided in the PrgEnv-nvidia environment on Perlmutter:

nvfortran -o saxpy.ex saxpy.cuf

Using CUDA on Perlmutter

On Perlmutter, CUDA is available via the cudatoolkit modules. The toolkit modules contain GPU-accelerated libraries, profiling tools (Nsight Compute and Nsight Systems), debugging tools (cuda-gdb and cuda-memcheck), a runtime library, and the nvcc CUDA compiler.

NVIDIA maintains extensive documentation for CUDA toolkits.

For information on using CUDA on Perlmutter, see the Native CUDA C/C++ and Memory Management sections, among others, on the Perlmutter Readiness page.

PrgEnv-nvidia

The host compilers nvc / nvc++ (accessible through the cc / CC wrappers) in the NVIDIA HPC SDK have opt-in CUDA support. To compile single-source C / C++ code (host and device code in the same source file) with the Cray wrappers, you must add the -cuda flag to the compilation step, which tells the nvc / nvc++ compiler to accept CUDA runtime APIs. Omitting the -cuda flag causes the application to be compiled without any of the CUDA API calls and produces an executable with undefined behavior.
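As an illustration, the sketch below calls CUDA runtime APIs from an ordinary C++ source file. The file name device_query.cpp, the helper `cuda_ok`, and the compile line in the leading comment are assumptions for illustration only. The example deliberately sticks to host-side runtime calls and does not define a kernel.

```cuda
// device_query.cpp (hypothetical file name)
// Assumed compile line under PrgEnv-nvidia:
//   CC -cuda -o device_query.ex device_query.cpp
#include <cstdio>
#include <cuda_runtime.h>

// Returns true if a CUDA runtime call succeeded; otherwise reports the error.
bool cuda_ok(cudaError_t status, const char *what)
{
    if (status == cudaSuccess) return true;
    std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
    return false;
}

int main()
{
    int count = 0;
    if (!cuda_ok(cudaGetDeviceCount(&count), "cudaGetDeviceCount")) return 1;
    std::printf("Visible GPUs: %d\n", count);

    // Print the name and compute capability of each visible device.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (!cuda_ok(cudaGetDeviceProperties(&prop, dev), "cudaGetDeviceProperties")) return 1;
        std::printf("  device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

Run on a GPU node, a program like this should list each visible GPU, which is a quick check that the build picked up the CUDA runtime.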

PrgEnv-gnu

When using the PrgEnv-gnu environment in conjunction with the cudatoolkit module (i.e., when compiling an application for both the host and the device), note that not every version of gcc is compatible with every version of nvcc; see the supported host compilers for each nvcc installation.

Versions

NERSC generally aims to make the latest versions of cudatoolkit available. In some cases a specific version other than what is installed is needed.

In this situation, first check whether the installed version is compatible with the one needed. CUDA code is generally forward compatible with newer toolkit versions: for example, code written for 11.3 should work with 11.7.

See the CUDA Compatibility document for details.

If this is not an option, consider using a container through Shifter that provides the specific CUDA version you need.
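When comparing versions, it can help to print what a build actually sees. The following sketch reports the toolkit version the code was compiled against, the runtime version in use, and the latest CUDA version supported by the installed driver. The helper names `version_major` and `version_minor` are illustrative; the decoding assumes CUDA's usual encoding of 1000 * major + 10 * minor.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// CUDA encodes versions as 1000 * major + 10 * minor (e.g. 11070 for 11.7).
int version_major(int v) { return v / 1000; }
int version_minor(int v) { return (v % 1000) / 10; }

int main()
{
    int runtime_version = 0;
    int driver_version = 0;

    if (cudaRuntimeGetVersion(&runtime_version) != cudaSuccess ||
        cudaDriverGetVersion(&driver_version) != cudaSuccess) {
        std::fprintf(stderr, "could not query CUDA versions\n");
        return 1;
    }

    // CUDART_VERSION is fixed at compile time by the toolkit headers.
    std::printf("Compiled against toolkit: %d.%d\n",
                version_major(CUDART_VERSION), version_minor(CUDART_VERSION));
    std::printf("Runtime library:          %d.%d\n",
                version_major(runtime_version), version_minor(runtime_version));
    // The driver reports the latest CUDA version it supports.
    std::printf("Driver supports up to:    %d.%d\n",
                version_major(driver_version), version_minor(driver_version));
    return 0;
}
```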

Tutorials