The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. Open MPI offers advantages for system and software vendors, application developers and computer science researchers.
This wiki is primarily intended for NERSC users who wish to use Open MPI on Cori and Perlmutter; however, the instructions are general enough that they should largely apply to other Cray XC and EX systems running Slurm.
Using Open MPI at NERSC
Load the Open MPI module to pick up the package's compiler wrappers, the mpirun launch command, and other utilities.
For Perlmutter, the following will load the default package:
module use /global/common/software/m3169/perlmutter/modulefiles
module load openmpi
For Cori, use the following module command sequence:
module use /global/common/software/m3169/cori/modulefiles
module load openmpi
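The two command sequences above differ only in the modulefile directory, so a login script could pick the right one automatically. The following is a minimal sketch, not an official NERSC recipe: the helper name openmpi_modulepath is hypothetical, and it assumes the NERSC_HOST environment variable is set to cori or perlmutter on the respective system.

```shell
#!/bin/bash
# Hypothetical helper: map a system name to the Open MPI modulefile
# directory quoted in the instructions above.
openmpi_modulepath() {
    case "$1" in
        perlmutter) echo "/global/common/software/m3169/perlmutter/modulefiles" ;;
        cori)       echo "/global/common/software/m3169/cori/modulefiles" ;;
        *)          echo "unknown system: $1" >&2; return 1 ;;
    esac
}

# Assumption: NERSC_HOST names the current system. The block is skipped
# when the variable is unset or the `module` command is unavailable.
if command -v module >/dev/null 2>&1 && [ -n "${NERSC_HOST:-}" ]; then
    module use "$(openmpi_modulepath "$NERSC_HOST")" && module load openmpi
fi
```

Sourcing this file from a shell startup script would load Open MPI on either system without editing the path by hand.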
On Cori, Open MPI is available for use with the PrgEnv-intel Cray programming environments. The module file will detect which compiler environment you have loaded and load the appropriately built Open MPI package. On Perlmutter, Open MPI is available for use with the PrgEnv-gnu programming environments.
On either system, the simplest way to compile your application when using Open MPI is via the MPI compiler wrappers, e.g.
mpicc -o my_c_exec my_c_prog.c
mpif90 -o my_f90_exec my_f90_prog.f90
Pass extra compiler options to the back-end compiler just as you would if you were invoking that compiler directly (not through the Cray wrappers). Note that by default the Open MPI compiler wrappers build dynamic executables.
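To see how extra options reach the back-end compiler, the Open MPI wrappers accept a --showme option that prints the underlying command line instead of running it. The sketch below is illustrative only: the flags -O2 and -g are arbitrary examples, and the variable name EXTRA_FLAGS is an assumption of this example.

```shell
#!/bin/bash
# Illustrative flags only; any option the back-end compiler accepts can go here.
EXTRA_FLAGS="-O2 -g"

if command -v mpicc >/dev/null 2>&1 && [ -f my_c_prog.c ]; then
    # --showme prints the underlying compiler command line without running it.
    mpicc --showme $EXTRA_FLAGS -o my_c_exec my_c_prog.c
    # The real build uses the same flags, which the wrapper passes to the back-end compiler.
    mpicc $EXTRA_FLAGS -o my_c_exec my_c_prog.c
else
    echo "mpicc or my_c_prog.c not found; load the openmpi module and run from the source directory" >&2
fi
```

Comparing the --showme output with and without the extra flags shows exactly what the wrapper adds on top of your own options.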
There are two ways to launch applications compiled against Open MPI on Cori. You can use either the mpirun job launcher supplied with Open MPI or Slurm's srun job launcher, e.g.
salloc -N 5 --ntasks-per-node=32 -C haswell
srun -n 160 ./my_c_exec

salloc -N 5 --ntasks-per-node=32 -C haswell
mpirun -np 160 ./my_c_exec
If you wish to use srun, use the same srun options as you would if your application were compiled and linked against the vendor's MPI implementation.
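In the examples above, the rank count given to srun -n and mpirun -np is the product of the node count and the tasks per node (5 × 32 = 160). A small sketch of that arithmetic, with variable names chosen for this example only:

```shell
#!/bin/bash
# Values taken from the salloc examples above.
NODES=5
TASKS_PER_NODE=32

# Total MPI ranks = nodes * tasks per node.
NTASKS=$((NODES * TASKS_PER_NODE))

# Print the launch lines rather than executing them, so this runs anywhere.
echo "srun -n ${NTASKS} ./my_c_exec"
echo "mpirun -np ${NTASKS} ./my_c_exec"
```

Deriving the count this way keeps the launcher argument consistent with the allocation when you change the node count.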
On Perlmutter, only the mpirun method is available for launching applications compiled using the Open MPI compiler wrappers. (See also our running jobs example for Open MPI.)
See the mpirun man page for more details about command line options. Running mpirun --help also lists the available command line options.
Note that if you wish to use MPI dynamic process functionality such as MPI_Comm_spawn, you must use mpirun to launch the application.
For Cori, the Open MPI package has been built so that it can be used on both the Haswell and KNL partitions.
Using Java Applications with Open MPI
Open MPI supports a Java interface. Note that this interface has not been standardized by the MPI Forum. Information on how to use Open MPI's Java interface is available in the Open MPI Java FAQ. You will need to load the java module to use Open MPI's Java interface.
Using Open MPI with OFI libfabric on Cori
The Open MPI installed on Cori is configured to optionally use the OFI libfabric transport. By default Open MPI's native uGNI interface to the Aries network is used. If you would like to work with the OFI libfabric transport instead, the following environment variable needs to be set:
Note that the OFI libfabric GNI provider targets functionality over performance.
As long as Perlmutter uses HPE Slingshot 10, Open MPI will be configured to use UCX for its underlying communication on Perlmutter.
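One way to check which transports a given Open MPI build includes is to filter the output of the ompi_info utility, which lists the MCA components compiled into the installation. The sketch below is a suggestion rather than a NERSC-documented procedure; the helper name list_transports is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: keep only the ompi_info lines that describe the
# pml, mtl, and btl component frameworks (where UCX, OFI, and uGNI
# support would appear if built in).
list_transports() {
    grep -E 'MCA (pml|mtl|btl):'
}

if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | list_transports || true
else
    echo "ompi_info not found; load the openmpi module first" >&2
fi
```

On a build configured with UCX, you would expect to see a pml line naming ucx in the filtered output.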
UCX special considerations on Perlmutter
UCX has some problems when used out of the box on Perlmutter. We currently recommend that users restrict Open MPI to one network adapter per node by setting the following environment variable:
Also, if your application uses MPI_Comm_spawn or related MPI dynamic process calls, the UCX transport layer needs to be set explicitly. Currently it is recommended to set this environment variable as follows: