Darshan I/O profiler

Darshan is an open-source, lightweight I/O profiler developed by Argonne National Laboratory (ANL). It collects I/O statistics from several widely used HPC I/O frameworks, such as MPI-IO, HDF5, and PnetCDF, as well as from standard POSIX calls. We use Darshan at NERSC to examine file system utilization and to advise users on improving the I/O performance of their applications.

Darshan is automatically loaded as a module on Cori for all users, and is included at link time into users' applications via the Cray compiler wrappers (cc, CC, ftn) (see the related page in the docs for more details on compilers on Cori).

To check whether your dynamically linked application was built with darshan instrumentation, run ldd and look for darshan in the output:

$ ldd your-application |grep darshan
    libdarshan.so => /usr/common/software/darshan/3.2.1/lib/libdarshan.so

For statically built applications you can list the symbols contained in your executable with nm.
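A minimal sketch of that check, assuming nm (from binutils) is on your PATH and that the instrumented executable still carries symbols whose names contain "darshan"; a stripped binary will show nothing. The function name has_darshan_symbols is ours and is not part of darshan:

```shell
# Hypothetical helper: succeed if the executable's symbol table mentions darshan.
has_darshan_symbols() {
    # nm lists the symbols; grep -qi looks for "darshan" case-insensitively
    # and reports the result only through its exit status.
    nm "$1" 2>/dev/null | grep -qi darshan
}

# Usage (your-application is a placeholder):
#   has_darshan_symbols ./your-application && echo "darshan symbols found"
```

If the function fails for a binary you expected to be instrumented, check that the darshan module was loaded when the application was linked.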

Tip

The default darshan/3.2.1 module loaded on Cori only instruments POSIX and MPI-IO calls, but we also provide darshan/3.2.1-hdf5, which can be used to instrument applications using HDF5, and can be swapped for the default darshan module with:

module swap darshan/3.2.1-hdf5

Opting out of darshan

Should darshan cause you any issue, you can disable it by unloading the darshan module and rebuilding your application. We believe darshan to be stable for most applications at NERSC, but users should contact us if they experience any problems, via the online help desk.

Enabling darshan at runtime

Darshan is automatically injected into users' applications at compile time on Cori, but it can also be enabled at runtime. This option is available only for dynamically linked executables, for example applications built before darshan went into production, applications built without the Cray compiler wrappers (e.g. with the Nvidia compilers for external Cori architectures), or applications written in interpreted languages (e.g. Python). It may also be useful for applications not built on Cori, such as executables on CVMFS or other pre-compiled binaries.

You can enable darshan by setting the LD_PRELOAD variable for your application, for example:

LD_PRELOAD=/usr/common/software/darshan/3.2.1/lib/libdarshan.so your-application-here

If you want to instrument HDF5 code, substitute /3.2.1/ with /3.2.1-hdf5/ in the darshan path above.

Do not export LD_PRELOAD globally

Exporting LD_PRELOAD in your session instruments every application you execute, which may disrupt your workflow and put extra load on the file system where the darshan logs are stored.
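One way to keep the preload scoped to a single command is a small wrapper function. This is only a sketch: the name with_darshan and the DARSHAN_LIB variable are ours, and the default path is the one used in the examples on this page.

```shell
# Path taken from the examples on this page; set DARSHAN_LIB to override it.
DARSHAN_LIB="${DARSHAN_LIB:-/usr/common/software/darshan/3.2.1/lib/libdarshan.so}"

# Hypothetical helper: run one command with darshan preloaded.
# env sets LD_PRELOAD only for the child process, so the calling shell
# (and everything else you run from it) stays uninstrumented.
with_darshan() {
    env LD_PRELOAD="$DARSHAN_LIB" "$@"
}

# Usage (your-application-here is a placeholder):
#   with_darshan your-application-here
```

Because the variable is never exported, other commands in the same session are unaffected.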

To instrument a code you execute through srun, export the LD_PRELOAD variable only to the application being launched by srun, to avoid instrumenting srun internal calls:

srun --export=ALL,LD_PRELOAD=/usr/common/software/darshan/3.2.1/lib/libdarshan.so your-application-here

Warning

The ALL token in srun --export=ALL,LD_PRELOAD=... is required to instruct SLURM to add LD_PRELOAD to the existing environment variables; not specifying ALL may cause your application to crash because some required environment variables are missing. See man srun for more information and details.

Warning

Darshan doesn't interact correctly with multiple Python processes spawned via multiprocessing, because of how Python clones processes internally. Related bug tracker.

Instrumenting non-MPI code

Darshan can also instrument non-MPI code. To enable this feature, set the environment variable DARSHAN_ENABLE_NONMPI to any value:

DARSHAN_ENABLE_NONMPI=1 LD_PRELOAD=/usr/common/software/darshan/3.2.1/lib/libdarshan.so your-application-here

Producing reports

The darshan modules save the data they collect to a shared directory, organized by date, username, application name, etc. according to the following "mask":

/global/cscratch1/sd/darshanlogs/${YEAR}/${MONTH}/${DAY}/${USER}_${APPLICATION-NAME}_${JOB-ID}_${TIME}.darshan

This means you should be able to "find" the logs of your applications by searching for the day your application was running and your NERSC username.
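As a sketch, the mask above can be turned into a glob for one user and one day. The helper name darshan_log_glob is ours, and we assume that the month and day directories carry no leading zero (e.g. 2021/3/5); list the directory to confirm the layout before relying on this.

```shell
# Root of the shared log directory, as given in the mask above.
DARSHAN_LOG_ROOT=/global/cscratch1/sd/darshanlogs

# Hypothetical helper: print the glob matching one user's logs for one day.
# Arguments: YEAR MONTH DAY [USERNAME]; USERNAME defaults to $USER.
# Assumption: month and day directories are not zero-padded, so one leading
# zero is stripped from each.
darshan_log_glob() (
    year=$1 month=${2#0} day=${3#0} user=${4:-$USER}
    printf '%s/%s/%s/%s/%s_*.darshan\n' \
        "$DARSHAN_LOG_ROOT" "$year" "$month" "$day" "$user"
)

# Usage (left unquoted on purpose, so the shell expands the glob):
#   ls -lt $(darshan_log_glob 2021 03 05)
```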

Darshan log files can be processed to produce a plain text or PDF report with insights into your application's I/O behavior.

For example, if the environment variable $LOGFILE holds the path to a compressed darshan log file, you can parse it with:

darshan-parser $LOGFILE

The output can be quite long if the application has accessed several files during a long run: redirect the output to a file (e.g. > $PARSED_LOGFILE) or pipe it to other commands for better reading (e.g. | less).
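The parsed text is also easy to filter with standard tools. The sketch below sums one counter over all records. It assumes the Darshan 3.x column layout (module, rank, record id, counter name, value, file name, ...) and uses POSIX_BYTES_WRITTEN as the example counter; check the header comments that darshan-parser prints for your version. The function name sum_counter is ours.

```shell
# Hypothetical helper: read darshan-parser output on stdin and sum one counter.
# Assumed columns: 1=module 2=rank 3=record id 4=counter 5=value 6=file name ...
sum_counter() {
    awk -v counter="$1" '
        /^#/ { next }                      # skip header and comment lines
        $4 == counter { total += $5 }      # accumulate the matching values
        END { printf "%.0f\n", total }     # prints 0 when nothing matched
    '
}

# Usage:
#   darshan-parser "$LOGFILE" | sum_counter POSIX_BYTES_WRITTEN
```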

Excessive computing on login nodes harms other users

Please submit a job or use the interactive queue if you plan to parse several logfiles, because parsing them on a login node may degrade other users' experience and workflows.
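One way to follow this advice is to wrap the parsing in a small loop and call it from a batch job or an interactive session on a compute node. The sketch below is illustrative only: the function name parse_all_logs and the DARSHAN_PARSER override are ours, and the directories in the usage line are placeholders.

```shell
# Hypothetical helper: parse every *.darshan file in a directory to text.
# Arguments: LOG_DIR OUT_DIR. DARSHAN_PARSER may name another parser binary;
# it defaults to darshan-parser.
parse_all_logs() (
    log_dir=$1 out_dir=$2
    mkdir -p "$out_dir"
    for log in "$log_dir"/*.darshan; do
        [ -e "$log" ] || continue          # the glob matched nothing
        name=$(basename "$log" .darshan)
        "${DARSHAN_PARSER:-darshan-parser}" "$log" > "$out_dir/$name.txt"
    done
)

# Usage, from a job script or an interactive session:
#   parse_all_logs /path/to/my/logs "$SCRATCH/parsed-logs"
```

Each log is written to its own text file, so the results can be inspected later without re-running the parser.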

To produce a PDF report, first load the texlive module, then run darshan-job-summary.pl as follows:

module load texlive
darshan-job-summary.pl $LOGFILE

You can choose the output file with --output /path/to/output.pdf; otherwise the report is saved in the current directory, named after the input darshan log file with a .pdf suffix.

Here's an example of a report produced by darshan for an MPI application: it shows many details of how your application accesses and uses the file system, including several plots.

Difference between darshan-parser text output and PDF report

The PDF report does not contain everything that can be extracted with the darshan-parser tool, but new darshan releases may improve the PDF report produced, see e.g. this thread.

Build options

To build darshan 3.2.1 on Cori, these scripts were used.

In particular, the PrgEnv-gnu and craype-haswell modules are used because the gnu compiler produces a more "compact" darshan library with fewer dependencies, which can be used to instrument applications built against many combinations of compilers and MPI frameworks.

The MPI framework used to build darshan is the Cray-optimized MPICH, automatically provided by the cc compiler wrapper, so all users' applications built against MPICH or MVAPICH should work fine. Users who build their applications against Open MPI or its derivatives (Spectrum MPI, etc.) may need to disable darshan, or build their own version if they want to instrument their code.

To instrument non-MPI programs, disable MPI at compile-time with --without-mpi. Darshan 3.2.0 and 3.2.1 contain a bug that breaks compilation when --without-mpi is requested, because it tries to use some MPI functions and variables. Use a more recent release (if available) or build from source:

git clone https://xgitlab.cels.anl.gov/darshan/darshan.git darshan-git
cd darshan-git/darshan-runtime
./configure --without-mpi --prefix=...

Darshan can also instrument PnetCDF I/O calls; to enable this mode, load one of the cray-parallel-netcdf modules available on Cori, then add --enable-pnetcdf-mod=${PNETCDF_DIR} to the configure command.

HDF5-aware darshan build

The HDF5-aware darshan release was built with cray-hdf5-parallel/1.10.5.2. HDF5 1.10 introduced ABI changes that are not compatible with HDF5 1.8 or lower, but only HDF5 1.10 or higher is currently available on Cori, so you should not experience issues if you use any of the available HDF5 modules.

If your application was built against HDF5 1.8 or lower, you cannot rebuild it against a newer HDF5 release, and you want to instrument your code with darshan, you need to rebuild darshan against the HDF5 release you're using: feel free to use the scripts above or contact us for support.

A caveat of building an application with the HDF5-capable darshan is that the HDF5 library will always appear among the libraries the application looks up, even when it contains no HDF5 code. The operating system will therefore always load the library at execution, but apart from a minor slowdown while the library is retrieved, your application should work normally.

Known issues

  1. If you build an application with gcc and HDF5, and you load the darshan built with HDF5, the linker may complain with the following message:

    /usr/bin/ld: warning: libhdf5_parallel_gnu_82.so.103, needed by /usr/common/software/darshan/3.2.1-hdf5/lib/libdarshan.so, may conflict with libhdf5_parallel_gnu_82.so.200
    

    This is caused by a difference between the loaded cray-hdf5-parallel module and the module used to build darshan: this warning does not have an impact on your application and can be ignored, since the ABI used by darshan to instrument I/O calls should be the same for all the cray-hdf5 modules available at Cori.

  2. While the hdf5-aware darshan module usually works fine for compiled applications, it may produce some incompatibility warnings when used with interpreted programs, for example with Python environments using conda to provide an external HDF5 build. This is probably caused by h5py trying to load the HDF5 dependencies it was built upon directly, instead of using those provided in the LD_PRELOAD variable. This causes the following warning message to appear:

    $ source /usr/common/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh  # load conda in the current env
    $ conda create -y --prefix $HOME/conda/testdarshan/ python=3.8 h5py hdf5=1.10.6
    
    $ $HOME/conda/testdarshan/bin/python -c 'import h5py; print(h5py.version.hdf5_version)'
    1.10.6
    
    $ LD_PRELOAD=/usr/common/software/darshan/3.2.1-hdf5/lib/libdarshan.so $HOME/conda/testdarshan/bin/python -c 'import h5py; print(h5py.version.hdf5_version)'
    $HOME/conda/testdarshan/lib/python3.8/site-packages/h5py/__init__.py:37: UserWarning: h5py is running against HDF5 1.10.5 when it was built against 1.10.6, this may cause problems
    Warning! ***HDF5 library version mismatched error***
    The HDF5 header files used to compile this application do not match the version used by the HDF5 library to which this application is linked.
    Data corruption or segmentation faults may occur if the application continues.
    This can happen when an application was compiled by one version of HDF5 but linked with a different version of static or shared HDF5 library.
    You should recompile the application or check your shared library related settings such as 'LD_LIBRARY_PATH'.
    You can, at your own risk, disable this warning by setting the environment variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
    Setting it to 2 or higher will suppress the warning messages totally.
    Headers are 1.10.6, library is 1.10.5
    
    $ HDF5_DISABLE_VERSION_CHECK=2 LD_PRELOAD=/usr/common/software/darshan/3.2.1-hdf5/lib/libdarshan.so $HOME/conda/testdarshan/bin/python -c 'import h5py; print(h5py.version.hdf5_version)'
    1.10.5
    

    Setting the variable HDF5_DISABLE_VERSION_CHECK to 2 or higher suppresses the warning, but h5py then appears to use the HDF5 library that darshan was compiled with (1.10.5 in the example above) instead of the HDF5 library installed with conda.

    Please refer to the section Build options above to build your own darshan release on top of the HDF5 library you installed with conda. In particular, when you configure darshan you need to pass the HDF5 path, which is the prefix you used when you created the conda environment; in the example above it would be: --enable-hdf5-mod=$HOME/conda/testdarshan/

    You can then use your own darshan to instrument your python code.

  3. When instrumenting interpreted languages (e.g. Python), you may get errors like undefined symbol: H5get_libversion. Adding the HDF5 library to LD_PRELOAD explicitly fixes this error, for example:

    LD_PRELOAD=/usr/common/software/darshan/3.2.1/lib/libdarshan.so:/your/libhdf5.so your-application-here
    

    The same applies when LD_PRELOAD is passed to srun via --export.

  4. The HDF5-aware darshan library provided was built with the MPICH provided by the Cray compiler wrapper, and may cause some applications to break with the following message: Attempting to use an MPI routine before initializing MPICH

    Consider building your own version of darshan as shown in the section above, adding --without-mpi.

  5. Darshan aggregates the data collected during an MPI run only when MPI_Finalize() is called inside the application. Applications that never call MPI_Finalize(), or that crash during execution, therefore won't have their data collected. A fix for this issue is currently being developed.

  6. Applications built with darshan are usually less portable than those built without it, because the library loader tries to load libdarshan.so at every execution. You can opt out of darshan to make your application more portable.