Darshan 

Description

Darshan is a lightweight, scalable I/O profiling tool. It collects profile information for POSIX, HDF5, NetCDF and MPI-IO calls, and the resulting profile data can be used to investigate and tune the I/O behavior of MPI applications.

Darshan can be used only with MPI applications. The application, at minimum, must call MPI_Init and MPI_Finalize. 

How to use Darshan

Enabling Darshan: Darshan is now enabled by default.

Any code linked while the darshan module is loaded will have Darshan profiling enabled.

Conflicting Modules: Any software that uses the PMPI interface cannot be used alongside Darshan. For example, CrayPAT and Darshan cannot both be loaded at the same time. The module system reports this conflict to the user when it occurs.

Using Darshan with statically linked applications:

No special changes are needed to compile and run Darshan-enabled code. Simply load the darshan module and build the code as usual.

Using Darshan with dynamically linked applications:

Applications that link dynamically require an additional step for linking with Darshan. The LD_PRELOAD environment variable must be set both at compile time and in the PBS script, using export LD_PRELOAD=/sw/xe/darshan/3.1.3/darshan-3.1.3/lib/libdarshan.so
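As a rough sketch, the preload step could be wrapped in a small shell helper that is used both in the build shell and in the PBS script. The function name enable_darshan_preload and the DARSHAN_LIB variable are illustrative only and are not part of Darshan; the library path is the one quoted above and may differ on other installations.

```shell
# Darshan shared library; path as quoted in this document (assumed unchanged).
DARSHAN_LIB="${DARSHAN_LIB:-/sw/xe/darshan/3.1.3/darshan-3.1.3/lib/libdarshan.so}"

# Hypothetical helper: preload Darshan only if the library is readable.
enable_darshan_preload() {
    if [ ! -r "$DARSHAN_LIB" ]; then
        echo "warning: $DARSHAN_LIB is not readable; Darshan not preloaded" >&2
        return 1
    fi
    # Keep any libraries that are already being preloaded.
    export LD_PRELOAD="$DARSHAN_LIB${LD_PRELOAD:+:$LD_PRELOAD}"
}
```

Call enable_darshan_preload before building, and again in the PBS script before the application is launched.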

Switching Programming Environments: The Darshan module is automatically enabled after switching Programming Environments.

Obtaining Darshan Profile Data:

Each time a Darshan-instrumented application is executed, it generates a single log file summarizing the I/O activity of that application. The log files are collected at a central location. The log file for a given application is named in the following format:

<USERNAME>_<BINARY_NAME>_<JOB_ID>_<DATE>_<UNIQUE_ID>_<TIMING>.darshan.gz

To obtain a copy of your job's Darshan profile data, add the following directive to the job submission command:

-lgres=darshan

After the job completes successfully, the Darshan profile data is copied to the directory from which the batch job was submitted.

Darshan profile data is normally written to the file system after the MPI_Finalize call. If the job does not complete successfully, no Darshan profile data is written. This includes jobs that were terminated because their requested time expired.
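Once a job has finished, the copied log can be picked out of the submission directory by matching the file name format shown above. The following is a minimal sketch: find_darshan_logs is a hypothetical helper, not a Darshan utility, and it assumes the job ID appears in the file name exactly as it is passed to the function.

```shell
# Hypothetical helper: list Darshan logs for one user and one job ID.
# Usage: find_darshan_logs <directory> <username> <job_id>
find_darshan_logs() {
    # Pattern follows <USERNAME>_<BINARY_NAME>_<JOB_ID>_..._.darshan.gz
    for f in "$1"/"$2"_*_"$3"_*.darshan.gz; do
        # An unmatched glob stays literal, so keep only files that exist.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Example call from the submission directory (job ID is a placeholder):
#   find_darshan_logs "$PWD" "$USER" 1234567
```

If nothing is printed, either the job did not finish cleanly or one of the conditions listed under "No profile file generated" applies.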

Using Darshan Profile Data:

The following utilities can be used to analyze Darshan profile data:

darshan-job-summary.pl - generates a graphical summary in a PDF file

darshan-summary-per-file.sh - generates a separate PDF summary file for every file accessed by the application

darshan-parser - generates a readable text file of all the information contained in the log file

Detailed information on darshan-parser output is available in the Darshan utilities documentation.

These utilities can be used from any of the login nodes. They will not work on compute or service nodes. 
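A typical analysis session on a login node might look like the sketch below. summarize_darshan_log is a hypothetical wrapper, not part of Darshan. It assumes that each utility takes the log file as its only argument, that darshan-job-summary.pl writes its PDF without further options, and that darshan-parser prints its report to standard output.

```shell
# Hypothetical wrapper: produce a PDF summary and a text dump for one log.
# Usage: summarize_darshan_log <log.darshan.gz>
summarize_darshan_log() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "error: cannot read $log" >&2
        return 1
    fi
    # The utilities are only expected to work on login nodes.
    for tool in darshan-job-summary.pl darshan-parser; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "error: $tool not found; run this on a login node" >&2
            return 2
        fi
    done
    # Graphical summary (PDF).
    darshan-job-summary.pl "$log" || return 3
    # Full text dump, saved next to the log file.
    darshan-parser "$log" > "${log%.darshan.gz}.txt"
}
```

The text dump is usually the better starting point when a specific counter needs to be inspected, while the PDF gives a quick overview of the whole job.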

Disabling Darshan

To disable the darshan module in your current shell:

$ module unload darshan

No profile file generated

  • Application is not an MPI application – need MPI_Init & MPI_Finalize
  • Application did not exit cleanly – MPI_Finalize was never called, application killed by scheduler etc.
  • Environment variable DARSHAN_DISABLE is set
  • Application not linked with Darshan libraries – check with ‘nm <binary> | grep -i darshan’
  • Darshan module not loaded in the job script
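Two of the checks above can be scripted. The sketch below is illustrative only: check_darshan_setup is a hypothetical helper, and the symbol check is meaningful only for statically linked, unstripped binaries, since a dynamically linked application picks Darshan up through LD_PRELOAD at run time. Run it in the same environment as the job so that DARSHAN_DISABLE reflects what the application will see.

```shell
# Hypothetical helper: report common reasons for a missing Darshan log.
# Usage: check_darshan_setup <path-to-binary>
check_darshan_setup() {
    bin=$1
    rc=0
    # ${VAR+x} expands to "x" whenever VAR is set, even if it is empty.
    if [ -n "${DARSHAN_DISABLE+x}" ]; then
        echo "DARSHAN_DISABLE is set; unset it to re-enable profiling"
        rc=1
    fi
    # Look for Darshan symbols in the binary (static linking only).
    if ! nm "$bin" 2>/dev/null | grep -qi darshan; then
        echo "no Darshan symbols found in $bin"
        rc=1
    fi
    return $rc
}
```

A return value of 0 means neither problem was detected; it does not rule out the other causes in the list, such as the application exiting before MPI_Finalize.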

Known Issues

  • Exec calls will hang when the code is compiled with darshan loaded. Unload darshan and rebuild the code.
  • MPI applications that use fork calls will hang when the code is compiled with darshan loaded. Unload darshan and rebuild the code.

Additional Information / References