Several tools are available to users for application profiling and performance analysis. The collection of tools from Cray in CPMAT provides comprehensive application performance analysis on Blue Waters. CPMAT relies on application-level instrumentation, with both counting-based and sampling-based profiling, for traditional CPU-based features, MPI, the Gemini interconnect, and the NVIDIA GPUs. Some of CPMAT's functionality depends on collecting hardware counter data using the popular PAPI library. Two other tools are also available that provide complementary collection and analysis: NCSA's PerfSuite and TAU. PerfSuite provides a lightweight method for counter data collection and code profiling at the source code level. TAU is a comprehensive code-profiling tool that covers functionality similar to CPMAT but has additional features useful for profiling.

For additional information please see the discussions on:

For specific topics on profiling please see the following:

The MPI library itself has a built-in tool for reporting memory usage. Run "man mpi" on the system, type "/" to search for "report", and press "n" to step to the next match. You'll find this section on how to invoke MPICH's memory reporting function, which is controlled by the MPICH_MEMORY_REPORT environment variable:

               If set to 1, print a summary of the min/max high water mark
               and associated rank to stderr.

               If set to 2, output each rank's high water mark to a file as
               specified using MPICH_MEMORY_REPORT_FILE.

               If set to 3, do both 1 and 2.


If you're having memory issues, particularly if your application is being killed by the OOM ("Out Of Memory") killer, this will help you track down how much memory your code is actually using.
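
The settings quoted from the man page can be sketched as a job-script fragment. This is a minimal illustration, not a definitive recipe. It assumes the variable the excerpt describes is MPICH_MEMORY_REPORT (as in Cray's MPICH man page) and that jobs are launched with aprun. The executable name my_app, the rank count, and the report file name mem_report are hypothetical placeholders.

```shell
#!/bin/bash
# Sketch of a job-script fragment. my_app, the rank count, and
# mem_report are hypothetical placeholders, not names from the system.

# 3 = print the min/max high water mark summary to stderr AND
#     write each rank's high water mark to a file (levels 1 and 2 combined).
export MPICH_MEMORY_REPORT=3

# Target for the per-rank output used by levels 2 and 3.
# See "man mpi" for exactly how this name is applied.
export MPICH_MEMORY_REPORT_FILE="mem_report"

# Launch only if aprun and the placeholder executable are present;
# otherwise just show the settings that would be used.
if command -v aprun >/dev/null 2>&1 && [ -x ./my_app ]; then
    aprun -n 32 ./my_app
else
    echo "MPICH_MEMORY_REPORT=${MPICH_MEMORY_REPORT} MPICH_MEMORY_REPORT_FILE=${MPICH_MEMORY_REPORT_FILE}"
fi
```

Setting the variable to 1 is the least intrusive starting point, since it only prints a summary to stderr. Levels 2 and 3 are more useful when one rank is suspected of using far more memory than the others.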