The Performance API (PAPI) project specifies a standard application programming interface (API) for accessing hardware performance counters available on most modern microprocessors. CrayPat uses PAPI to interface to the Cray system hardware; therefore the module xt-papi is normally loaded as part of the perftools module. The interface between PAPI and CrayPat is normally transparent to the user. However, advanced users may want to bypass CrayPat and work with PAPI directly. In this case, you must unload the perftools module and then reload only the xt-papi module.
% module unload perftools
% module load xt-papi
To see what PAPI can capture on your system, i.e., the PAPI events and the AMD native event names, run the papi_avail and papi_native_avail commands in your batch script on a compute node.
module load perftools
aprun -n 1 papi_avail
aprun -n 1 papi_native_avail
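The listing from papi_avail can be long, so it may help to keep only the events the hardware reports as available. The sketch below assumes the usual papi_avail column layout (event name first, availability flag third); the helper name available_events and the two sample lines are illustrative only and are not output captured from a real system.

```shell
# Hypothetical helper: print the names of events whose "Avail" column is "Yes".
# Assumes the event name is in column 1 and the availability flag in column 3.
available_events() {
    awk '$1 ~ /^PAPI_/ && $3 == "Yes" { print $1 }'
}

# Illustrative sample lines, standing in for real papi_avail output.
sample='PAPI_L1_DCM  0x80000000  Yes   No   Level 1 data cache misses
PAPI_L1_ICM  0x80000001  No    No   Level 1 instruction cache misses'

printf '%s\n' "$sample" | available_events
```

In a batch script on a compute node, the same filter would be applied as: aprun -n 1 papi_avail | available_events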
By default, no hardware performance counter events are monitored during sampling and tracing. You must explicitly specify the event names using the environment variable PAT_RT_HWPC in the batch script, before the aprun command. A hwcgrp number can be used in place of the list of event names to select a predefined counter group. The valid hwcgrp numbers are listed in the hwpc man page.
export PAT_RT_HWPC=1 # PAPI_L1_DCM, PAPI_TLB_DM, PAPI_L1_DCA, PAPI_FP_OPS
Hardware performance counter events:
PAPI_L1_DCM : Level 1 data cache misses
PAPI_TLB_DM : Data translation lookaside buffer misses
PAPI_L1_DCA : Level 1 data cache accesses
PAPI_FP_OPS : Floating point operations
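Instead of a group number, the event names themselves can be given in PAT_RT_HWPC. The following is a minimal sketch, assuming the names are passed as a comma-separated list (check the hwpc man page on your system for the exact syntax); the executable name ./my_instrumented_app is hypothetical.

```shell
# Events taken from the list above; the comma separator is an assumption.
events="PAPI_L1_DCM PAPI_TLB_DM PAPI_L1_DCA PAPI_FP_OPS"

# Join the space-separated names with commas and export the result.
PAT_RT_HWPC=$(printf '%s' "$events" | tr ' ' ',')
export PAT_RT_HWPC
echo "PAT_RT_HWPC=$PAT_RT_HWPC"

# Launch only where aprun and the (hypothetical) instrumented binary exist,
# so the fragment is harmless when tried outside a batch job.
if command -v aprun >/dev/null 2>&1 && [ -x ./my_instrumented_app ]; then
    aprun -n 1 ./my_instrumented_app
fi
```

As in the group-number example, the export must come before the aprun command in the batch script so that the setting is in effect when the program starts.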