Developing and Tuning Communication-Avoiding Numerical Algorithms
Parallel algorithms with lower communication and synchronization costs are critical for deploying scientific applications on current and next-generation HPC hardware. Numerical libraries provide scalable algorithms that serve many user applications. To debug and tune such algorithms and libraries, we require a modest amount of compute time on extreme-scale computational resources.
Access to Blue Waters would allow us to tune communication-avoiding algorithms for matrix and tensor computations. We will evaluate algorithms for computing the QR, eigenvalue, and singular value decompositions of dense matrices. Additionally, we would like to tune applications and benchmarks built on the Cyclops Tensor Framework (CTF), a massively parallel library for algebra on dense and sparse tensors (multidimensional arrays). These include CTF codes that have already demonstrated petascale performance, such as the Aquarius electronic structure code, as well as emerging CTF applications: algebraic multigrid, spectral element methods, graph betweenness centrality, and tensor factorizations.