GPU-Accelerated Sparse Factorization in Implicit Finite Element Method on Petascale Architecture
Seid Koric, University of Illinois at Urbana-Champaign
Seid Koric, Anshul Gupta, Ahmed Taha, Mariano Vazquez, Erman Guleryuz, Madhu Vellakal, Fereshteh Sabet, Steven Rennich, Natalia Gimelshein, Ashraf Idkaidek, Alfonso Santiago, Guillaume Houzeaux, Antoni Artigues, Eva Casoni, Daniel Mira, Herbert Owen Coppola, Paula Cordoba Panella
In August 2014, Nvidia started working with Dr. Anshul Gupta, the principal developer of the Watson Sparse Matrix Package (WSMP), on porting and optimizing WSMP for Nvidia GPUs. During the direct solution of a linear system, a small number of BLAS routines account for a major portion of WSMP's computation, with the exact share depending on the system being solved. WSMP's calls to level-3 BLAS routines are intercepted and accelerated on the GPU. The new accelerated WSMP library, ACCEL_WSMP, was recently beta-ported to the XK7 nodes of the Titan system.
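The interception idea can be illustrated with a minimal sketch. It is not the ACCEL_WSMP implementation: the class name `GemmDispatcher`, the flop-count threshold, and the rule for choosing a backend are all assumptions made for illustration, and the "accelerated" kernel is a CPU stand-in so that the example runs without a GPU. It shows only the general pattern of a wrapper that sits between the solver and a level-3 kernel (here a dense matrix-matrix product) and routes large calls to a different backend.

```python
def gemm_cpu(A, B):
    """Reference level-3 kernel: dense product of A (n x k) and B (k x m)."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


class GemmDispatcher:
    """Hypothetical interception layer (an assumption, not the WSMP API).

    Calls whose estimated cost reaches flop_threshold go to accel_kernel;
    smaller calls stay on cpu_kernel, where transfer overhead would
    otherwise dominate.
    """

    def __init__(self, cpu_kernel, accel_kernel, flop_threshold):
        self.cpu_kernel = cpu_kernel
        self.accel_kernel = accel_kernel
        self.flop_threshold = flop_threshold
        self.calls = {"cpu": 0, "accel": 0}

    def __call__(self, A, B):
        # A dense (n x k) by (k x m) product costs about 2*n*k*m flops.
        flops = 2 * len(A) * len(B) * len(B[0])
        if flops >= self.flop_threshold:
            self.calls["accel"] += 1
            return self.accel_kernel(A, B)
        self.calls["cpu"] += 1
        return self.cpu_kernel(A, B)


# Stand-in: the "accelerated" backend is the same CPU routine here.
gemm = GemmDispatcher(gemm_cpu, gemm_cpu, flop_threshold=100)

small = gemm([[1.0, 2.0]], [[3.0], [4.0]])      # 4 flops, stays on the CPU path
ones = [[1.0] * 4 for _ in range(4)]
large = gemm(ones, ones)                        # 128 flops, takes the accel path
```

Because the caller sees the same function signature on either path, the solver itself does not need to change, which is the appeal of intercepting at the BLAS boundary.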
In the proposed work, Dr. Koric will perform full-scale ACCEL_WSMP benchmarks using assembled global stiffness matrices and load vectors, ranging from 11 to 40 million unknowns, extracted from commercial and academic implicit FEA systems. Starting in January 2015, Dr. Gupta and Nvidia have generously agreed to work with Dr. Koric on tuning and optimizing the ACCEL_WSMP library for the XK7 portion of Blue Waters.
Two popular open-source FEA codes, FEAP and WARP3D, already use WSMP as a direct solver, so this work will give these codes access to GPUs. Prof. Masud, an Illinois allocation PI, has been developing a variational multiscale version of the FEAP code for many years, while WARP3D, originally developed at Illinois, has been used by an NCSA PSP company as an alternative to commercial FEA codes.
Dr. Koric and the developers of the Alya multiphysics code, which scaled to over 100,000 cores of Blue Waters in 2014, plan to implement WSMP and/or ACCEL_WSMP as a standalone solver or as a preconditioner for ill-conditioned problems. Any other current or future PRAC code will also have access to this direct solver library on Blue Waters CPUs and GPUs. We will continue to disseminate our research findings from this project through journal papers and presentations at professional conferences. Finally, this collaboration was initiated by Nvidia's upper management during a recent visit to NCSA, and it will further strengthen the business and strategic relationship between NCSA and Nvidia.