Thursday, June 29, 2017

Phi Architecture and Concepts

Presenter: Todd Evans, Texas Advanced Computing Center

Abstract: Intel’s Knights Landing (KNL) processor is designed to support a wide variety of workloads but specializes in providing parallel performance.  The new processor supports self-hosted nodes with 68 cores connected via a 2D mesh topology, 4 hardware threads per core, and two 512-bit-wide Vector Processing Units per core.  Applications designed to perform well on conventional multicore processors may need to be modified with efficient multi-threading and vectorization before they can take full advantage of the hardware.

This tutorial will provide practical information and advice to enable experienced OpenMP and MPI programmers to enhance applications on the KNL.  We’ll review the KNL’s architecture and discuss how the different MCDRAM memory and cluster configurations affect performance.  Recommendations regarding MPI and OpenMP task layout will be discussed.
We will focus on the use of reports and directives to guide optimization and implementation of efficient memory access and alignment.  We will also showcase the capabilities of Intel VTune Amplifier XE and Intel Advisor for detailed memory access analysis and parallel code profiling.  Optimization methods in multithreading will be covered in depth.
This session will include hands-on exercises on KNL systems.