For the past 20 years, high performance computing has
benefited from a significant reduction in the clock cycle
time of the basic processor. Going forward, trends indicate the
clock rate of the most powerful processors in the world may stay
the same or decrease slightly. When the clock rate decreases, the
chip runs at a slower speed. At the same time, the amount of
physical space that a computing core occupies is still trending
downward. This means more processing cores can be contained within the chip.
With this paradigm shift in chip technology, caused by the amount of electrical power required to run the device, additional performance is being delivered by increasing the number of processors on the chip and (re)introducing SIMD/vector processing. The goal is to deliver more floating-point operations per second per watt. Interestingly, these evolving chip technologies are being used on scientific systems as small as a single workstation and as large as the systems on the Top 500 list.
Within this book are techniques to
effectively utilize these new node architectures.
Efficient threading on the node, vectorization to
utilize the powerful SIMD units, and effective
memory management will be covered along with examples to allow the
typical application developer to apply them to their programs.
Performance-portable techniques will be shown that will run
efficiently on all HPC nodes.
The principal target systems will be Intel’s latest multicore Xeon system and the latest Intel Knights Landing (KNL) chip, with discussion of and comparison to the latest hybrid, accelerated systems using NVIDIA’s Pascal accelerator.