Abstract:
GPU-based clusters are being adopted at a rapid pace in HPC to perform compute-intensive tasks at large scale. One of the main performance challenges in these GPU cluster deployments is the performance and latency of communication between GPUs across the interconnect fabric. The goal of this session is to highlight interconnect optimizations, identified through MPI communication profiling, that provide higher performance and better utilization and allow GPU clusters to scale. We will also demonstrate, with a selection of HPC applications on an InfiniBand cluster, how technologies such as GPUDirect RDMA let GPUs communicate in a peer-to-peer fashion, completely bypassing the CPU subsystem, so that applications can perform and scale.
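To illustrate the kind of communication path the session discusses, here is a minimal sketch of sending a GPU buffer directly through MPI. It assumes a CUDA-aware MPI implementation (for example, Open MPI or MVAPICH2-GDR built with GPUDirect RDMA support) so that a device pointer can be passed to MPI without staging through host memory; buffer size and ranks are illustrative, not taken from the session.

/* Sketch: direct GPU-to-GPU transfer via a CUDA-aware MPI.
 * Assumes the MPI library was built with GPUDirect RDMA support,
 * so the NIC can read/write GPU memory without CPU staging copies. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;                 /* 1M floats per message (illustrative) */
    float *d_buf;
    cudaMalloc((void **)&d_buf, n * sizeof(float));

    if (rank == 0) {
        cudaMemset(d_buf, 0, n * sizeof(float));
        /* The device pointer is handed straight to MPI_Send; with
         * GPUDirect RDMA the transfer bypasses the CPU subsystem. */
        MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d floats into GPU memory\n", n);
    }

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}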
Topics:
HPC and Supercomputing, Performance Optimization, Computer-Aided Engineering
Event:
GTC Silicon Valley