GTC ON-DEMAND

 

HPC and Supercomputing
Accelerating HPL on Heterogeneous Clusters with NVIDIA GPUs
Abstract:
Learn about the design and use of a hybrid High-Performance Linpack (HPL) benchmark that measures the peak performance of heterogeneous clusters containing both GPU and non-GPU nodes. HPL remains the yardstick for ranking supercomputers around the world. Many clusters, at different scales, are deployed with only a subset of nodes equipped with NVIDIA GPU accelerators. Their true peak performance goes unreported for lack of a version of HPL that can take advantage of all the available CPU and GPU resources. We discuss a simple yet elegant approach: a fine-grained, weighted MPI process distribution that balances the load between CPU and GPU nodes. We also use techniques such as process reordering to minimize communication overhead. We evaluate our approach on a real-world cluster, Oakley at the Ohio Supercomputer Center. On a heterogeneous configuration with 32 GPU nodes and 192 non-GPU nodes, we achieve up to 50% of the combined theoretical peak and up to 80% of the combined actual peak performance of the GPU and non-GPU nodes.
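The idea of a weighted MPI process distribution can be illustrated with a minimal sketch. It is not the implementation presented in the talk: the function name `weighted_process_counts` and the per-node throughput figures are hypothetical, chosen only to show how faster nodes might receive proportionally more processes so that each process carries a similar share of the work. Only the node counts (32 GPU, 192 non-GPU) come from the abstract.

```python
from fractions import Fraction


def weighted_process_counts(node_gflops, procs_on_slowest=1):
    """Assign MPI processes per node class in proportion to node throughput.

    node_gflops: mapping of node class -> sustained GFLOP/s per node
                 (hypothetical values, not measurements from the talk).
    procs_on_slowest: processes placed on the slowest node class.
    """
    slowest = min(node_gflops.values())
    counts = {}
    for name, gflops in node_gflops.items():
        # Scale the process count by the throughput ratio to the slowest class.
        weight = Fraction(gflops) / Fraction(slowest)
        counts[name] = max(1, round(weight * procs_on_slowest))
    return counts


# Hypothetical per-node throughput: a GPU node sustains 4x a CPU-only node.
per_node = weighted_process_counts({"gpu": 800, "cpu": 200})

# Node counts from the abstract: 32 GPU nodes and 192 non-GPU nodes.
total_processes = 32 * per_node["gpu"] + 192 * per_node["cpu"]
print(per_node, total_processes)
```

With these assumed numbers, each GPU node hosts four processes and each CPU-only node hosts one, so every process is responsible for roughly the same amount of computation. A real deployment would also need to choose the process grid and rank ordering, which is where the process reordering mentioned in the abstract would come in.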
 
Topics: HPC and Supercomputing
Type: Talk
Event: GTC Silicon Valley
Year: 2014
Session ID: S4535