GTC ON-DEMAND

Abstract:
GPU-based clusters are being adopted at a rapid pace in HPC to perform compute-intensive tasks at large scale. One of the main performance challenges in deployments of these GPU clusters is the performance and latency of communication between GPUs across the interconnect fabric. The goal of this session is to highlight interconnect optimizations, identified through MPI communication profiling, that provide higher performance and better utilization and allow GPU clusters to scale. We will also demonstrate, with a selection of HPC applications on an InfiniBand cluster, how to utilize technologies such as GPUDirect RDMA to communicate directly in a peer-to-peer fashion, completely bypassing the CPU subsystem and allowing applications to perform and scale.
 
Topics:
HPC and Supercomputing, Performance Optimization, Computer-Aided Engineering
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6399
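Since the session above centers on MPI communication profiling, here is a minimal sketch, not taken from the talk, of one common approach using the standard PMPI profiling interface: MPI_Send is intercepted to count calls and bytes, and the per-rank totals are reported at MPI_Finalize.

/* Sketch: profile MPI_Send traffic via the PMPI interface. */
#include <mpi.h>
#include <stdio.h>

static long long send_calls = 0;
static long long send_bytes = 0;

/* User-provided MPI_Send that records traffic, then forwards to the real call. */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    int type_size;
    PMPI_Type_size(datatype, &type_size);
    send_calls++;
    send_bytes += (long long)count * type_size;
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

/* Report the totals when the application shuts MPI down. */
int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: %lld MPI_Send calls, %lld bytes\n", rank, send_calls, send_bytes);
    return PMPI_Finalize();
}

Linked into (or preloaded ahead of) the application, a wrapper like this shows which ranks and which exchanges dominate traffic on the fabric, which is where interconnect optimizations such as GPUDirect RDMA pay off.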
 
Abstract:
To demonstrate the application performance improvement from GPUDirect RDMA, we utilized a general-purpose GPU molecular dynamics simulation application called HOOMD-blue. The code was modified and tuned for GPUDirect RDMA and for a dual GPU/InfiniBand configuration, in order to achieve higher scalability than was possible on this energy-efficient cluster before the introduction of GPUDirect RDMA. The goal is to present the improvements seen in the application performance of HOOMD-blue, as well as to show best practices for properly configuring and running GPUDirect RDMA over both GPUs and the dual FDR InfiniBand hardware available on the Wilkes supercomputer.
 
Topics:
Performance Optimization, Computational Physics, Life & Material Science, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2015
Session ID:
S5169
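The session above is about configuring dual GPUs and dual InfiniBand rails per node; a minimal sketch of one common piece of such a setup follows, assuming a CUDA-aware MPI and two GPUs per node. Each rank selects its own GPU from the launcher's local-rank environment variable; the variable names used here depend on the MPI library and are an assumption, not the exact configuration shown in the talk.

/* Sketch: bind one MPI rank to one GPU on a dual-GPU node. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    /* Pick the GPU before MPI_Init so the CUDA-aware MPI library sets up
       its GPU resources against the device this rank will actually use.
       Local-rank variable names are launcher-specific (assumed here). */
    const char *lr = getenv("MV2_COMM_WORLD_LOCAL_RANK");
    if (lr == NULL) lr = getenv("OMPI_COMM_WORLD_LOCAL_RANK");
    int local_rank = lr ? atoi(lr) : 0;

    int ngpus = 0;
    cudaGetDeviceCount(&ngpus);
    int device = (ngpus > 0) ? local_rank % ngpus : 0;  /* 2 ranks/node -> GPUs 0 and 1 */
    if (ngpus > 0) cudaSetDevice(device);

    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d uses GPU %d of %d\n", rank, device, ngpus);
    MPI_Finalize();
    return 0;
}

Keeping each rank's traffic on its own GPU/HCA pair is what lets a dual-rail FDR configuration like the one on Wilkes scale instead of funneling everything through one adapter.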
 
Abstract:

GPU-based clusters are being adopted at a rapid pace in high-performance computing to perform compute-intensive tasks at large scale. One of the main performance challenges in deployments of these GPU-based clusters is the performance and latency of communication between GPUs across the interconnect fabric. The goal of this session is to highlight interconnect optimizations such as RDMA for GPUDirect, which provides higher performance and better utilization for GPU communication by allowing the network adapter and the GPU to communicate directly in a peer-to-peer fashion, completely bypassing the CPU subsystem. We will show the benefits of using this new technology and explain how registrants can utilize this new feature in their own compute clusters.
 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2013
Session ID:
S3504
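As an illustration of the peer-to-peer communication described above, here is a minimal sketch assuming a CUDA-aware MPI built with GPUDirect RDMA support: device pointers are passed straight to MPI_Send/MPI_Recv, so the data can move GPU to GPU over InfiniBand without being staged through host memory.

/* Sketch: exchange a GPU-resident buffer directly through MPI.
   Run with at least two ranks on a CUDA-aware MPI. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;                       /* 1M doubles per message */
    double *d_buf;
    cudaMalloc((void **)&d_buf, n * sizeof(double));
    cudaMemset(d_buf, 0, n * sizeof(double));

    if (rank == 0)
        /* Device pointer handed to MPI directly: no cudaMemcpy to the host. */
        MPI_Send(d_buf, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}

Whether the transfer actually bypasses the CPU depends on the MPI library being built and configured for GPUDirect RDMA; from the application's point of view the code stays the same.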
 
 