GTC ON-DEMAND
Abstract:
We'll introduce the fundamental concepts behind NVIDIA GPUDirect and explain how GPUDirect technologies are leveraged to scale out performance. GPUDirect technologies can provide even faster results for compute-intensive workloads, including those running on a new breed of dense, GPU-accelerated servers such as the Summit and Sierra supercomputers and the NVIDIA DGX line of servers.
 
Topics:
HPC and AI, Tools & Libraries
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9653
 
Abstract:
Hear about the latest developments in the NVIDIA GPUDirect family of technologies, which aim to improve both the data and the control paths among GPUs in combination with third-party devices. We'll introduce the fundamental concepts behind GPUDirect, present the latest developments, such as changes to pre-existing APIs and newly introduced APIs, and discuss the expected performance in combination with the new computing platforms that emerged last year.
 
Topics:
HPC and AI, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8474
 
Abstract:
Learn how to enable CUDA stream-synchronous communications in your applications by employing novel GPUDirect features.
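Stream-synchronous communication means enqueueing communication so that it executes in CUDA stream order, just like kernels and copies. Below is a minimal sketch using only standard CUDA stream/event primitives; the GPUDirect stream-synchronous APIs themselves are not shown, and `produce`, `d_buf`, and `h_buf` are hypothetical names used for illustration.

```
#include <cuda_runtime.h>

// Sketch: standard CUDA stream/event ordering. GPUDirect's stream-synchronous
// features extend this model so that network operations, too, can be enqueued
// on a stream instead of being issued by the CPU after a blocking wait.
void stream_ordered_pipeline(float* d_buf, float* h_buf, size_t n) {
    cudaStream_t compute, copy;
    cudaEvent_t ready;
    cudaStreamCreate(&compute);
    cudaStreamCreate(&copy);
    cudaEventCreate(&ready);

    produce<<<256, 256, 0, compute>>>(d_buf, n);   // hypothetical kernel
    cudaEventRecord(ready, compute);               // completion marker
    cudaStreamWaitEvent(copy, ready, 0);           // copy waits in stream order
    cudaMemcpyAsync(h_buf, d_buf, n * sizeof(float),
                    cudaMemcpyDeviceToHost, copy);
    // A stream-synchronous send would be enqueued here on `copy`; without
    // such a feature, the CPU must block before calling into the network stack.
    cudaStreamSynchronize(copy);

    cudaEventDestroy(ready);
    cudaStreamDestroy(copy);
    cudaStreamDestroy(compute);
}
```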
 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7128
 
Abstract:
GPUDirect is a family of technologies aimed at interoperating with third-party devices. We'll describe the new capabilities introduced in CUDA 7.5 and later, walking the audience through the technical skills each one involves. We'll also present the latest benchmarking results on recent hardware platforms.
 
Topics:
HPC and Supercomputing, Tools & Libraries
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6264
 
Abstract:
In the GPU off-loading programming model, the CPU is the initiator: it prepares and orchestrates work for the GPU. In GPU-accelerated multi-node programs, the CPU has to do the same for the network interface as well. But both the GPU and the network have sophisticated hardware resources, and these can be effectively short-circuited so as to remove the CPU from the path altogether. Meet PeerSync, a set of CUDA-InfiniBand Verbs interoperability APIs that opens up a wide range of possibilities. It also provides a scheme that goes beyond the GPU-network duo, effectively applying the same ideas to other third-party devices.
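To make the CPU bypass concrete, here is a hedged sketch of the conventional CPU-orchestrated path that PeerSync is designed to short-circuit, written with standard CUDA and a CUDA-aware MPI; the PeerSync API itself is not shown, and `compute` and `d_buf` are hypothetical names.

```
#include <mpi.h>
#include <cuda_runtime.h>

// Baseline: the CPU sits between the GPU and the network on every step.
void cpu_orchestrated_send(float* d_buf, size_t n, int peer) {
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    compute<<<256, 256, 0, stream>>>(d_buf, n);  // hypothetical kernel
    cudaStreamSynchronize(stream);               // CPU blocks on the GPU...
    MPI_Send(d_buf, (int)n, MPI_FLOAT, peer, 0,  // ...then drives the NIC
             MPI_COMM_WORLD);                    // (assumes CUDA-aware MPI)

    cudaStreamDestroy(stream);
}
// With an approach like PeerSync, the send is instead triggered directly in
// CUDA stream order, removing the CPU from this critical path.
```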
 
Topics:
HPC and Supercomputing, Data Center & Cloud Infrastructure
Type:
Talk
Event:
GTC Silicon Valley
Year:
2015
Session ID:
S5412
 
Abstract:
APEnet+ is a novel cluster interconnect based on a custom PCI card that features a PCI Express Gen2 x8 link and a reconfigurable hardware component (an FPGA). It supports a 3D torus topology and has special acceleration features developed specifically for NVIDIA Fermi GPUs. An introduction to the basic features and the programming model of APEnet+ will be followed by a description of its performance on some numerical simulations, e.g., high-energy physics simulations.
 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2282
 
Speakers:
Davide Rossetti
- INFN - the National Institute of Nuclear Physics - Rome
 
Topics:
HPC and AI
Type:
Talk
Event:
Supercomputing
Year:
2011
Session ID:
SC129
 
Speakers:
Davide Rossetti
- National Institute of Nuclear Physics
Abstract:
We describe APEnet+, the new generation of our 3D torus network, which scales up to tens of thousands of cluster nodes with linear cost. The basic component is a custom PCIe adapter with six high-speed links, designed around a programmable hardware component (an FPGA) — a convenient environment for studying integration techniques between GPUs and network interfaces. The high-level programming model is MPI, while a low-level RDMA API is also available.
 
Topics:
HPC and AI
Type:
Poster
Event:
GTC Silicon Valley
Year:
2010
Session ID:
P10I09
 
 