GTC ON-DEMAND

 

Abstract:
We'll discuss the RAPIDS ecosystem, which accelerates the data science workflow by keeping data and computations on GPUs. We're able to go from ingestion to insights more quickly, with larger workloads. Within RAPIDS, cuML provides a sklearn-like application programming interface (API) and cuGraph a NetworkX-like API of GPU-accelerated algorithms. While speedups of over 100x are possible on a single GPU, the scale is bounded by the device's available memory. By scaling to multiple GPUs spread across multiple nodes, cuML and cuGraph can increase speedup even further while providing avenues to scale up and out. We'll focus on how we enabled the training and inference of machine learning and graph models on multiple nodes within cuML and cuGraph, and provide an architectural overview of our communications API, which enables GPU-to-GPU direct memory transfers. We'll conclude with examples and benchmarks.
 
Topics:
Accelerated Data Science, HPC and AI
Type:
Talk
Event:
GTC Washington D.C.
Year:
2019
Session ID:
DC91231
 
Abstract:

Graphs are a ubiquitous part of the technology we use daily: in systems like GPS, graphs help find the shortest path between two points, and social networks use them to help users find friends. We'll explain why analyzing these vast networks, with possibly billions of entries, requires the computing power of GPUs. We'll then discuss the performance of graph algorithms on the GPU and show benchmarking results from several graph frameworks. We'll also cover the RAPIDS roadmap that will help unify these frameworks and make them easy to use and simple to deploy.

 
Topics:
Accelerated Data Science, Algorithms & Numerical Techniques
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9783
 
Abstract:

This talk will present the results of running the following Graph500 and DARPA Graph Challenge benchmarks and highlight the improvements over other platforms:
  • BFS (Graph500)
  • Single Source Shortest Paths (Graph500)
  • PageRank Pipeline (Graph Challenge)
  • Triangle Counting (Graph Challenge)
  • K-Truss (Graph Challenge)
The tremendous performance advantages of the DGX-2 platform for deep learning have recently gained a lot of publicity. However, deep learning is not the only analytic environment that can take advantage of the DGX-2 architecture. Having sixteen fully connected 32GB Volta GPUs presents an intriguing platform for graph analytics. The 512GB of combined GPU memory and the full NVLink connection between the GPUs offer a number of advantages over a distributed MPI-based approach.

 
Topics:
Accelerated Data Science
Type:
Talk
Event:
GTC Washington D.C.
Year:
2018
Session ID:
DC8110
 
Abstract:

We discuss some common use cases for AmgX, our toolkit for fast linear solvers on the GPU. AmgX includes algebraic multigrid methods, Krylov methods, and nesting preconditioners, and allows complex composition of solvers and preconditioners. We also present some recent performance results on NVIDIA® Tesla® K20 and K40 GPUs for large-scale CFD problems of industrial relevance.

 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
Supercomputing
Year:
2013
Session ID:
SC3137
 
 