GTC ON-DEMAND

 
Abstract:
We'll discuss the GPU Open Analytics Initiative, an effort to develop a GPU data frame that can handle a large-scale data-analytics workflow and support out-of-core cases in which the data is larger than GPU memory. We'll describe how we divided the problem into two parts, developing an elementary single-GPU data frame to handle in-memory use cases, and then combining multiple single-GPU data frames into a distributed multi-GPU data frame for out-of-core use cases. We'll briefly introduce our distributed GPU data frame and its capabilities. We'll then explain how we scaled out by using Dask, a distributed computation framework in Python, to orchestrate the single-GPU data frames and achieve out-of-core capability with minimal effort. Our idea can be generalized to build custom distributed GPU computation by composing single-GPU libraries.
 
Topics:
Accelerated Data Science, Tools & Libraries
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9449
 
Abstract:
We'll demonstrate how Python and the Numba JIT compiler can be used for GPU programming that easily scales from your workstation to an Apache Spark cluster. Using an example application, we show how to write CUDA kernels in Python, compile and call them using the open source Numba JIT compiler, and execute them both locally and remotely with Spark. We also describe techniques for managing Python dependencies in a Spark cluster with the tools in the Anaconda Platform. Finally, we conclude with some tips and tricks for getting best performance when doing GPU computing with Spark and Python.
 
Topics:
Programming Languages, Tools & Libraries, Big Data Analytics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6413
 
 