GTC ON-DEMAND

 

Accelerated Data Science
RAPIDS: BlazingDB on Apache GDF - Accelerated ETL for AI Workloads
Abstract:

BlazingDB, the distributed SQL engine on GPUs, will show how we contribute to the Apache GPU Data Frame (GDF) project and how we have begun to leverage it inside BlazingDB. By integrating the GDF, we have dramatically accelerated our data engine, achieving performance improvements of over 10x. More importantly, we have built a robust framework that helps users bring data from their data lake into GPU-accelerated workloads without having to run ETL in CPU memory or on separate CPU clusters. Everything stays on the GPU: BlazingDB handles the SQL ETL, and pyGDF and DaskGDF can then take the results to continue machine learning workloads. With the GDF, customer workloads can keep data in the GPU, reduce network and PCIe I/O, dramatically improve ETL-heavy GPU workloads, and enable data scientists to run end-to-end data pipelines from the comfort of one GPU server or cluster.

 
Topics: Accelerated Data Science
Type: Talk
Event: GTC Washington D.C.
Year: 2018
Session ID: DC8226