GTC ON-DEMAND

 

Accelerated Data Science
Petabyte Data Pipelines: Massively Distributed SQL Data Warehouse on GPUs
Abstract:
With exponential data growth and the end of Moore's Law, scaling data warehouses is a major challenge. Storing a petabyte in a data warehouse is very costly and often performs poorly. BlazingDB uses GPUs to speed up processing while relying on data lake technologies to store massive data sets. We'll demonstrate how BlazingDB leverages GPUs for writing and reading data, where compression and data skipping are key, and then for SQL analytics, where sorting, aggregations, and joins see large performance gains. The demo will run on a Microsoft Azure N Series GPU cluster for processing, with Azure File Store for cold storage, showing a fully functional BlazingDB cloud deployment processing a massive data set.
 
Topics:
Accelerated Data Science, AI Startup, Data Center & Cloud Infrastructure
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7375