GTC ON-DEMAND

 
Abstract:
We'll present BlazingSQL, RAPIDS' open-source SQL engine. BlazingSQL eliminates the need to build and deploy a database, enabling users to fully integrate high-performance SQL into their RAPIDS workflows. It's built entirely on the GPU Apache Arrow standard that underpins the RAPIDS ecosystem and the primitives underneath the cuDF and cuIO libraries. BlazingSQL supports a myriad of data sources: users can query Apache Parquet and JSON files in a data lake alongside in-memory data sources like Apache Arrow or Pandas in a single, intuitive SQL query that feeds machine learning, deep learning, or graph workloads. We'll launch and run a series of BlazingSQL workloads distributed on a multi-GPU cluster.
 
Topics:
Accelerated Data Science
Type:
Talk
Event:
GTC Washington D.C.
Year:
2019
Session ID:
DC91406
 
Abstract:

Learn about BlazingSQL, our new, free GPU SQL engine built on RAPIDS open-source software. We will show multiple demo workflows using BlazingSQL to connect data lakes to RAPIDS tools. We'll explain how we dramatically accelerated our engine and made it substantially more lightweight by integrating Apache Arrow into GPU memory and cuDF into RAPIDS. That made it easy to install and deploy BlazingSQL + RAPIDS in a matter of minutes. More importantly, we built a robust framework to help users bring data from data lakes into GPU-accelerated workloads without having to ETL on CPU memory or separate GPU clusters. We'll discuss how that makes it possible to keep everything on the GPU while BlazingSQL manages the SQL ETL. RAPIDS can then take these results to continue machine learning, deep learning, and visualization workloads.
 
Topics:
Accelerated Data Science
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9798
 
Abstract:

BlazingDB, the distributed SQL engine on GPUs, will show how we contribute to the Apache GPU Data Frame (GDF) project and how we have begun to leverage it inside BlazingDB. Through the integration of the GDF, we have been able to dramatically accelerate our data engine, achieving over 10x performance improvements. More importantly, we have built a robust framework to help users bring data from their data lake into GPU-accelerated workloads without having to ETL on CPU memory or separate CPU clusters. Everything stays on the GPU: BlazingDB handles the SQL ETL, and then pyGDF and Dask-GDF can take these results to continue machine learning workloads. With the GDF, customer workloads can keep the data on the GPU, reduce network and PCIe I/O, dramatically improve ETL-heavy GPU workloads, and enable data scientists to run end-to-end data pipelines from the comfort of one GPU server/cluster.
 
Topics:
Accelerated Data Science
Type:
Talk
Event:
GTC Washington D.C.
Year:
2018
Session ID:
DC8226
 
Abstract:

Learn strategies for efficiently employing various cascaded compression algorithms on the GPU. Many database input fields are amenable to compression since they have repeating or gradually increasing patterns, such as dates and quantities. Fast implementations of decompression algorithms such as RLE-Delta will be presented. By utilizing compression, we can achieve 10 times greater effective read bandwidth than the interconnect allows for raw data transfers. However, I/O bottlenecks still play a big role in the overall performance, and data has to be moved efficiently in and out of the GPU to ensure an optimal decompression rate. After a deep dive into the implementation, we'll show a real-world example of how BlazingDB leverages these compression strategies to accelerate database operations.
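The cascaded RLE-Delta scheme the abstract refers to can be sketched in plain Python. This is a CPU illustration of the encoding idea only (delta-encode, then run-length-encode the deltas), not BlazingDB's GPU implementation; the function names are hypothetical:

```python
def rle_delta_encode(values):
    """Cascaded encoding: delta step, then run-length encode the deltas."""
    prev = 0  # deltas are taken against the previous value (first value vs. 0)
    runs = []  # list of (delta, run_length) pairs
    for v in values:
        d = v - prev
        prev = v
        if runs and runs[-1][0] == d:
            runs[-1] = (d, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((d, 1))
    return runs

def rle_delta_decode(runs):
    """Inverse: expand each run, then prefix-sum the deltas."""
    out, acc = [], 0
    for d, n in runs:
        for _ in range(n):
            acc += d
            out.append(acc)
    return out

# A gradually increasing field (e.g. dates) compresses extremely well:
dates = [20180101 + i for i in range(1000)]
runs = rle_delta_encode(dates)
assert runs == [(20180101, 1), (1, 999)]  # two pairs instead of 1000 values
assert rle_delta_decode(runs) == dates
```

The decode side is the performance-critical half on the GPU: expanding runs and prefix-summing deltas are both data-parallel operations, which is why such cascaded schemes can exceed the interconnect's raw bandwidth in effective read throughput.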
 
Topics:
Accelerated Data Science, AI Startup, Algorithms & Numerical Techniques
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8417
 
Abstract:
With exponential data growth and the end of Moore's law, enabling data warehouses to scale is a huge challenge. Storing a petabyte in a data warehouse is incredibly costly, and often non-performant. BlazingDB opens up a whole new level of speed with GPU power, while using data lake technologies to store massive data sets. We'll demonstrate how BlazingDB leverages GPUs for writing and reading, where compression and data skipping are key, and then for SQL analytics, where sorting, aggregations, and joins see huge performance bumps. This demo will be performed on a Microsoft Azure N Series GPU cluster for processing and Azure File Store for cold storage, showing a fully functional BlazingDB cloud deployment processing a massive data set.
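The data-skipping idea mentioned above can be sketched with a min/max zone map: at write time, store a (min, max) summary per chunk, and at query time, read only the chunks whose range can overlap the predicate. This is a minimal plain-Python illustration with hypothetical names, not BlazingDB's actual metadata format:

```python
def build_zone_map(column, chunk_size):
    """One (min, max) summary per chunk, computed once at write time."""
    return [(min(column[i:i + chunk_size]), max(column[i:i + chunk_size]))
            for i in range(0, len(column), chunk_size)]

def chunks_to_read(zone_map, lo, hi):
    """Indices of chunks whose [min, max] range can overlap [lo, hi]."""
    return [i for i, (mn, mx) in enumerate(zone_map) if mx >= lo and mn <= hi]

# A sorted date column (100 values) split into 4 chunks of 25:
dates = list(range(20170101, 20170201))
zmap = build_zone_map(dates, 25)

# A narrow range predicate touches one chunk; the other three are skipped
# entirely, so their bytes are never read from cold storage.
print(chunks_to_read(zmap, 20170130, 20170135))  # → [1]
```

On naturally ordered or clustered columns, this kind of skipping cuts I/O before compression even comes into play, which is why the two techniques pair well for reading from slow cold storage such as a cloud file store.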
 
Topics:
Accelerated Data Science, AI Startup, Data Center & Cloud Infrastructure
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7375
 
 