GTC ON-DEMAND

Abstract:
Early on, memory bandwidths more than an order of magnitude higher than those of conventional processors made GPUs an attractive platform for data-intensive applications. While there are many success stories about GPU-accelerated databases built from scratch, GPU-accelerated operations for large-scale, general-purpose databases are the exception rather than the norm. We characterize fundamental database operators like scan, filter, join, and group-by based on their memory access patterns. From these characteristics, we derive their potential for GPU acceleration, such as upper bounds for performance on current and future architectures. Starting from basic GPU implementations, we deep dive into aspects such as optimizing data transfers, access, and layout.
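The memory-access argument can be made concrete with a small, hypothetical CUDA sketch (not taken from the talk): a filter over a single integer column reads each record once and writes one flag, so with coalesced accesses its throughput is bounded by memory bandwidth rather than compute.

```cuda
#include <cstddef>
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical range filter over one integer column. Each thread handles one
// record; adjacent threads touch adjacent elements, so loads and stores are
// coalesced and the kernel is limited by memory bandwidth -- the property the
// abstract uses to derive performance upper bounds for scan/filter operators.
__global__ void filterRange(const int32_t* __restrict__ keys,
                            uint8_t* __restrict__ matches,
                            int32_t lo, int32_t hi, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n)
        matches[i] = (keys[i] >= lo && keys[i] <= hi) ? 1 : 0;
}

// Launch sketch, one thread per record:
//   filterRange<<<(n + 255) / 256, 256>>>(d_keys, d_matches, lo, hi, n);
```

A compaction step (for example, a prefix sum over the flags) would then produce the qualifying row IDs; join and group-by have far less regular access patterns, which is why the abstract analyzes each operator separately.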
 
Topics:
Accelerated Data Science, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8289
 
Abstract:
Cognitive applications are reshaping the IT landscape with entire data centers designed and built solely for that purpose. Though computationally challenging, deep learning networks have become a critical building block to boost the accuracy of cognitive offerings like Watson. We'll present a detailed performance study of deep learning workloads and how sharing accelerator resources can improve throughput by a factor of three, effectively turning a four-GPU commodity cloud system into a high-end 12-GPU supercomputer. Using Watson workloads from three major areas that incorporate deep learning technology (language classification, visual recognition, and speech recognition), we document the effectiveness and scalability of this approach.
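The abstract does not describe the sharing mechanism itself, so the following is only a minimal CUDA sketch of the general idea, with made-up names and sizes: independent workloads multiplexed onto one device through separate streams, so their kernels and transfers can overlap instead of serializing.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Stand-in for the real deep learning kernels of one tenant workload.
__global__ void dummyWork(float* data, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 0.5f + 1.0f;
}

int main()
{
    const int    tenants = 3;               // hypothetical number of co-located workloads
    const size_t n       = size_t(1) << 24;
    cudaStream_t stream[tenants];
    float*       buf[tenants];

    for (int t = 0; t < tenants; ++t) {
        cudaStreamCreate(&stream[t]);
        cudaMalloc((void**)&buf[t], n * sizeof(float));
    }
    // Each "tenant" enqueues its own work into its own stream; the GPU can
    // execute work from different streams concurrently when resources allow,
    // which is the basic effect accelerator sharing relies on.
    for (int t = 0; t < tenants; ++t)
        dummyWork<<<(unsigned)((n + 255) / 256), 256, 0, stream[t]>>>(buf[t], n);

    cudaDeviceSynchronize();
    for (int t = 0; t < tenants; ++t) {
        cudaFree(buf[t]);
        cudaStreamDestroy(stream[t]);
    }
    return 0;
}
```

In a multi-process setting the same effect is usually obtained with a GPU-sharing layer such as NVIDIA MPS, but the abstract does not say which mechanism the study used.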
 
Topics:
Artificial Intelligence and Deep Learning, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7320
 
Abstract:

Based on a comprehensive performance study of Watson workloads, we'll deep dive into optimizing critical retrieve and rank functions using GPU acceleration. The performance of cognitive applications like answering natural language questions heavily depends on quickly selecting the relevant documents needed to generate a correct answer. While analyzing the question to determine appropriate search terms, weights, and relationships is relatively quick, retrieving and ranking a relevant subset from millions of documents is a time-consuming task. Only after completing it can any advanced natural language processing algorithms be effective.
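As an illustration only (the abstract does not give the actual scoring model), a data-parallel document-scoring pass might look like the hypothetical CUDA sketch below: one thread per document computes a weighted term score from a column-major term-frequency matrix, and a top-k selection over the scores would follow.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical retrieve-and-rank scoring pass. Assumes a column-major
// term-frequency matrix (numTerms x numDocs) so that adjacent threads read
// adjacent documents and accesses stay coalesced. Each thread scores one
// document; ranking (e.g., a top-k selection over `score`) is a separate step.
__global__ void scoreDocuments(const float* __restrict__ termFreq,   // [numTerms][numDocs]
                               const float* __restrict__ termWeight, // [numTerms]
                               float* __restrict__ score,            // [numDocs]
                               int numDocs, int numTerms)
{
    int d = blockIdx.x * blockDim.x + threadIdx.x;
    if (d >= numDocs) return;

    float s = 0.0f;
    for (int t = 0; t < numTerms; ++t)
        s += termFreq[(size_t)t * numDocs + d] * termWeight[t];
    score[d] = s;
}

// Launch sketch, one thread per document:
//   scoreDocuments<<<(numDocs + 255) / 256, 256>>>(d_tf, d_w, d_score,
//                                                  numDocs, numTerms);
```

A subsequent top-k selection over the scores (for example, with Thrust) yields the candidate documents that the downstream natural language processing stages then refine.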
 
Topics:
Accelerated Data Science, Federal
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7321
 
Abstract:

Starting with a conventional CPU implementation, we identify the most time-consuming operations when processing SQL queries and show how they can be efficiently offloaded to the GPU. Using queries from a variant of the TPC-H benchmark, we offer a deep dive on how to optimally map complex database operations like join to the GPU hardware, such that they achieve up to 90% hardware efficiency and a throughput of >100M records per second. Given data sets that are orders of magnitude larger than GPU memory, the focus of this talk will be on efficient data layout and movement.
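Because the data sets are much larger than GPU memory, the data-movement point can be illustrated with a hedged CUDA sketch (chunk size, kernel, and buffer scheme are assumptions, not details from the talk): the column is streamed through the device in fixed-size chunks, double-buffered over two streams so transfers overlap with the kernel working on the other chunk.

```cuda
#include <algorithm>
#include <cstdint>
#include <cuda_runtime.h>

// Stand-in for a real database operator (filter, hash-join probe, ...).
__global__ void processChunk(const int32_t* in, int32_t* out, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2;
}

// Streams a column larger than GPU memory through the device in chunks,
// double-buffered over two streams so the host-to-device copy of one chunk
// overlaps the kernel on the other. h_in/h_out should be pinned memory
// (cudaMallocHost / cudaHostRegister) for the async copies to actually overlap.
void streamColumn(const int32_t* h_in, int32_t* h_out, size_t total)
{
    const size_t CHUNK = size_t(1) << 24;      // 16M records per chunk (assumed)
    cudaStream_t s[2];
    int32_t *d_in[2], *d_out[2];

    for (int b = 0; b < 2; ++b) {
        cudaStreamCreate(&s[b]);
        cudaMalloc((void**)&d_in[b],  CHUNK * sizeof(int32_t));
        cudaMalloc((void**)&d_out[b], CHUNK * sizeof(int32_t));
    }
    for (size_t off = 0, c = 0; off < total; off += CHUNK, ++c) {
        int    b = int(c & 1);                 // alternate buffer/stream pairs
        size_t n = std::min(CHUNK, total - off);
        cudaMemcpyAsync(d_in[b], h_in + off, n * sizeof(int32_t),
                        cudaMemcpyHostToDevice, s[b]);
        processChunk<<<(unsigned)((n + 255) / 256), 256, 0, s[b]>>>(d_in[b], d_out[b], n);
        cudaMemcpyAsync(h_out + off, d_out[b], n * sizeof(int32_t),
                        cudaMemcpyDeviceToHost, s[b]);
    }
    cudaDeviceSynchronize();
    for (int b = 0; b < 2; ++b) {
        cudaStreamDestroy(s[b]);
        cudaFree(d_in[b]);
        cudaFree(d_out[b]);
    }
}
```

With this scheme, end-to-end throughput is ultimately bounded by the transfer link rather than the kernels, which is presumably why the talk emphasizes data layout and movement.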
 
Topics:
Databases, Data Mining, Business Intelligence
Type:
Talk
Event:
GTC Silicon Valley
Year:
2013
Session ID:
S3190
 
 