GTC ON-DEMAND
Abstract:
We'll introduce and compare the performance of three distance-transform algorithms accelerated on NVIDIA's Xavier SoC: one that illustrates the underlying math and establishes a performance baseline, and two that target different workloads and scenarios with different speedups. We'll use these comparisons to show how to choose the most practical acceleration scheme for embedded applications. We'll also discuss running the distance transform on the PVA computer vision accelerator inside Xavier, whose competitive performance lets users offload the Euclidean distance transform (EDT) from the GPU, leaving more room for deep learning tasks.
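The session itself contains no code, but the baseline "math principle" variant it compares against can be pictured as a brute-force Euclidean distance transform: for every pixel, take the distance to the nearest seed pixel. The sketch below (NumPy on the CPU, with function names of our choosing) illustrates only that baseline idea, not the PVA- or GPU-accelerated schemes discussed in the talk.

# Minimal brute-force EDT sketch: the O(pixels x seeds) baseline that faster,
# workload-specific schemes are measured against. Illustrative only.
import numpy as np

def edt_bruteforce(mask):
    """For each pixel, the Euclidean distance to the nearest True pixel in mask."""
    h, w = mask.shape
    ys, xs = np.nonzero(mask)                                  # seed-pixel coordinates
    seeds = np.stack([ys, xs], axis=1).astype(np.float32)
    gy, gx = np.mgrid[0:h, 0:w]
    grid = np.stack([gy.ravel(), gx.ravel()], axis=1).astype(np.float32)
    d2 = ((grid[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=2)   # all pairwise squared distances
    return np.sqrt(d2.min(axis=1)).reshape(h, w)

if __name__ == "__main__":
    m = np.zeros((8, 8), dtype=bool)
    m[2, 3] = m[6, 6] = True
    print(edt_bruteforce(m).round(2))

The quadratic cost of this formulation is what motivates the faster, workload-specific schemes the talk compares against it.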
 
Topics:
Computer Vision, Autonomous Vehicles, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9165
 
Abstract:
We present Piko, a system abstraction that helps implement high-level algorithmic pipelines on modern parallel architectures. We define 'pipelines' as sequences of dynamically scheduled kernels that combine to implement a complex application. While primarily targeted at efficient graphics applications, the way in which Piko exposes both parallelism and locality applies naturally to other domains as well. The abstraction helps programmers define work granularities as the data evolves across stages of an application. These definitions are kept separate from the underlying algorithms, which helps authors of Piko pipelines explore tradeoffs between locality and parallelism across varying application configurations and target architectures. As a consequence, Piko helps in designing high-performance software pipelines that are both flexible and portable across architectures.
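Piko's actual programming model is not spelled out in the abstract; as a rough, hypothetical illustration of the core idea, each stage below pairs a kernel (the algorithm) with a bin size (the scheduling granularity), so the schedule can be retuned per target without touching the kernel. All names are ours, not Piko's API.

# Hypothetical sketch: stages separate "what to compute" from "how work is grouped".
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass
class Stage:
    kernel: Callable[[List], List]   # the algorithm: operates on one bin of work items
    bin_size: int                    # the schedule: locality-vs-parallelism knob, tuned per architecture

def run_pipeline(stages: Iterable[Stage], items: List) -> List:
    work = items
    for stage in stages:
        out: List = []
        # Each bin is an independent, potentially parallel launch; changing
        # bin_size retargets the schedule without changing stage.kernel.
        for i in range(0, len(work), stage.bin_size):
            out.extend(stage.kernel(work[i:i + stage.bin_size]))
        work = out
    return work

if __name__ == "__main__":
    double = Stage(kernel=lambda b: [2 * x for x in b], bin_size=4)
    keep = Stage(kernel=lambda b: [x for x in b if x % 4 == 0], bin_size=8)
    print(run_pipeline([double, keep], list(range(10))))   # [0, 4, 8, 12, 16]

Tuning bin_size per target is the locality-versus-parallelism tradeoff the abstract refers to: larger bins favor locality within a launch, while smaller bins expose more independent work.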
 
Topics:
Real-Time Graphics, Programming Languages, Performance Optimization, Rendering & Ray Tracing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2014
Session ID:
S4650
 
Abstract:

We explore how a task-parallel model can be implemented on the GPU and address the concerns and programming techniques involved in doing so. We discuss the primitives for building a task-parallel system on the GPU, including novel ideas for mapping tasking systems onto the GPU: task granularity, load balancing, memory management, and dependency resolution. We also present several applications that demonstrate where a task-parallel model is more suitable than the usual data-parallel model, including a Reyes renderer, a tiled deferred lighting renderer, and a video encoding demo.
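The talk's GPU implementation is not reproduced here; the CPU-side Python sketch below only conveys the dynamic load-balancing idea, in which workers repeatedly pull irregular tasks from a shared queue instead of receiving a fixed partition up front. On a GPU the same idea is typically realized with persistent thread blocks and atomically managed queues.

# Workers pull variably sized tasks until the shared queue is empty.
import queue
import threading

def worker(tasks, results, lock):
    while True:
        try:
            task = tasks.get_nowait()      # dynamic scheduling: grab the next ready task
        except queue.Empty:
            return
        value = sum(range(task))           # stand-in for an irregular amount of work
        with lock:
            results.append(value)

if __name__ == "__main__":
    tasks = queue.Queue()
    for n in [10, 100_000, 50, 2_000_000, 5]:   # irregular task sizes
        tasks.put(n)
    results, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(tasks, results, lock)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(results), "tasks completed")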
 
Topics:
Application Design & Porting Techniques
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2138
 
Abstract:

We present a discussion of the ideas and techniques behind programmable graphics pipelines on modern GPUs, using the design of a real-time Reyes renderer as a specific example. Walking through this example, we address the philosophy behind programmable GPU graphics, the broad strategy for this particular pipeline, and algorithmic and implementation-level details for key rendering stages. We cover several issues concerning GPU efficiency, including work scheduling, parallelization of traditional stages, and balancing of rendering workloads. The audience will gain in-depth exposure to the state of research in programmable graphics, and insight into efficient pipeline design for irregular workloads.
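As a toy illustration of one key stage the talk walks through, the bound-and-split loop below recursively subdivides primitives until their screen-space bound is small enough to dice. It is a CPU-side sketch with illustrative names and thresholds, not the renderer's implementation; note how the split depth, and hence the workload, varies per primitive, which is exactly the irregularity the pipeline has to schedule around.

# Toy Reyes-style bound-and-split: subdivide until each patch is dicable.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Patch:
    bound: Tuple[float, float, float, float]   # screen-space bound (x0, y0, x1, y1) in pixels

def split(p: Patch) -> List[Patch]:
    x0, y0, x1, y1 = p.bound
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    return [Patch((x0, y0, mx, my)), Patch((mx, y0, x1, my)),
            Patch((x0, my, mx, y1)), Patch((mx, my, x1, y1))]

def bound_and_split(patches: List[Patch], max_extent: float = 8.0) -> List[Patch]:
    ready, work = [], list(patches)
    while work:                                 # split depth varies per patch: irregular workload
        p = work.pop()
        x0, y0, x1, y1 = p.bound
        if max(x1 - x0, y1 - y0) <= max_extent:
            ready.append(p)                     # small enough: hand off to dicing and shading
        else:
            work.extend(split(p))
    return ready

if __name__ == "__main__":
    print(len(bound_and_split([Patch((0, 0, 64, 64))])))   # 64 dicable patches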
 
Topics:
Graphics and AI
Type:
Talk
Event:
GTC Silicon Valley
Year:
2010
Session ID:
S102162
 
Speakers:
Stanley Tzeng, University of California, Davis
Abstract:
We explore software mechanisms for managing irregular tasks on graphics processing units. Traditional GPU programming guidelines teach us how to efficiently program the GPU for data-parallel pipelines with regular input and output. We present a strategy for handling task-parallel pipelines with irregular workloads on the GPU, and demonstrate that dynamic scheduling and efficient memory management are critical to achieving high efficiency on such workloads. We showcase our results on a real-time Reyes rendering pipeline.
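One widely used technique for the memory-management problem the poster highlights is to size each task's output, take an exclusive prefix sum to obtain write offsets, and then scatter into a packed buffer. The NumPy sketch below shows that pattern on the CPU; it is a generic illustration of the technique, not the poster's implementation.

# Exclusive scan turns per-task output counts into non-overlapping write offsets.
import numpy as np

def allocate_and_scatter(counts):
    offsets = np.concatenate(([0], np.cumsum(counts)[:-1]))   # exclusive prefix sum
    out = np.empty(int(counts.sum()), dtype=np.int64)
    for task_id, (off, n) in enumerate(zip(offsets, counts)):
        out[off:off + n] = task_id             # each task writes its n results at its own offset
    return out

if __name__ == "__main__":
    counts = np.array([3, 0, 5, 1])            # irregular per-task output sizes
    print(allocate_and_scatter(counts))        # [0 0 0 2 2 2 2 2 3]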
 
Topics:
Developer - Algorithms
Type:
Poster
Event:
GTC Silicon Valley
Year:
2010
Session ID:
P10A06
 
 