GTC On-Demand

Abstract:
Many computer vision applications powered by deep learning include multi-stage pre-processing data pipelines with compute-intensive steps such as decoding, cropping, and format conversion that are carried out on CPUs. We'll discuss NVIDIA DALI, an open-source, GPU-accelerated data augmentation and image-loading library for optimizing the data pipelines of deep learning frameworks. DALI provides a full pre- and post-processing data pipeline ready for training and inference. We'll demonstrate a TensorRT inference workflow within DALI-configurable graphs, as well as custom operators.
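The core idea the abstract describes is expressing pre-processing as a graph of composable stages (decode, crop, normalize) rather than ad-hoc CPU code. As a library-free sketch of that staged-pipeline idea only (the stage names and `make_pipeline` helper below are hypothetical illustrations, not DALI's actual API):

```python
from functools import reduce

def decode(raw):
    # Stand-in for image decoding: turn raw bytes into float pixel values.
    return [float(b) for b in raw]

def center_crop(pixels, size=4):
    # Keep the central `size` pixels, as a 1-D stand-in for spatial cropping.
    start = (len(pixels) - size) // 2
    return pixels[start:start + size]

def normalize(pixels, mean=127.5, std=127.5):
    # Shift and scale pixel values, as done before feeding a network.
    return [(p - mean) / std for p in pixels]

def make_pipeline(*stages):
    # Chain the stages into one callable, mimicking a pipeline graph
    # in which each operator's output feeds the next operator.
    return lambda x: reduce(lambda acc, stage: stage(acc), stages, x)

pipeline = make_pipeline(decode, center_crop, normalize)
sample = pipeline(bytes(range(8)))  # 8 raw bytes in, 4 normalized values out
```

In DALI itself, each stage is an operator that can be placed on the CPU or GPU, and the framework schedules the whole graph; this sketch only shows the composition pattern.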
 
Topics:
AI Application Deployment and Inference, Deep Learning and AI Frameworks
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9818
 
Abstract:
Autonomous driving systems use various neural network models that require extremely accurate and efficient computation on GPUs. This session will outline how Zoox employs two strategies to improve the inference performance (i.e., latency) of trained neural network models without loss of accuracy: (1) inference with NVIDIA TensorRT, and (2) inference with lower precision (i.e., FP16 and INT8). We will share lessons learned about neural network deployment with TensorRT and our current conversion workflow for tackling its limitations.
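The INT8 path mentioned above rests on quantization: mapping floating-point activations onto 8-bit integers via a calibrated scale. As a rough, plain-Python illustration of the symmetric per-tensor scheme (this is the general idea only, not TensorRT's calibrator, and the function names are hypothetical):

```python
def quantize_int8(values):
    # Symmetric per-tensor quantization: pick a scale so the largest
    # magnitude maps to 127, then round each value to an int in [-127, 127].
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid zero scale
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values; the rounding step above is where
    # reduced-precision inference loses (a bounded amount of) accuracy.
    return [qi * scale for qi in q]

acts = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(acts)
approx = dequantize(q, scale)
```

In practice the scale is chosen from calibration data rather than a single tensor's maximum, which is one reason a conversion workflow is needed to keep accuracy intact.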
 
Topics:
Autonomous Vehicles
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9895