GTC ON-DEMAND

AI Application, Deployment & Inference
Low-Latency GPU Accelerated Inferencing with TensorRT
Abstract:
Learn how to optimize the deployment of your trained neural networks with TensorRT, NVIDIA's GPU-accelerated inference library. TensorRT is a high-performance tool for low-latency, high-throughput deep neural network (DNN) inference on NVIDIA GPUs. The latest release of TensorRT introduces a framework-agnostic network definition format, the Universal Framework Format (UFF), which allows TensorRT to support and optimize DNN models trained in multiple deep learning frameworks such as Caffe and TensorFlow. It also supports inference at reduced precision, letting developers take advantage of new GPU hardware features such as the Tensor Cores in the Volta architecture. This session combines lecture and live demos.
 
Topics:
AI Application, Deployment & Inference, Tools & Libraries, Performance Optimization, Data Center & Cloud Infrastructure
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8496