GTC ON-DEMAND

Deploying Deep Neural Networks as a Service Using TensorRT and NVIDIA-Docker
Abstract:
Learn how to use TensorRT and NVIDIA Docker to quickly configure and deploy a GPU-accelerated inference server and start gaining insights from your trained deep neural network (DNN) models. TensorRT is a high-performance tool for low-latency, high-throughput DNN inference. The latest release of TensorRT introduces a novel, framework-agnostic network definition format called Universal Framework Format (UFF), which allows TensorRT to support and optimize DNN models trained in multiple deep learning frameworks. We'll use the TensorRT Python API to create a lightweight Flask application capable of serving multiple DNN models trained with TensorFlow, PyTorch, and Caffe, and discuss how to containerize this inference service with NVIDIA Docker for ease of deployment at scale. This session consists of a lecture, live demos, and detailed instructions.
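
A minimal sketch of the pattern the abstract describes: a Flask route backed by a deserialized TensorRT engine. The engine file name, tensor shapes, and route are hypothetical, and the calls shown follow the later-style TensorRT Python API (execute_v2), which differs from the 2018-era API the session would have used:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates the CUDA context
import pycuda.driver as cuda
import tensorrt as trt
from flask import Flask, jsonify, request

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize a pre-built engine (e.g. converted from a UFF or ONNX model).
with open("model.engine", "rb") as f:
    engine = trt.Runtime(TRT_LOGGER).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Pre-allocate device buffers; the sizes here are placeholder assumptions.
INPUT_SIZE, OUTPUT_SIZE = 3 * 224 * 224, 1000
d_input = cuda.mem_alloc(INPUT_SIZE * np.float32().nbytes)
d_output = cuda.mem_alloc(OUTPUT_SIZE * np.float32().nbytes)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON of the form {"input": [...flattened float32 tensor...]}.
    x = np.asarray(request.get_json()["input"], dtype=np.float32)
    y = np.empty(OUTPUT_SIZE, dtype=np.float32)
    cuda.memcpy_htod(d_input, x)                       # host -> device
    context.execute_v2([int(d_input), int(d_output)])  # run inference
    cuda.memcpy_dtoh(y, d_output)                      # device -> host
    return jsonify({"output": y.tolist()})

if __name__ == "__main__":
    # Single-threaded so all CUDA work stays on the context's thread.
    app.run(host="0.0.0.0", port=5000, threaded=False)
```

Serving multiple models, as the session proposes, extends this pattern to one engine/context pair per model, dispatched by a model name in the request.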
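
For the containerization step, a hypothetical Dockerfile built on an NGC TensorRT base image; the image tag, file names, and port are assumptions, not taken from the session:

```dockerfile
# Hypothetical base image tag from NGC (nvcr.io).
FROM nvcr.io/nvidia/tensorrt:19.10-py3
RUN pip install flask pycuda numpy
WORKDIR /workspace
COPY app.py model.engine ./
EXPOSE 5000
CMD ["python", "app.py"]
```

Built and run with the NVIDIA container runtime (newer Docker versions use `docker run --gpus all` instead):

```
docker build -t trt-inference-server .
docker run --runtime=nvidia -p 5000:5000 trt-inference-server
```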
 
Topics:
AI Application, Deployment & Inference, Tools & Libraries, Data Center & Cloud Infrastructure
Type:
Tutorial
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8495