GTC ON-DEMAND

The Journey from a Small Development Lab Environment to a Production GPU Inference Datacenter
Abstract:

We'll take a deep dive into best practices and real-world examples of leveraging the power and flexibility of local GPU workstations, such as the DGX Station, to rapidly develop and prototype deep learning applications. This journey will take you from experimenting and iterating fast and often, to obtaining a trained model, to eventually deploying scale-out GPU inference servers in a datacenter. The tools we'll explain are NGC (NVIDIA GPU Cloud), TensorRT, TensorRT Inference Server, and YAIS: NGC is a cloud registry of Docker images; TensorRT is an inference-optimizing compiler; TensorRT Inference Server is a containerized microservice that maximizes GPU utilization and concurrently runs multiple models from different frameworks on a single node; and YAIS is a C++ library for developing compute-intensive, asynchronous microservices with gRPC. The session will consist of a lecture, live demos, and detailed instructions covering inference compute options, pre- and post-processing considerations, inference serving options, monitoring, and scalable workload orchestration with Kubernetes.
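
To make the TensorRT step concrete, here is a minimal sketch of compiling a trained model into a serialized inference engine with the TensorRT Python API. This is not material from the session itself: the file names model.onnx and model.plan are placeholders, and the builder calls follow the TensorRT 5-era API that was current around this 2018 talk (newer releases use explicit-batch networks and a separate builder config), so consult the TensorRT documentation for your version.

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Parse an ONNX model and compile it into an optimized TensorRT engine.
with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network() as network, \
     trt.OnnxParser(network, TRT_LOGGER) as parser:
    builder.max_batch_size = 8            # largest batch size the engine will accept
    builder.max_workspace_size = 1 << 30  # 1 GiB of scratch space for kernel selection
    with open("model.onnx", "rb") as f:   # placeholder path to a trained, exported model
        if not parser.parse(f.read()):
            raise RuntimeError("failed to parse the ONNX model")
    engine = builder.build_cuda_engine(network)

# Serialize the engine so an inference server can load it later.
with open("model.plan", "wb") as f:
    f.write(engine.serialize())

A serialized plan file like this is the kind of artifact a TensorRT Inference Server model repository serves: the server loads the engine at startup and schedules concurrent requests against it on the node's GPUs.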

 
Topics:
Artificial Intelligence and Deep Learning, Data Center & Cloud Infrastructure, Developer Tools
Type:
Talk
Event:
GTC Washington D.C.
Year:
2018
Session ID:
DC8147