GTC ON-DEMAND

 
Abstract:

We'll take a deep dive into best practices and real-world examples of leveraging the power and flexibility of local GPU workstations, such as the DGX Station, to rapidly develop and prototype deep learning applications. This journey will take you from experimenting and iterating fast and often, to obtaining a trained model, to eventually deploying scale-out GPU inference servers in a data center. The tools covered are NGC (NVIDIA GPU Cloud), TensorRT, TensorRT Inference Server, and YAIS. NGC is a cloud registry of Docker images, TensorRT is an inference-optimizing compiler, TensorRT Inference Server is a containerized microservice that maximizes GPU utilization and runs multiple models from different frameworks concurrently on a node, and YAIS is a C++ library for developing compute-intensive asynchronous microservices using gRPC. The session will consist of a lecture, live demos, and detailed instructions covering inference compute options, pre- and post-processing considerations, inference serving options, monitoring considerations, and scalable workload orchestration using Kubernetes.

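To make the TensorRT step above concrete, here is a minimal sketch of building an optimized inference engine with the TensorRT Python API, assuming an ONNX model as the starting point. The model path, precision flag, and workspace size are illustrative placeholders rather than settings from this session, and exact calls vary between TensorRT releases.

```python
# Minimal sketch (assumed workflow, not this session's code): build a TensorRT
# engine from an ONNX model using the TensorRT 7/8-era Python API.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path="model.onnx"):  # placeholder model path
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30        # 1 GiB of build scratch space (assumed)
    if builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)  # use FP16 kernels where the GPU supports them

    return builder.build_engine(network, config)
```

The resulting engine can then be serialized and served by TensorRT Inference Server alongside models from other frameworks.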
 
Topics:
Artificial Intelligence and Deep Learning, Data Center & Cloud Infrastructure, Developer Tools
Type:
Talk
Event:
GTC Washington D.C.
Year:
2018
Session ID:
DC8147
 
Abstract:
We'll take a deep dive into best practices and real-world examples of leveraging the power and flexibility of local GPU workstations, such as the DGX Station, to rapidly develop and prototype deep learning applications. This journey will take you from experimenting and iterating fast and often, to obtaining a trained model, to eventually deploying scale-out GPU inference servers in a data center. The tools covered are NGC (NVIDIA GPU Cloud), TensorRT, TensorRT Inference Server, and YAIS.
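
For the deployment end of that journey, the sketch below shows roughly how a built TensorRT engine is executed with PyCUDA-managed device buffers. It assumes a single input binding and a single float32 output binding, a simplification for illustration; TensorRT Inference Server and YAIS wrap this kind of execution behind a network service.

```python
# Minimal sketch (assumed usage, not this session's code): execute a TensorRT
# engine with explicitly managed GPU buffers.
import numpy as np
import pycuda.autoinit        # creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt        # engine built elsewhere, e.g. by the previous sketch

def infer(engine, batch):
    """Run one synchronous inference; `batch` must match the engine's input shape and dtype."""
    context = engine.create_execution_context()
    output = np.empty(tuple(engine.get_binding_shape(1)),  # assumes binding 1 is the output
                      dtype=np.float32)

    d_input = cuda.mem_alloc(batch.nbytes)
    d_output = cuda.mem_alloc(output.nbytes)

    cuda.memcpy_htod(d_input, np.ascontiguousarray(batch))  # host -> device
    context.execute_v2([int(d_input), int(d_output)])       # run the engine
    cuda.memcpy_dtoh(output, d_output)                       # device -> host
    return output
```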
 
Topics:
Artificial Intelligence and Deep Learning, HPC and Supercomputing
Type:
Talk
Event:
GTC Europe
Year:
2018
Session ID:
E8150
 
Abstract:
We'll take a deep dive into best practices and real-world examples of leveraging the power and flexibility of local GPU workstations, such as the DGX Station, to rapidly develop and prototype deep learning applications. We'll demonstrate the setup of our small lab, which is capable of supporting a team of several developers and researchers, and our journey as we moved from lab to data center. Specifically, we'll walk through our experience of building the TensorRT Inference Demo, aka Flowers, used by Jensen to demonstrate the value of GPU computing at GTCs worldwide. As an added bonus, get first-hand insights into the latest advancements coming to AI workstations this year. The flexibility for fast prototyping provided by our lab was an invaluable asset as we experimented with different software and hardware components. As the models and applications stabilized and we moved from lab to data center, we were able to run fully load-balanced video inference over 64 V100s, demonstrating Software-in-the-Loop (SIL) ReSim capabilities for autonomous vehicles at GTC EU. Live, real-world examples will be given.
 
Topics:
Deep Learning & AI Frameworks, HPC and AI
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8263
 
 
Topics:
Deep Learning & AI Frameworks
Type:
Talk
Event:
SIGGRAPH
Year:
2017
Session ID:
SC1736
 
Abstract:

Building upon the foundational understanding of how deep learning is applied to image classification, this lab explores different approaches to the more challenging problem of detecting whether an object of interest is present within an image and recognizing its precise location. Numerous approaches have been proposed for training deep neural networks for this task, each with pros and cons in terms of model training time, model accuracy, and speed of detection during deployment. On completion of this lab, you will understand each approach and its relative merits. You'll receive hands-on training applying cutting-edge object detection networks trained using NVIDIA DIGITS on a challenging real-world dataset.

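Whichever detection architecture the lab settles on, deployment typically ends with the same post-processing step: keep high-confidence boxes and suppress overlapping duplicates. The snippet below is a generic illustration of that step (score thresholding plus greedy non-maximum suppression), not code from the lab or from DIGITS itself.

```python
# Generic illustration (not the lab's code): confidence filtering and greedy
# non-maximum suppression over a detector's output boxes.
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, score_thresh=0.5, iou_thresh=0.45):
    """Return indices of the boxes to keep, highest score first."""
    order = [i for i in np.argsort(scores)[::-1] if scores[i] >= score_thresh]
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```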
 
Topics:
Science and Research
Type:
Panel
Event:
GTC Washington D.C.
Year:
2016
Session ID:
DCL16105
 
Abstract:

In this lab, you will test three different approaches to deploying a trained DNN for inference. The first approach is to directly use the inference functionality within a deep learning framework, in this case DIGITS and Caffe. The second approach is to integrate inference within a custom application by using a deep learning framework API, again using Caffe, but this time through its Python API. The final approach is to use the NVIDIA GPU Inference Engine (GIE), which automatically creates an optimized inference runtime from a trained Caffe model and network description file. You will learn about the role of batch size in inference performance, as well as various optimizations that can be made in the inference process. You'll also explore inference for a variety of different DNN architectures.

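As a rough sketch of the second approach (inference through Caffe's Python API), the snippet below loads a trained classification network and runs a single forward pass. The file names and the "data"/"prob" blob names are conventional placeholders, not files distributed with this lab.

```python
# Minimal sketch (assumed file and blob names, not the lab's code):
# single-image inference through pycaffe.
import numpy as np
import caffe

caffe.set_mode_gpu()
net = caffe.Net("deploy.prototxt", "model.caffemodel", caffe.TEST)

image = np.random.rand(3, 224, 224).astype(np.float32)  # stand-in for a preprocessed image
net.blobs["data"].reshape(1, 3, 224, 224)                # batch size 1
net.blobs["data"].data[...] = image
probs = net.forward()["prob"][0]
print("predicted class:", int(probs.argmax()))
```

Packing more images into the input blob before calling forward() is the simplest way to observe the batch-size effects on inference performance that the lab discusses.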
 
Topics:
Science and Research
Type:
Instructor-Led Lab
Event:
GTC Washington D.C.
Year:
2016
Session ID:
DCL16106
 
Abstract:

Building upon the foundational understanding of how deep learning is applied to image classification, this lab explores different approaches to the more challenging problem of detecting whether an object of interest is present within an image and recognizing its precise location. Numerous approaches have been proposed for training deep neural networks for this task, each with pros and cons in terms of model training time, model accuracy, and speed of detection during deployment. On completion of this lab, you will understand each approach and its relative merits. You'll receive hands-on training applying cutting-edge object detection networks trained using NVIDIA DIGITS on a challenging real-world dataset.

 
Topics:
Science and Research
Type:
Instructor-Led Lab
Event:
GTC Washington D.C.
Year:
2016
Session ID:
DCL16114
 
Abstract:

Building upon the foundational understanding of how deep learning is applied to image classification, this lab explores different approaches to the more challenging problem of detecting whether an object of interest is present within an image and recognizing its precise location. Numerous approaches have been proposed for training deep neural networks for this task, each with pros and cons in terms of model training time, model accuracy, and speed of detection during deployment. On completion of this lab, you will understand each approach and its relative merits. You'll receive hands-on training applying cutting-edge object detection networks trained using NVIDIA DIGITS on a challenging real-world dataset.

 
Topics:
Science and Research
Type:
Instructor-Led Lab
Event:
GTC Washington D.C.
Year:
2016
Session ID:
DCL16112
 
Abstract:

Cray cluster systems have long been used to support supercomputing and scientific applications. In this talk, we'll demonstrate how these same systems can be easily configured to support Docker and, in turn, various machine learning software packages, including NVIDIA's DIGITS software. Additionally, these systems can be set up so that their Docker containers pull data from Cray's Sonexion scale-out Lustre storage system. With this configuration, our systems gain maximum application flexibility through Docker while simultaneously supporting the high-performance storage requirements of many types of machine learning workloads through a connection with our Lustre ecosystem.

 
Topics:
Artificial Intelligence and Deep Learning, Tools & Libraries, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6848
 
Abstract:

The distributed shared-memory implementation of the coupled-cluster singles and doubles with perturbative triples algorithm, CCSD(T), in the GAMESS chemistry package was ported to the GPU using the directive-based OpenACC standard. The focus of this port was to achieve maximum strong-scaling performance for small molecular systems (

 
Topics:
Quantum Chemistry, Programming Languages, Tools & Libraries
Type:
Talk
Event:
GTC Silicon Valley
Year:
2013
Session ID:
S3506
 
 