GTC On-Demand

Abstract:
MATLAB's deep learning, visualization, and C++/CUDA code generation technology make it a uniquely complete solution for your entire AI workflow. In MATLAB, you can easily manage data, perform complex image and signal processing, prototype and train deep networks, and deploy to your desktop, embedded, or cloud environments. Using GPU Coder technology, MATLAB generates CUDA kernels that optimize loops and memory access, and C++ that leverages cuDNN and TensorRT, providing the fastest deep network inference of any framework. With MATLAB's Docker container, available through the NVIDIA GPU Cloud, you can now easily access all this AI power, deploy it in your cloud or DGX environment, and get up and running in seconds. In this presentation, we will demonstrate a complete end-to-end workflow that starts from 'docker run', prototypes and trains a network on a multi-GPU machine in the cloud, and ends with a highly optimized inference engine ready to deploy to data centers, clouds, and embedded devices.
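As a rough sketch of that workflow, the MATLAB code below trains a network across all GPUs on a cloud instance and then generates TensorRT-backed CUDA C++ with GPU Coder. The container tag in the comment, the trainingData/layers variables, and the myPredict entry-point function are illustrative placeholders, not taken from the session.

    % Shell step (illustrative container tag): launch MATLAB from NVIDIA GPU Cloud.
    %   docker run --gpus all -it --rm nvcr.io/partners/matlab:r2019a

    % Train across all available GPUs on the machine.
    opts = trainingOptions('sgdm', ...
        'ExecutionEnvironment', 'multi-gpu', ...
        'MaxEpochs', 10, ...
        'MiniBatchSize', 256);
    net = trainNetwork(trainingData, layers, opts);  % trainingData, layers: user-defined
    save('trainedNet.mat', 'net');

    % Entry point for code generation (saved separately as myPredict.m):
    %   function out = myPredict(in)
    %       persistent net;
    %       if isempty(net)
    %           net = coder.loadDeepLearningNetwork('trainedNet.mat');
    %       end
    %       out = predict(net, in);
    %   end

    % Generate CUDA C++ inference code that calls TensorRT.
    cfg = coder.gpuConfig('dll');
    cfg.TargetLang = 'C++';
    cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
    codegen -config cfg myPredict -args {ones(224,224,3,'single')}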
 
Topics: AI and DL Research, Data Center and Cloud Infrastructure
Type: Talk
Event: GTC Silicon Valley
Year: 2019
Session ID: S9469
 
Abstract:
Learn how to adopt a MATLAB-centric workflow to design, develop, scale, and deploy deep learning applications onto GPUs, whether on your desktop, on a cluster, or on embedded Tegra platforms, including Jetson TK1/TX1 and DRIVE PX boards. The workflow starts with algorithm design in MATLAB, which enjoys universal appeal among engineers and scientists because of its expressive power and ease of use. The algorithm may employ deep learning networks augmented with traditional computer vision techniques and can be tested and verified within MATLAB. Next, those networks are trained using MATLAB's GPU and parallel computing support, either on the desktop, on a local compute cluster, or in the cloud. Finally, a compiler auto-generates portable and optimized CUDA code from the MATLAB algorithm, which is then cross-compiled and deployed to the Tegra board. We'll use examples of common computer vision algorithms and deep learning networks to describe this workflow, and we'll present their performance benchmarks, including training with multiple GPUs on an Amazon P2 cloud instance.
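A minimal sketch of the final cross-compilation step, assuming the GPU Coder Support Package for NVIDIA GPUs; the board hostname, login credentials, and the myDetect entry-point function are hypothetical placeholders.

    % Connect to the Tegra board over the network (placeholder credentials).
    hwobj = jetson('jetson-board.local', 'ubuntu', 'ubuntu');

    % Cross-compile the MATLAB algorithm into a CUDA static library for the
    % board, using cuDNN for the deep learning layers.
    cfg = coder.gpuConfig('lib');
    cfg.TargetLang = 'C++';
    cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
    cfg.Hardware = coder.hardware('NVIDIA Jetson');
    cfg.Hardware.BuildDir = '~/remoteBuildDir';

    % myDetect.m is assumed to wrap the trained network plus any traditional
    % computer-vision pre- and post-processing.
    codegen -config cfg myDetect -args {ones(224,224,3,'single')}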
 
Topics: HPC and Supercomputing
Type: Talk
Event: SIGGRAPH
Year: 2017
Session ID: SC1706