GTC ON-DEMAND

Artificial Intelligence and Deep Learning
Effectively Scaling Deep Learning Frameworks to 40 GPUs and Beyond
Abstract:
A variety of deep learning frameworks now make it simple to train deep neural networks of many types. However, scaling these frameworks to large models with data-parallel training on many GPUs remains a challenge, because the default utilities they provide for inter-device and inter-node communication are often not optimal. Using examples from several frameworks, we demonstrate that linear strong scaling to many nodes and many devices can be achieved by augmenting deep learning frameworks with CUDA-aware MPI allreduce and allgather operations, which allow them to be used in an HPC setting where multi-GPU nodes are connected by high-speed InfiniBand interconnects. We'll show that these operations allow us to quickly train very large speech recognition models.
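
To illustrate the kind of collective the abstract refers to, here is a minimal pure-Python sketch of a ring allreduce (a reduce-scatter phase followed by an allgather phase) that averages gradients across simulated data-parallel workers. It is not the implementation from the talk. The function names `ring_allreduce` and `allreduce_mean` are hypothetical, the workers are plain Python lists rather than GPUs, and a real deployment would instead call a CUDA-aware MPI library's `MPI_Allreduce` on device buffers.

```python
def ring_allreduce(buffers):
    """Sum equal-length vectors across simulated workers arranged in a ring.

    buffers: one list of floats per worker (hypothetical stand-in for GPU buffers).
    Returns a new list of lists; every worker ends up with the elementwise sum.
    """
    p = len(buffers)          # number of simulated workers
    n = len(buffers[0])       # vector length
    data = [list(b) for b in buffers]

    # Split the index range into p contiguous chunks (some may be empty).
    bounds = [(i * n) // p for i in range(p + 1)]

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    # Phase 1: reduce-scatter. In step s, worker r sends chunk (r - s) % p
    # to its right neighbour, which adds it into its own copy.
    for s in range(p - 1):
        msgs = [((r + 1) % p, (r - s) % p) for r in range(p)]
        # Snapshot payloads first so all sends in a step are "simultaneous".
        payloads = [[data[r][i] for i in chunk(c)] for r, (_, c) in enumerate(msgs)]
        for (dst, c), payload in zip(msgs, payloads):
            for i, v in zip(chunk(c), payload):
                data[dst][i] += v
    # Now worker r holds the fully reduced chunk (r + 1) % p.

    # Phase 2: allgather. In step s, worker r forwards chunk (r + 1 - s) % p
    # to its right neighbour, which overwrites its copy with the final values.
    for s in range(p - 1):
        msgs = [((r + 1) % p, (r + 1 - s) % p) for r in range(p)]
        payloads = [[data[r][i] for i in chunk(c)] for r, (_, c) in enumerate(msgs)]
        for (dst, c), payload in zip(msgs, payloads):
            for i, v in zip(chunk(c), payload):
                data[dst][i] = v
    return data


def allreduce_mean(gradients):
    """Average per-worker gradients, as in synchronous data-parallel SGD."""
    p = len(gradients)
    return [[v / p for v in worker] for worker in ring_allreduce(gradients)]


if __name__ == "__main__":
    # Four simulated workers, each holding a 6-element gradient.
    grads = [[float(r + i) for i in range(6)] for r in range(4)]
    print(allreduce_mean(grads)[0])
```

In this sketch each worker sends and receives roughly 2(p - 1)/p times the vector length in total, regardless of the number of workers p, which is one reason ring-style allreduce is a common choice for bandwidth-bound gradient exchange.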
 
Topics:
Artificial Intelligence and Deep Learning
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7543