GTC ON-DEMAND

 
Unified Memory for Data Analytics and Deep Learning
Abstract:
Unified Memory significantly improves productivity, while explicit memory management often provides better performance. We'll examine the performance of Unified Memory applications from key AI domains and describe memory-optimization techniques for finding the right balance of productivity and performance when you're developing applications. Unified Memory was designed for data analytics, to keep frequently accessed data in GPU memory. We'll analyze the performance of large analytic workloads and review bottlenecks for GPU oversubscription on PCIe and NVLink systems. We'll also discuss results from our study integrating Unified Memory in PyTorch for training deep neural networks. We found that Unified Memory matches the performance of explicit cudaMalloc allocation for workloads that fit in GPU memory. In addition, applications can oversubscribe the GPU, which facilitates using bigger batch sizes or training deeper models.
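The trade-off described above (no penalty when the working set fits in GPU memory, extra page migrations once the GPU is oversubscribed) can be illustrated with a toy model. The sketch below is not taken from the talk and does not call CUDA. It assumes fixed-size pages and a least-recently-used eviction policy, both simplifications chosen for illustration, and the function name `count_migrations` is hypothetical.

```python
from collections import OrderedDict


def count_migrations(accesses, gpu_pages):
    """Count host-to-GPU page migrations for a sequence of page accesses.

    Toy model only: GPU memory holds `gpu_pages` pages, a miss migrates the
    page in, and the least recently used page is evicted when memory is full.
    The LRU policy is an assumption, not a description of the real driver.
    """
    resident = OrderedDict()  # pages currently in GPU memory, oldest first
    migrations = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
            continue
        migrations += 1  # miss: page must be migrated to the GPU
        if len(resident) >= gpu_pages:
            resident.popitem(last=False)  # evict the least recently used page
        resident[page] = True
    return migrations


# A working set of 4 pages swept three times.
sweep = list(range(4)) * 3

fits = count_migrations(sweep, gpu_pages=4)            # 4: first touch only
oversubscribed = count_migrations(sweep, gpu_pages=3)  # 12: every access misses

# A frequently accessed page (page 0) stays resident even with little memory.
hot = count_migrations([0, 1, 0, 2, 0, 3, 0, 4], gpu_pages=2)  # 5

print(fits, oversubscribed, hot)
```

In this model, a cyclic sweep that exceeds capacity by a single page misses on every access, which is one way oversubscription can become a bottleneck, while a frequently reused page is migrated only once.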
 
Topics:
Performance Optimization, Deep Learning & AI Frameworks, Accelerated Data Science
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9726