GTC ON-DEMAND

 

 
 


Artificial Intelligence and Deep Learning
Tuning Performance on Kepler GPUs: An Introduction to Kepler Assembler and Its Usage in CNN optimization
Abstract:
Learn advanced techniques for performance optimization on Kepler GPUs. NVIDIA provides many powerful tools for analyzing and improving the efficiency of CUDA kernels. In many specific cases, however, developers need finer-grained tuning to reach the expected performance. This session introduces a native assembler for the Kepler architecture that is used at Alibaba, and presents tuning experience from CNN and GEMM implementations built with this assembler as examples. If you are interested in assembly-level optimization and want to use such a tool on the Kepler architecture, you shouldn't miss this session!
 
Topics:
Artificial Intelligence and Deep Learning, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6173