GTC ON-DEMAND

 

Performance Optimization
Optimize Deep FSMN Network
Abstract:
Learn how to speed up a deep feedforward sequential memory network (FSMN) on Volta. We'll describe how to use Tensor Cores to speed up GEMM operations and explain how to optimize an FSMN kernel by increasing its locality and reducing its math workload. Although RNNs are a powerful tool for sequence-to-sequence problems, their recurrent structure increases computational complexity. As an alternative, FSMN can effectively model long-term dependencies without using any recurrent structure. We'll show how a GPU-friendly FSMN can outperform an RNN in both accuracy and speed. Our work is based on Alibaba's deep FSMN model.
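
To illustrate the claim that FSMN models long-term dependencies without recurrence, here is a minimal sketch of an FSMN-style memory block. It assumes the commonly described formulation in which each output frame is an element-wise weighted sum of a fixed window of past and future hidden frames. It is not the kernel presented in the talk and does not use Tensor Cores; the names `fsmn_memory_block`, `look_back`, and `look_ahead` are hypothetical and chosen only for this example.

```python
def fsmn_memory_block(hidden, look_back, look_ahead):
    """Apply an FSMN-style memory block to a sequence of hidden frames.

    hidden:     list of T frames, each a list of D floats.
    look_back:  list of coefficient vectors a_0..a_N1 (each of length D),
                applied to the current frame and the N1 previous frames.
    look_ahead: list of coefficient vectors c_1..c_N2 (each of length D),
                applied to the N2 following frames.

    Each output frame depends only on a fixed window of input frames, so
    all time steps are independent of one another (no recurrent state).
    """
    num_frames = len(hidden)
    dim = len(hidden[0]) if num_frames else 0
    out = [[0.0] * dim for _ in range(num_frames)]

    for t in range(num_frames):
        # Current and past frames: sum_i a_i * h_{t-i} (element-wise).
        for i, coeffs in enumerate(look_back):
            if t - i >= 0:
                for d in range(dim):
                    out[t][d] += coeffs[d] * hidden[t - i][d]
        # Future frames: sum_j c_j * h_{t+j} (element-wise).
        for j, coeffs in enumerate(look_ahead, start=1):
            if t + j < num_frames:
                for d in range(dim):
                    out[t][d] += coeffs[d] * hidden[t + j][d]
    return out


# Toy input: three one-dimensional frames, one look-back tap, one look-ahead tap.
frames = [[1.0], [2.0], [3.0]]
result = fsmn_memory_block(frames, look_back=[[1.0], [0.5]], look_ahead=[[0.25]])
print(result)  # [[1.5], [3.25], [4.0]]
```

Because no output frame depends on another output frame, the loop over `t` can in principle be parallelized across time steps, which an RNN's step-by-step recurrence does not allow.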
 
Topics:
Performance Optimization, AI Application, Deployment & Inference, Speech & Language Processing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9113