GTC ON-DEMAND

 

Finance - Deep Learning
Discovering the Turing T4 GPU Architecture with Microbenchmarks
Abstract:

We'll take a deep dive into previously undisclosed architectural details of NVIDIA's Turing T4 cloud GPU, which we uncovered through microbenchmarks, and compare the architecture's features with those of previous NVIDIA GPU generations. We'll also reveal the geometry and latency of Turing's complex memory hierarchy, the format of its encoded instructions, and the latency of individual instructions. Learn how developers can use this knowledge to design workloads tailored to the characteristics of the T4 GPU. We'll also explain how to manually assemble binary code that maximizes dual issue and avoids bank conflicts, extracting bare-metal performance from the hardware.

 
Topics:
Finance - Deep Learning, Performance Optimization, HPC and AI
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9839