GTC ON-DEMAND

 
Abstract:
We'll explain how to configure a system for benchmarking CUDA applications, point out common mistakes, and describe how to avoid them. Measuring performance in a deterministic and reproducible way is difficult. It is particularly challenging on GPU-accelerated heterogeneous systems, where complex interactions among CPUs, GPUs, the memory subsystem, the OS, and many other factors need to be properly addressed. We will cover topics such as power management, system topology, NUMA awareness, thread affinity, OS thread scheduling, and CUDA JIT caches.
Topics: Performance Optimization
Type: Talk
Event: GTC Silicon Valley
Year: 2019
Session ID: S9956