This tutorial builds on the two previous sessions (An Introduction to GPU Programming and the Introduction to GPU Memory Model) and is intended for those with a basic understanding of CUDA programming. This tutorial dives deep into asynchronous operations and how to maximize throughput on both the CPU and GPU with streams. We will demonstrate how to build a CPU/GPU pipeline and how to design your algorithm to take advantage of asynchronous operations. The second part of the session will focus on dynamic parallelism. A programming demo involving asynchronous operations will be delivered. Printed copies of the material will be provided to all attendees for each session – collect all four!
Get the lowdown on debugging and profiling your GPU program from Dan Cyca, Chief Technology Officer, Acceleware. This webinar dives deep into profiling techniques and the tools available to help you optimize your code. We will demonstrate NVIDIA’s Visual Profiler, nvcc flags, and cuobjdump, and highlight the various methods available for understanding the performance of your CUDA program. The second part of the webinar will focus on debugging techniques and the tools available to help you identify issues in your kernels. The latest debugging tools provided in CUDA 5.5, including Nsight and cuda-memcheck, will be presented.