We explore how a task-parallel model can be implemented on the GPU and discuss the concerns and programming techniques involved. We describe the primitives for building a task-parallel system on the GPU, including novel ideas for mapping tasking systems onto the GPU that address task granularity, load balancing, memory management, and dependency resolution. We also present several applications that demonstrate how a task-parallel model is more suitable than the regular data-parallel model: a Reyes renderer, a tiled deferred lighting renderer, and a video encoding demo.
We discuss the ideas and techniques behind programmable graphics pipelines on modern GPUs, using the design of a real-time Reyes renderer as an example. Walking through this example, we address the philosophy behind programmable GPU graphics, the broad strategy for this specific pipeline, and algorithmic and implementation-level details of key rendering stages. We cover several issues concerning GPU efficiency, including work scheduling, parallelization of traditional stages, and balancing of rendering workloads. We expect the audience to gain in-depth exposure to the state of research in programmable graphics and insight into efficient pipeline design for irregular workloads.