GTC ON-DEMAND

Abstract:
With computational rates in the teraflops, GPUs can accumulate round-off errors at an alarming rate. The errors are no different from those on other IEEE-754-compliant hardware, but GPUs are commonly used for far more intense calculations, so the concern for error is, or should be, correspondingly greater. In this talk, we'll examine the accumulation of round-off errors in the n-body application from the CUDA SDK, showing how much the results can vary depending on the order of operations. We'll then explore a solution that tracks the accumulated errors, motivated by the methods suggested by Kahan (Kahan summation) and by Gustavson, Moreira & Enenkel (from their work on stability and accuracy regarding Java portability). The result is a dramatic reduction in round-off error, typically yielding the floating-point value nearest the infinitely precise answer. Furthermore, we will show that the performance impact of tracking the errors is small, even on numerically intense algorithms such as the n-body algorithm.
Topics:
Numerical Algorithms & Libraries, Programming Languages, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2014
Session ID:
S4370
Abstract:

As a result of continuing improvements, NVIDIA offers GPU-accelerated floating-point performance in compliance with IEEE 754. In our experience, a number of issues related to floating-point accuracy and compliance are a frequent source of confusion on both CPUs and GPUs. The purpose of this talk is to discuss the most common issues related to NVIDIA GPUs and to supplement the documentation in the CUDA C Programming Guide.

 
Topics:
Developer - Algorithms
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2085
 
 