GTC ON-DEMAND
Abstract:

We'll discuss the GPU-accelerated Monte Carlo compute at JP Morgan, which was architected for C1060 cards and revamped several times as new architectures were released. The key features of the code are exclusive use of double precision, data caching, and a structure in which a significant amount of CPU pre-compute is followed by multiple GPU kernels. On the latest devices, memory per flop is a throughput-limiting factor for a class of our GPU-accelerated models. As the byte/flop ratio continues to fall from one GPU generation to the next, we are exploring ways to re-architect the Monte Carlo simulation code to decrease memory requirements and improve the TCO of GPU-enabled compute. Obvious next steps are to store less, re-calculate more, and use unified memory.
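The talk itself contains no code, but the structure described above can be sketched as follows. This is a minimal, hypothetical illustration, not the production code from the talk: it assumes a plain geometric-Brownian-motion call pricer with made-up per-step drift/volatility constants and a strike of 100. The host pre-computes the per-step coefficients, and a double-precision kernel regenerates each path on the fly, keeping only one double per path in the "store less, re-calculate more" spirit.

// monte_carlo_sketch.cu -- hypothetical sketch, not the code from the talk.
// CPU pre-compute feeds small read-only tables to a double-precision kernel
// that advances each path step by step and stores only the final payoff.
#include <cuda_runtime.h>
#include <curand_kernel.h>
#include <vector>

__global__ void simulate_paths(const double* drift, const double* vol, int n_steps,
                               unsigned long long seed, double* payoff, int n_paths)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= n_paths) return;

    curandState rng;
    curand_init(seed, p, 0, &rng);

    double s = 100.0;                         // spot, advanced in place, never stored per step
    for (int t = 0; t < n_steps; ++t) {
        double z = curand_normal_double(&rng);
        s *= exp(drift[t] + vol[t] * z);      // small per-step tables cache well on-chip
    }
    payoff[p] = fmax(s - 100.0, 0.0);         // call payoff at maturity only
}

int main()
{
    const int n_paths = 1 << 20, n_steps = 252;

    // CPU pre-compute: per-step drift/volatility terms (illustrative constants here;
    // in practice these would come from host-side calibration).
    std::vector<double> h_drift(n_steps, -5e-5), h_vol(n_steps, 0.01);

    double *d_drift, *d_vol, *d_payoff;
    cudaMalloc(&d_drift, n_steps * sizeof(double));
    cudaMalloc(&d_vol, n_steps * sizeof(double));
    cudaMalloc(&d_payoff, n_paths * sizeof(double));
    cudaMemcpy(d_drift, h_drift.data(), n_steps * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_vol, h_vol.data(), n_steps * sizeof(double), cudaMemcpyHostToDevice);

    simulate_paths<<<(n_paths + 255) / 256, 256>>>(d_drift, d_vol, n_steps,
                                                   1234ULL, d_payoff, n_paths);
    cudaDeviceSynchronize();

    // A second kernel (or a thrust::reduce) would average and discount the payoffs.
    cudaFree(d_drift); cudaFree(d_vol); cudaFree(d_payoff);
    return 0;
}

Regenerating the path from the RNG stream instead of materialising it trades extra flops for fewer bytes, which is exactly the direction a falling byte/flop ratio favours.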

 
Topics:
Consumer Engagement & Personalization, Finance - Quantitative Risk & Derivative Calculations
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8802
 
Abstract:

The Pascal generation of GPUs brings increased compute density to data centers, and NVLink on IBM POWER8 CPUs makes this compute density ever more accessible to HPC applications. However, reduced memory-to-compute ratios present some unique challenges for the cost of throughput-oriented compute. We'll present a case study of moving production Monte Carlo GPU codes to IBM's "Minsky" S822L servers with NVIDIA Tesla P100 GPUs.
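As a rough illustration of the unified-memory direction mentioned in the earlier abstract (not code from this case study), the sketch below assumes a Pascal-class GPU, where a managed allocation may exceed device memory and pages migrate on demand; on an NVLink-attached host such as the Minsky that migration runs over a much wider pipe than PCIe. The kernel and array names are hypothetical.

// unified_memory_sketch.cu -- minimal, hypothetical unified-memory example.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(double* x, double a, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main()
{
    int dev = 0;
    cudaGetDevice(&dev);

    size_t n = size_t(1) << 28;                 // ~2 GiB of doubles; could exceed device memory
    double* x = nullptr;
    cudaMallocManaged(&x, n * sizeof(double));  // one pointer valid on host and device
    for (size_t i = 0; i < n; ++i) x[i] = 1.0;  // touched first on the host

    // Optional hint: migrate the pages to the GPU up front instead of faulting them in.
    cudaMemPrefetchAsync(x, n * sizeof(double), dev, 0);

    scale<<<(unsigned int)((n + 255) / 256), 256>>>(x, 2.0, n);
    cudaDeviceSynchronize();

    printf("x[0] = %f\n", x[0]);                // pages migrate back on host access
    cudaFree(x);
    return 0;
}

Prefetching ahead of the launch keeps demand-paging faults out of the kernel's critical path while still letting the working set spill past device memory.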

 
Topics:
Finance, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7668
 
 