GTC ON-DEMAND

Abstract:
Learn how to unleash the full power of GPUs on one of the more difficult problems -- preconditioning in sparse solvers -- by using fast N-body methods as a preconditioner. Fast N-body methods have been able to achieve a high percentage of peak performance since the early days of GPU computing. However, their successful applications have been limited to astrophysics and molecular dynamics, where the physics itself is naturally described by a collection of discrete points. Mathematically, there is nothing that prevents the use of fast N-body methods as a solver for a more general class of PDEs. This would not have been a good idea back when flops were expensive, since it essentially turns the sparse matrix into a dense matrix of the same size before hierarchically grouping the off-diagonal blocks. But now that flops are becoming comparatively cheap, the notion of a "compute-bound preconditioner" sounds more attractive than ever. We will demonstrate how competitive such a preconditioner actually is on Kepler.
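
A minimal sketch of what one application of such a preconditioner looks like is given below; the kernel, variable names, and softening parameter are illustrative assumptions, not the speaker's ExaFMM code. It computes z = M^-1 r as a Green's-function sum over the discretization points, which is the compute-bound operation the abstract refers to; a fast N-body method would replace the O(N^2) loop with a hierarchical O(N) evaluation.

// Sketch (illustrative, not the speaker's code): one application of an N-body
// preconditioner, z_i = sum_j G(x_i, x_j) r_j, evaluated as a direct
// particle-to-particle sum. An FMM would approximate the far field hierarchically.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void p2p_apply(const float3* pos, const float* r, float* z, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    const float3 xi = pos[i];
    float acc = 0.0f;
    for (int j = 0; j < n; ++j) {                  // dense sum; the FMM groups this by tree cells
        float dx = pos[j].x - xi.x;
        float dy = pos[j].y - xi.y;
        float dz = pos[j].z - xi.z;
        float r2 = dx*dx + dy*dy + dz*dz + 1e-12f; // softened to avoid the j == i singularity
        acc += r[j] * rsqrtf(r2);                  // Laplace kernel G ~ 1 / |x_i - x_j|
    }
    z[i] = acc * 0.0795774715f;                    // 1 / (4 * pi)
}

int main()
{
    const int n = 1 << 14;
    float3* pos;  float *r, *z;
    cudaMallocManaged(&pos, n * sizeof(float3));
    cudaMallocManaged(&r,   n * sizeof(float));
    cudaMallocManaged(&z,   n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        pos[i] = make_float3(rand() / (float)RAND_MAX,
                             rand() / (float)RAND_MAX,
                             rand() / (float)RAND_MAX);
        r[i] = 1.0f;   // stands in for the residual handed over by the Krylov solver
    }
    p2p_apply<<<(n + 255) / 256, 256>>>(pos, r, z, n);
    cudaDeviceSynchronize();
    printf("z[0] = %f\n", z[0]);
    return 0;
}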
 
Topics:
Numerical Algorithms & Libraries, Computational Physics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2014
Session ID:
S4228
 
Abstract:
Learn how to leverage current numerical algorithms for solving challenging reservoir and seismic simulation problems on GPUs using: 1) a novel preconditioner technique based on massively parallel, compute-intensive fast N-body methods; 2) an optimized implementation of the sparse matrix-vector multiplication (SpMV) used during the iterative solver phase, which exploits the existing structure of the sparse matrix; and 3) a synchronization-reducing algorithm for stencil-based computation during explicit time integration.
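
For the stencil part (item 3), a minimal baseline sketch is given below: a second-order explicit update of the 2D wave equation with one kernel launch, and hence one global synchronization, per time step. The grid size, coefficient, and boundary handling are illustrative assumptions, and the synchronization-reducing variant mentioned in the abstract (which would advance several time steps per launch) is not shown.

// Baseline explicit time integration of the 2D wave equation (illustrative sketch).
#include <cuda_runtime.h>

__global__ void wave_step(const float* u_prev, const float* u_curr, float* u_next,
                          int nx, int ny, float c2dt2_h2)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i <= 0 || j <= 0 || i >= nx - 1 || j >= ny - 1) return;  // keep the boundary fixed

    int idx = j * nx + i;
    float lap = u_curr[idx - 1] + u_curr[idx + 1]
              + u_curr[idx - nx] + u_curr[idx + nx]
              - 4.0f * u_curr[idx];                              // 5-point Laplacian
    // Leapfrog update: u^{n+1} = 2 u^n - u^{n-1} + (c dt / h)^2 * Laplacian(u^n)
    u_next[idx] = 2.0f * u_curr[idx] - u_prev[idx] + c2dt2_h2 * lap;
}

int main()
{
    const int nx = 512, ny = 512, steps = 100;
    const size_t bytes = nx * ny * sizeof(float);
    float *u_prev, *u_curr, *u_next;
    cudaMallocManaged(&u_prev, bytes);
    cudaMallocManaged(&u_curr, bytes);
    cudaMallocManaged(&u_next, bytes);
    cudaMemset(u_prev, 0, bytes);
    cudaMemset(u_curr, 0, bytes);
    cudaMemset(u_next, 0, bytes);
    cudaDeviceSynchronize();                        // finish the memsets before host access
    u_curr[(ny / 2) * nx + nx / 2] = 1.0f;          // point source in the middle of the grid

    dim3 block(16, 16), grid((nx + 15) / 16, (ny + 15) / 16);
    for (int t = 0; t < steps; ++t) {
        wave_step<<<grid, block>>>(u_prev, u_curr, u_next, nx, ny, 0.25f);
        // rotate the three time levels; one launch (one global synchronization) per step
        float* tmp = u_prev; u_prev = u_curr; u_curr = u_next; u_next = tmp;
    }
    cudaDeviceSynchronize();
    return 0;
}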
 
Topics:
Seismic & Geosciences, Numerical Algorithms & Libraries
Type:
Talk
Event:
GTC Silicon Valley
Year:
2014
Session ID:
S4287
 
Abstract:

Reservoir simulations involve sparse iterative solvers for linear systems that arise from implicit discretizations of coupled PDEs in high-fidelity reservoir simulators. One of the major bottlenecks in these solvers is the sparse matrix-vector product. Sparse matrices are usually compressed in some format (e.g., CSR, ELL) before being processed. In this talk, we focus on the low-level design of a sparse matrix-vector (SpMV) kernel on GPUs. Most of the relevant contributions focus on introducing new formats that suit the GPU architecture, such as the diagonal format for diagonal matrices and the blocked-ELL format for sparse matrices with small dense blocks. However, we target both generic and domain-specific implementations. Generic implementations target the CSR and ELL formats, in order to be part of the KAUST-BLAS library. More opportunities for optimization appear when the matrix has specific structure. We will present the major design challenges, outline our approach, and show preliminary results, with the primary focus on the CSR format. The other bottleneck of reservoir simulations is the preconditioning in the sparse matrix solver. We investigate the possibility of a Fast Multipole Method-based technique on GPUs as a compute-bound preconditioner.
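
As a concrete reference point for the kernel-design discussion, the sketch below shows the simplest scalar (one-thread-per-row) CSR SpMV kernel together with a tiny test matrix; both are illustrative assumptions, and the tuned variants discussed in the talk (warp-per-row CSR, ELL, and structure-exploiting formats) are not reproduced here.

// Scalar CSR SpMV: y = A * x, one thread per row (illustrative baseline).
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

__global__ void spmv_csr_scalar(int n_rows, const int* row_ptr, const int* col_idx,
                                const float* val, const float* x, float* y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n_rows) return;
    float sum = 0.0f;
    for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k)   // walk the nonzeros of this row
        sum += val[k] * x[col_idx[k]];
    y[row] = sum;
}

int main()
{
    // 3x3 example:  [ 4 -1  0 ]
    //               [-1  4 -1 ]
    //               [ 0 -1  4 ]
    int   h_row_ptr[] = {0, 2, 5, 7};
    int   h_col_idx[] = {0, 1, 0, 1, 2, 1, 2};
    float h_val[]     = {4, -1, -1, 4, -1, -1, 4};
    float h_x[]       = {1, 2, 3};

    int *row_ptr, *col_idx;  float *val, *x, *y;
    cudaMallocManaged(&row_ptr, sizeof(h_row_ptr));
    cudaMallocManaged(&col_idx, sizeof(h_col_idx));
    cudaMallocManaged(&val,     sizeof(h_val));
    cudaMallocManaged(&x,       sizeof(h_x));
    cudaMallocManaged(&y,       3 * sizeof(float));
    memcpy(row_ptr, h_row_ptr, sizeof(h_row_ptr));
    memcpy(col_idx, h_col_idx, sizeof(h_col_idx));
    memcpy(val,     h_val,     sizeof(h_val));
    memcpy(x,       h_x,       sizeof(h_x));

    spmv_csr_scalar<<<1, 32>>>(3, row_ptr, col_idx, val, x, y);
    cudaDeviceSynchronize();
    printf("y = [%g %g %g]\n", y[0], y[1], y[2]);   // expect [2 4 10]
    return 0;
}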
 
Topics:
Developer - Algorithms, Seismic & Geosciences
Type:
Talk
Event:
GTC Silicon Valley
Year:
2013
Session ID:
S3449
 
Abstract:

See the newest developments in the area of hierarchical N-body methods for GPU computing. Hierarchical N-body methods have O(N) complexity, are compute bound, and require very little synchronization, which makes them a favorable algorithm on next-generation supercomputers. In this session we will cover topics such as hybridization of treecodes and fast multipole methods, auto-tuning kernels for heterogeneous systems, fast tree construction based on prefix sums, fast load balancing of global trees, and more. Examples will be given using ExaFMM, an open-source hierarchical N-body library for heterogeneous systems developed by the speaker and released at SC11.
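
One ingredient of the prefix-sum-based tree construction mentioned above is assigning each body a Morton key, so that sorting by key and scanning key prefixes yields the tree cells. The sketch below shows only the standard bit-interleaving step for a 30-bit key; it is an illustrative stand-in, not the ExaFMM implementation.

// Morton (Z-order) key for a point in the unit cube (illustrative sketch).
#include <cstdio>
#include <cstdint>

// Spread the lower 10 bits of v so there are two zero bits between each of them.
__host__ __device__ uint32_t expand_bits(uint32_t v)
{
    v = (v * 0x00010001u) & 0xFF0000FFu;
    v = (v * 0x00000101u) & 0x0F00F00Fu;
    v = (v * 0x00000011u) & 0xC30C30C3u;
    v = (v * 0x00000005u) & 0x49249249u;
    return v;
}

// 30-bit Morton code for a point with coordinates in [0, 1).
__host__ __device__ uint32_t morton3d(float x, float y, float z)
{
    uint32_t xi = (uint32_t)(x * 1024.0f);   // quantize each axis to a 2^10 grid
    uint32_t yi = (uint32_t)(y * 1024.0f);
    uint32_t zi = (uint32_t)(z * 1024.0f);
    return (expand_bits(xi) << 2) | (expand_bits(yi) << 1) | expand_bits(zi);
}

int main()
{
    printf("key of (0.5, 0.25, 0.75) = 0x%08x\n", morton3d(0.5f, 0.25f, 0.75f));
    return 0;
}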
 
Topics:
Developer - Algorithms
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2308
 
 