GTC ON-DEMAND

 

Algorithms & Numerical Techniques
One Kernel To Rule Them All. Performance-Portable FMM for CPUs and GPUs
Abstract:
We present a performance-portable C++ implementation of a scientific algorithm, the fast multipole method (FMM), maintained as a single code base that runs on both CPUs and GPUs. The core algorithm is embedded in a stack of abstraction layers, which lets us achieve portability without maintaining separate kernels for each architecture. We also review common implementation pitfalls, which may help other developers aiming at a unified code base; in particular, we investigate memory allocation, memory access, and the abstraction of SIMT for complex user-defined data structures. Finally, we compare the performance of the implementation on a CPU and a GPU.
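The single-kernel idea described in the abstract can be illustrated with a minimal sketch. This is not the poster's code: the names `HOST_DEVICE`, `Particle`, `P2PKernel`, `launch_cpu`, and `direct_potentials` are hypothetical, and the kernel shown is only a direct particle-particle summation (the near-field part of an FMM), assuming a Coulomb-type 1/r potential. The kernel body is written once as a functor; a macro adds CUDA qualifiers only when the file is compiled with nvcc, and a thin launcher layer decides where the functor runs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Expands to CUDA execution-space qualifiers only under nvcc;
// with a plain C++ compiler it expands to nothing.
#if defined(__CUDACC__)
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical particle type, introduced for illustration only.
struct Particle {
    double x, y, z;  // position
    double q;        // charge
};

// The one kernel body shared by all backends: potential at target i
// due to every other particle, by direct summation.
struct P2PKernel {
    const Particle* particles;
    std::size_t count;
    double* potential;

    HOST_DEVICE void operator()(std::size_t i) const {
        assert(i < count);  // bounds check on the target index
        double sum = 0.0;
        for (std::size_t j = 0; j < count; ++j) {
            if (j == i) continue;  // skip self-interaction
            const double dx = particles[i].x - particles[j].x;
            const double dy = particles[i].y - particles[j].y;
            const double dz = particles[i].z - particles[j].z;
            sum += particles[j].q / std::sqrt(dx * dx + dy * dy + dz * dz);
        }
        potential[i] = sum;
    }
};

// CPU backend of the launcher layer: a serial loop over target indices.
// A GPU backend would invoke the same functor from a __global__ function,
// one index per thread, on device-resident data.
template <typename Kernel>
void launch_cpu(std::size_t n, const Kernel& kernel) {
    for (std::size_t i = 0; i < n; ++i) {
        kernel(i);
    }
}

// Convenience wrapper: computes all potentials on the CPU backend.
std::vector<double> direct_potentials(const std::vector<Particle>& p) {
    std::vector<double> phi(p.size(), 0.0);
    launch_cpu(p.size(), P2PKernel{p.data(), p.size(), phi.data()});
    return phi;
}
```

Calling `direct_potentials` on a vector of particles returns one potential value per particle. Because the functor holds only raw pointers and a count, the same object could in principle be handed to a device launcher once the data lives in GPU-accessible memory, which is where the memory allocation and access pitfalls mentioned in the abstract come into play.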
 
Topics:
Algorithms & Numerical Techniques, HPC and Supercomputing
Type:
Poster
Event:
GTC Silicon Valley
Year:
2016
Session ID:
P6265