GTC ON-DEMAND

 
Abstract:
The Kokkos library provides C++ HPC applications with a performance portable programming model for disparate manycore architectures such as NVIDIA Pascal, AMD Fusion, and Intel Xeon Phi. Until last year Kokkos supported only composition of data parallel patterns (foreach, reduce, and scan) with range and hierarchical team parallel execution policies. Our latest parallel pattern is a dynamic, directed acyclic graph (DAG) of heterogeneous tasks where each task supports internal data parallelism. At GTC16 we presented preliminary results based upon just-in-time access to an early release of NVIDIA CUDA 8. We've had a year to mature this highly challenging task-DAG capability and present results using the NVIDIA Pascal GPU.
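As a rough illustration of the task-DAG idea described in this abstract, the following is a minimal serial sketch in standard C++. It is not the Kokkos task interface: `Task`, `TaskGraph`, and `two_level_sum` are hypothetical names invented here, the graph is fixed before it runs rather than dynamic, and everything executes on one CPU thread. It shows only the ordering rule (a task starts after all of its predecessors finish) and a task body that loops over data internally.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <queue>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not part of Kokkos.
struct Task {
    std::function<void()> work;           // task body; may loop over data internally
    std::vector<std::size_t> successors;  // tasks that wait for this one
    std::size_t unmet;                    // number of predecessors
};

class TaskGraph {
public:
    // Add a task that runs only after every task in `deps` has finished.
    // Dependencies must name tasks added earlier, which keeps the graph acyclic.
    std::size_t add(std::function<void()> work,
                    const std::vector<std::size_t>& deps = {}) {
        const std::size_t id = tasks_.size();
        for (std::size_t d : deps) {
            if (d >= id) {
                throw std::out_of_range("dependency must name an earlier task");
            }
        }
        tasks_.push_back(Task{std::move(work), {}, deps.size()});
        for (std::size_t d : deps) {
            tasks_[d].successors.push_back(id);
        }
        return id;
    }

    // Run every task once, in an order that respects the dependencies.
    // Returns the order in which the tasks were executed.
    std::vector<std::size_t> run() {
        std::vector<std::size_t> unmet(tasks_.size());
        std::queue<std::size_t> ready;
        for (std::size_t i = 0; i < tasks_.size(); ++i) {
            unmet[i] = tasks_[i].unmet;
            if (unmet[i] == 0) ready.push(i);
        }
        std::vector<std::size_t> order;
        while (!ready.empty()) {
            const std::size_t id = ready.front();
            ready.pop();
            tasks_[id].work();
            order.push_back(id);
            // Finishing a task may make its successors ready.
            for (std::size_t s : tasks_[id].successors) {
                if (--unmet[s] == 0) ready.push(s);
            }
        }
        return order;
    }

private:
    std::vector<Task> tasks_;
};

// Two independent tasks each reduce half of `x`; a third task, which
// depends on both, combines the partial sums.
double two_level_sum(const std::vector<double>& x) {
    double left = 0.0, right = 0.0, total = 0.0;
    const std::ptrdiff_t mid = static_cast<std::ptrdiff_t>(x.size() / 2);
    TaskGraph g;
    const std::size_t a = g.add(
        [&] { left = std::accumulate(x.begin(), x.begin() + mid, 0.0); });
    const std::size_t b = g.add(
        [&] { right = std::accumulate(x.begin() + mid, x.end(), 0.0); });
    g.add([&] { total = left + right; }, {a, b});
    g.run();
    return total;
}
```

In this sketch the two half-range reductions have no dependency on each other, so a parallel scheduler would be free to run them concurrently; the serial queue here simply runs them one after the other.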
 
Topics:
HPC and Supercomputing, Tools & Libraries
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7253
 
Abstract:
Kokkos is a programming model developed at Sandia National Laboratories for enabling application developers to achieve performance portability for C++ codes. It is now the primary programming model at Sandia to port production-level applications to modern architectures, including GPUs. We'll discuss the core abstractions of Kokkos for parallel execution as well as data management, and how they are used to provide a critically important set of capabilities for the efficient implementation of a wide range of HPC algorithms. We'll present performance evaluations on a range of platforms to demonstrate the state of the art of performance portability. This will include data from Intel KNL-based systems as well as IBM Power8 with NVIDIA NVLink-connected NVIDIA Tesla P100 GPUs. We'll also provide an overview of how Kokkos fits into the larger exascale project at the Department of Energy, and how it is used to advance the development of parallel programming support in the C++ language standard.
 
Topics:
HPC and Supercomputing, Programming Languages
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7344
 
Abstract:
The Kokkos library provides C++ HPC applications with a performance portable programming model for disparate manycore architectures such as NVIDIA Kepler, AMD Fusion, and Intel Xeon Phi. Until this year, Kokkos supported only composition of data parallel patterns (foreach, reduce, and scan) with range and hierarchical team parallel execution policies. Our latest capability enhancement is the addition of a hierarchical task-DAG (directed acyclic graph) pattern and policy, where each task supports internal data parallelism. We present our GPU-suitable abstractions and interface for non-blocking task-DAG, and their application to incomplete sparse matrix factorization and graph triangle enumeration.
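For readers unfamiliar with the three data parallel patterns named in this abstract (foreach, reduce, and scan), here is a minimal serial sketch in standard C++. These are plain standard-library stand-ins, not Kokkos calls, and the function names `scale_each`, `sum_reduce`, and `prefix_scan` are hypothetical. The sketch only shows what each pattern computes, not how a performance portable library would schedule it.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper names; none of these are Kokkos functions.

// "foreach": apply an independent operation to every index in a range.
std::vector<double> scale_each(const std::vector<double>& x, double a) {
    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i];  // no iteration depends on any other
    }
    return y;
}

// "reduce": combine all elements into a single value (here, a sum).
double sum_reduce(const std::vector<double>& x) {
    return std::accumulate(x.begin(), x.end(), 0.0);
}

// "scan": inclusive prefix sum, out[i] = x[0] + ... + x[i].
std::vector<double> prefix_scan(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    std::partial_sum(x.begin(), x.end(), out.begin());
    return out;
}
```

For the input `{1, 2, 3}`, `scale_each` with `a = 2` gives `{2, 4, 6}`, `sum_reduce` gives `6`, and `prefix_scan` gives `{1, 3, 6}`.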
 
Topics:
HPC and Supercomputing, Tools & Libraries
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6145