We present the Alexa shock hydrodynamics code, built using Kokkos, and its performance on hardware including Intel KNL and NVIDIA P100 (the P100 being twice as fast as the KNL). Alexa performs 3D simulations of multiple materials undergoing large deformation at high energies. One goal of Alexa is to let users run complex simulations on laptops.
Performance on manycore devices depends on data access patterns, and different devices (NVIDIA, Intel Xeon Phi, NUMA) require different patterns. A performance-portable programming model does not force a false choice between arrays-of-structures and structures-of-arrays; instead, it defines abstractions that transparently adapt data structures to meet device requirements. The KokkosArray library implements this strategy through simple and intuitive multidimensional array abstractions. Usability and performance portability are demonstrated with proxy applications for finite element and molecular dynamics codes. MiniMD, a proxy application for the LAMMPS molecular dynamics code, has implementations in OpenMP, OpenCL, CUDA, and now KokkosArray. A comparison of miniMD's KokkosArray implementation with the previous three versions demonstrates the relative strengths and weaknesses of KokkosArray, and shows that the portable version retains about 95% of the performance of the "native" versions. Multiphysics applications with heterogeneous finite element discretizations have complex and highly irregular data structures. A KokkosArray-based prototype of an unstructured heterogeneous finite element mesh library, and its support for heterogeneous manycore parallel computations, will be presented.
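The idea of adapting the data layout without changing the computational kernel can be illustrated with a minimal, standalone C++ sketch. It does not use KokkosArray or Kokkos; the types `LayoutRight`, `LayoutLeft`, `Array2D`, and the functions `make_velocities` and `sum_of_squares` are hypothetical stand-ins written for this illustration only. The assumption is that a layout policy, chosen at compile time, maps a logical index (i, j) to a memory offset, so the same kernel source can run over an array-of-structures-like or a structure-of-arrays-like arrangement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout policies (illustration only, not the KokkosArray API).
// Each maps a logical 2D index (i, j) to an offset in one flat allocation.
struct LayoutRight {  // rightmost index contiguous: array-of-structures style
  static std::size_t offset(std::size_t i, std::size_t j,
                            std::size_t /*n0*/, std::size_t n1) {
    return i * n1 + j;
  }
};

struct LayoutLeft {  // leftmost index contiguous: structure-of-arrays style
  static std::size_t offset(std::size_t i, std::size_t j,
                            std::size_t n0, std::size_t /*n1*/) {
    return i + j * n0;
  }
};

// Minimal 2D array whose memory layout is a compile-time policy.
template <class T, class Layout>
class Array2D {
 public:
  Array2D(std::size_t n0, std::size_t n1)
      : n0_(n0), n1_(n1), data_(n0 * n1) {}

  T& operator()(std::size_t i, std::size_t j) {
    assert(i < n0_ && j < n1_);
    return data_[Layout::offset(i, j, n0_, n1_)];
  }
  const T& operator()(std::size_t i, std::size_t j) const {
    assert(i < n0_ && j < n1_);
    return data_[Layout::offset(i, j, n0_, n1_)];
  }

  std::size_t extent0() const { return n0_; }
  std::size_t extent1() const { return n1_; }
  const T* data() const { return data_.data(); }

 private:
  std::size_t n0_;
  std::size_t n1_;
  std::vector<T> data_;
};

// Fill an n-by-3 array with v(i, j) = 10 * i + j (values chosen so that
// the logical index is easy to recognize in raw memory).
template <class Layout>
Array2D<double, Layout> make_velocities(std::size_t n) {
  Array2D<double, Layout> v(n, 3);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < 3; ++j)
      v(i, j) = 10.0 * static_cast<double>(i) + static_cast<double>(j);
  return v;
}

// Layout-agnostic kernel: the source is identical for every layout.
template <class Layout>
double sum_of_squares(const Array2D<double, Layout>& v) {
  double sum = 0.0;
  for (std::size_t i = 0; i < v.extent0(); ++i)
    for (std::size_t j = 0; j < v.extent1(); ++j)
      sum += v(i, j) * v(i, j);
  return sum;
}
```

Calling `sum_of_squares(make_velocities<LayoutRight>(n))` and `sum_of_squares(make_velocities<LayoutLeft>(n))` gives the same result, while the second element in raw memory is `v(0, 1)` in the first case and `v(1, 0)` in the second. Which arrangement performs better on a given device is the device-specific question the abstract refers to; this sketch only shows that the choice can be isolated from the kernel.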