GTC ON-DEMAND

 

Abstract:
Learn how to port SNAP, a multi-dimensional, multi-group discrete ordinates neutron transport code, to GPU clusters. The talk shows that GPUs are a good fit for this class of applications, enabling both higher throughput at small scale and better scalability at large scale. The porting strategy and a performance model for the GPU are described.
 
Topics:
Computational Physics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2014
Session ID:
S4164
 
Abstract:

Learn how the ENZO solvers were ported to GPUs. ENZO is a block-structured adaptive mesh refinement (AMR) astrophysical fluid dynamics code used to simulate cosmological structure formation, and it is one of the most commonly used community codes in astrophysics. We have ported the PPM hydrodynamics and magnetohydrodynamics solvers to the GPU and fully integrated the GPU solvers into the AMR framework. This talk describes the porting strategy and performance results.
 
Topics:
Astronomy & Astrophysics, Computational Physics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2013
Session ID:
S3401
 
Abstract:

Learn how to port legacy Fortran plasma codes to GPUs. Many legacy plasma codes are written in Fortran and run to many lines of code. We will discuss techniques for porting such legacy codes easily and efficiently to CUDA C/C++, and analyze the performance of the major algorithmic patterns in plasma codes. The discussion uses the GTC and GeFi plasma codes as realistic examples.
 
Topics:
Computational Physics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2245
 
Abstract:

SJTU-NS3D is an in-house CFD code co-developed by SJTU and COMAC for large civil airplanes. It solves the 3D Reynolds-averaged Navier-Stokes (RANS) equations on structured grids with a finite volume method and can be used in wing design. In this talk, we present the design and further optimization of the CUDA version of SJTU-NS3D, which achieves a 20-fold speedup for the standard M6 wing model and a 37-fold speedup for a candidate wing model from COMAC on a single Fermi C2050.
 
Topics:
Computational Fluid Dynamics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2251
 
Speakers:
Peng Wang
- NVIDIA
Abstract:
Learn how to optimize your OpenCL application to achieve maximum performance on NVIDIA GPUs. We will first briefly discuss how the OpenCL programming model maps onto the NVIDIA GPU architecture. We will then cover memory, instruction, and NDRange optimization techniques, illustrating each with small code samples.
 
Topics:
Tools & Libraries, HPC and AI
Type:
Talk
Event:
GTC Silicon Valley
Year:
2010
Session ID:
S09068
 
Speakers:
Peng Wang
- NVIDIA
Abstract:
Learn how to accelerate short-range molecular dynamics using CUDA C. We will cover building the neighbor list and calculating the forces on the GPU. To handle the case where a few particles have significantly more neighbors than most others, we propose a hybrid data structure for the neighbor list that strikes a good balance between performance and storage efficiency. A CUDA C implementation of the technique for Lennard-Jones forces can be found in the open-source LAMMPS molecular dynamics code.
 
Topics:
Molecular Dynamics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2010
Session ID:
2006
 
Abstract:

Adaptive mesh fluid simulations play a crucial role in many areas of astrophysical research, including the formation and explosion of stars and jets from black holes. Enzo, a parallel adaptive mesh multi-physics fluid code, has been widely used in the astrophysics community in recent years. In this talk I will describe a CUDA implementation of the finite volume fluid solver used in Enzo. The GPU version shows a significant speedup over the CPU version.
 
Topics:
Astronomy & Astrophysics, HPC and AI
Type:
Talk
Event:
GTC Silicon Valley
Year:
2009
Session ID:
S09062
 
Abstract:

In this session, we will discuss how to optimize OpenCL programs on NVIDIA GPUs, covering three main aspects: memory, execution configuration, and instruction throughput. For memory optimization, we will discuss how to increase bandwidth through global memory coalescing and the use of local memory. We will then introduce the concept of occupancy and the considerations involved in specifying the execution configuration of a kernel. Finally, we will discuss techniques for improving instruction throughput.
 
Topics:
Graphics and AI, Tools & Libraries, Professional Visualisation, Medical Imaging & Radiology
Type:
Talk
Event:
GTC Silicon Valley
Year:
2009
Session ID:
S09068
 
 