GTC ON-DEMAND

 
Abstract:
Learn the best way to introduce Tensor Core acceleration into HPC applications, followed by a quick introduction to Tensor Core architecture and functionality. This session will also present case studies of HPC applications that use Tensor Cores.
 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
Supercomputing
Year:
2019
Session ID:
SC1909
 
Abstract:
This session covers the performance features released in the latest version of CUDA, new features of the Turing architecture, and a range of optimization techniques, with in-depth information on getting the most out of the Volta and Turing GPU architectures.
 
Topics:
Developer Tools
Type:
Talk
Event:
GTC Israel
Year:
2018
Session ID:
SIL8140
 
Abstract:

We'll discuss how NVIDIA IndeX Advanced Rendering Tools help researchers gain more insight through in-situ visualizations. HPC applications have always centered on large computations, small inputs, and extremely large simulated outputs. HPC applications on big supercomputers run through a queuing system, so researchers have to wait a couple of hours before they can analyze the outputs. We've designed essential software components that allow in-situ visualization of sparse volume data from the ALYA multiphysics simulation code (Barcelona Supercomputing Center) using NVIDIA IndeX. ALYA is one of the two European exascale benchmarks and is used in targeted medicine, cardiac modeling, renewable energy, and other fields. We'll guide you through the techniques used to enable in-situ rendering and analysis of data.

 
Topics:
In-Situ & Scientific Visualization, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7199
 
Abstract:

We explore NVIDIA advanced rendering products for visualization. HPC generates terabytes of simulation data. Post-processing and visualizing this data can give scientists and engineers good insight into the simulation process. Large-scale visualizations require a distributed rendering process for a seamless experience. In this talk we describe our experience and techniques in using NVIDIA rendering products for bio-mechanical simulation of the human respiratory system.

 
Topics:
HPC and Supercomputing, Visualization - In-Situ & Scientific
Type:
Talk
Event:
Supercomputing
Year:
2016
Session ID:
SC6103
 
Abstract:
Learn to interface CUDA kernels, CUDA library APIs, and driver APIs with existing Fortran applications in HPC. This session introduces the Alya multi-physics code developed at Barcelona Supercomputing Centre. The code is based on Fortran 95 and scales across thousands of cores. We describe in depth how to port computationally heavy modules from Fortran to CUDA, and how to use CUDA features such as dynamic parallelism, CUDA streams, unified memory, and error handling in Fortran applications with the NVCC compiler. We also discuss future directions using next-generation programming models such as OmpSs for hybrid CPU and GPU computing. The presentation includes various example codes for improving the programming skills of the scientific community.
 
Topics:
Performance Optimization, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6318
 
Abstract:
Learn to exploit CUDA features to save energy, and with it money. This session introduces the Pedraforca prototype developed at Barcelona Supercomputing Centre under the Mont-Blanc project. The prototype is based on NVIDIA® Tegra® and NVIDIA® Tesla® platforms and aims to reduce the raw power footprint of HPC clusters. The session describes in depth how to exploit CUDA dynamic parallelism and CUDA streams when porting GPU applications to low-power ARM-based prototypes. It also includes an architectural description of the prototype, power budget comparisons, and various example codes for improving the programming skills of CUDA users.
 
Topics:
Performance Optimization, Intelligent Machines, IoT & Robotics, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2015
Session ID:
S5384
 
Abstract:
The increased demand for higher-resolution, more detailed SAR imaging puts pressure on the processing power of existing systems for real-time or near-real-time processing. Exploiting GPU processing power could meet these growing demands. This poster presents results and analysis of parallelizing the Range-Doppler algorithm for SAR imaging and compares computation time on a traditional CPU with that on the NVIDIA Tesla platform.
 
Topics:
Programming Languages, Signal and Audio Processing
Type:
Poster
Event:
GTC Silicon Valley
Year:
2013
Session ID:
P3117
 
 