GTC ON-DEMAND

 
Abstract:
This talk consists of two parts. In the first part, we explain how we use Tensor Cores to obtain extreme signal-processing performance. Tensor Cores are special-purpose matrix-multiplication units found in the latest GPUs and are designed to speed up deep learning. However, their use is not limited to deep learning: we show how a single Tesla V100 GPU can achieve speeds of up to 75 TFLOPS on signal-processing algorithms such as correlation and beamforming. In the second part, we explain how we solve the largest computational challenge in the imaging pipeline of modern radio telescopes. We describe how we implemented and optimized the novel Image-Domain Gridding algorithm on GPUs, and we compare its performance and energy efficiency with other devices. We show that our solution is an ideal candidate for the world's largest radio telescope (the Square Kilometre Array), as it meets the challenging performance and power-consumption constraints.
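Why correlation and beamforming suit matrix-multiplication hardware can be seen from a small sketch. It assumes that receiver samples are stored as a receivers-by-time matrix of complex numbers; correlating every receiver pair is then the product of that matrix with its conjugate transpose, and beamforming is the product of a weight matrix with the samples. The function names are hypothetical, and the code is plain Python for illustration only; it is not the implementation from the talk and does not use Tensor Cores or reduced precision.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][c] for k in range(inner)) for c in range(cols)]
            for row in a]


def conjugate_transpose(a):
    """Return the conjugate transpose of a complex matrix."""
    return [[a[r][c].conjugate() for r in range(len(a))]
            for c in range(len(a[0]))]


def correlate_pairwise(samples):
    """Reference correlator: for every receiver pair (i, j),
    accumulate x_i(t) * conj(x_j(t)) over all time samples."""
    n = len(samples)
    vis = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for xi, xj in zip(samples[i], samples[j]):
                vis[i][j] += xi * xj.conjugate()
    return vis


def correlate_matmul(samples):
    """Same correlator written as one matrix product: X times X^H."""
    return matmul(samples, conjugate_transpose(samples))


def beamform(weights, samples):
    """Beamforming as a matrix product:
    (beams x receivers) times (receivers x time) gives (beams x time)."""
    return matmul(weights, samples)
```

Both correlator functions compute the same receivers-by-receivers matrix; the second form is the one that maps directly onto a matrix-multiplication unit.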
 
Topics:
Performance Optimization, Astronomy & Astrophysics, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9306
 
Abstract:
We'll discuss how FPGAs are changing as a result of new technology such as the OpenCL high-level programming language, hard floating-point units, and tight integration with CPU cores. Although energy-efficient, FPGAs were traditionally considered notoriously difficult to program and unsuitable for complex HPC applications. We'll compare the latest FPGAs to GPUs, examining architecture, programming models, programming effort, performance, and energy efficiency on the basis of some real applications.
 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9338
 
Abstract:
We will present our latest results on Image-Domain Gridding, an algorithm for radio-astronomical imaging. This algorithm outperforms state-of-the-art traditional imaging algorithms in terms of both image quality (by applying more corrections) and performance. In this talk, we will first introduce the algorithm and then demonstrate that it works very well on highly parallel accelerators. We will show the in-depth performance analysis and the optimization techniques that we applied to get there.
 
Topics:
Astronomy & Astrophysics, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8128
 
Abstract:
Realizing the next generation of radio telescopes, such as the Square Kilometre Array, requires both hardware and algorithms that are more efficient than today's technology provides. We'll present our work on the recently introduced Image-Domain Gridding (IDG) algorithm, which tries to avoid the performance bottlenecks of traditional AW-projection gridding. We'll demonstrate how we implemented this algorithm on various architectures. By applying a modified roofline analysis, we show that our parallelization approaches and optimizations lead to nearly optimal performance on all architectures. The analysis also indicates that, by leveraging dedicated hardware to evaluate trigonometric functions, NVIDIA GPUs are much faster and more energy-efficient than regular CPUs. This makes IDG on GPUs a candidate for meeting the computational and energy-efficiency constraints of future telescopes.
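For readers unfamiliar with roofline analysis, the following is a minimal sketch of the standard, unmodified roofline model: attainable performance is the lesser of the peak compute rate and the memory bandwidth multiplied by the arithmetic intensity of the kernel. The modification used in the talk is not described in the abstract and is not reproduced here. The function names and the device figures are hypothetical and chosen only for illustration.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Standard roofline bound: the lesser of the compute ceiling and
    the memory ceiling (bandwidth times arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) above which a kernel is
    compute-bound rather than memory-bound."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical device: 10 TFLOPS peak, 500 GB/s memory bandwidth.
    peak, bandwidth = 10000.0, 500.0
    print("ridge point:", ridge_point(peak, bandwidth), "FLOP/byte")
    for intensity in (1.0, 20.0, 100.0):
        bound = attainable_gflops(peak, bandwidth, intensity)
        print(intensity, "FLOP/byte ->", bound, "GFLOPS attainable")
```

A kernel whose measured performance sits close to this bound at its own arithmetic intensity is near the optimum for that device under the model.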
 
Topics:
Astronomy & Astrophysics, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7125
 
 