GTC ON-DEMAND

Abstract:
Over the past year, numerous updates to the CUDA platform have been released for libraries, languages, and system software. These target a diverse range of features, from mixed-precision solvers and scalable programming models to memory management and applications of ray tracing in numerical methods. This talk presents a tour of all that's new and how to take advantage of it.
 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
Supercomputing
Year:
2019
Session ID:
SC1931
 
Abstract:
We'll discuss the new features of the latest CUDA release and what they mean for clients' applications and the work they do with GPUs. We'll also peer ahead at what the future holds for the platform that underlies all GPU programming and applications.
 
Topics:
AI & Deep Learning Research, Developer Tools
Type:
Talk
Event:
GTC Washington D.C.
Year:
2019
Session ID:
DC91187
 
Abstract:
CUDA is NVIDIA's parallel computing platform and programming model. You'll learn about new programming model enhancements and performance improvements in the latest release of CUDA, preview upcoming GPU programming technology, and gain insight into the philosophy driving the development of CUDA and how it will take advantage of current and future GPUs. You'll also learn about NVIDIA's vision for CUDA and the challenges for the future of parallel software development.
 
Topics:
Programming Languages
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9240
 
Abstract:

NVIDIA's DGX-2 system offers a unique architecture that connects 16 GPUs via the high-speed NVLink interface and the NVSwitch fabric, which enables unprecedented bandwidth between processors. This talk takes an in-depth look at the properties of this system, along with programming techniques to take maximum advantage of the system architecture.

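To make the programming angle concrete, the sketch below is illustrative only (not code from the session): it enables CUDA peer-to-peer access between every pair of GPUs in the node, which is the basic step that lets a kernel on one GPU dereference memory allocated on another GPU over NVLink and NVSwitch.

// Illustrative sketch (not from the talk): enable peer-to-peer access
// between every pair of GPUs visible in the node, e.g. the 16 GPUs of a DGX-2.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    for (int src = 0; src < deviceCount; ++src) {
        cudaSetDevice(src);
        for (int dst = 0; dst < deviceCount; ++dst) {
            if (src == dst) continue;
            int canAccess = 0;
            cudaDeviceCanAccessPeer(&canAccess, src, dst);
            if (canAccess) {
                // After this call, kernels running on 'src' can dereference
                // pointers allocated with cudaMalloc on 'dst' directly.
                cudaDeviceEnablePeerAccess(dst, 0);
                printf("Peer access enabled: GPU %d -> GPU %d\n", src, dst);
            }
        }
    }
    return 0;
}

On a DGX-2 this loop covers all 16 GPUs; on systems without an NVLink or NVSwitch path, peer access may simply be unavailable between some pairs.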
 
Topics:
Performance Optimization, Programming Languages, Algorithms & Numerical Techniques
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9241
 
Abstract:

CUDA 10.0, the latest major revision of the CUDA platform, was released in September and introduces support for the latest Turing GPU architecture along with a host of new features. This talk presents all the details, including a new graphs programming model, more flexible system support mechanisms, and, of course, the new capabilities offered by the Turing GPU.

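As a rough illustration of the graphs programming model mentioned above (the kernel, sizes, and names below are placeholders, not material from the talk), a sequence of work issued to a stream can be captured once and then relaunched repeatedly as a single graph:

// Illustrative sketch of CUDA graph stream capture (CUDA 10 and later).
// The 'scale' kernel and buffer size are placeholders, not from the talk.
#include <cuda_runtime.h>

__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main() {
    const int n = 1 << 20;
    float *d = nullptr;
    cudaMalloc((void**)&d, n * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Record the work issued to the stream into a graph instead of executing it.
    cudaGraph_t graph;
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d, 2.0f, n);
    cudaStreamEndCapture(stream, &graph);

    // Instantiate once, then launch the whole graph repeatedly with low overhead.
    cudaGraphExec_t exec;
    cudaGraphInstantiate(&exec, graph, nullptr, nullptr, 0);
    for (int i = 0; i < 100; ++i)
        cudaGraphLaunch(exec, stream);
    cudaStreamSynchronize(stream);

    cudaGraphExecDestroy(exec);
    cudaGraphDestroy(graph);
    cudaFree(d);
    return 0;
}

The payoff comes when the captured region contains many small kernels and copies: the launch cost is paid once at instantiation rather than on every iteration.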
 
Topics:
Programming Languages
Type:
Talk
Event:
Supercomputing
Year:
2018
Session ID:
SC1813
 
Abstract:
CUDA is NVIDIA's parallel computing platform and programming model. You'll learn about new programming model enhancements and performance improvements in the latest release of CUDA, preview upcoming GPU programming technology, gain insight into the philosophy driving the development of CUDA, and see how it will take advantage of current and future GPUs. You'll also learn about NVIDIA's vision for CUDA and the challenges for the future of parallel software development.
 
Topics:
HPC and Supercomputing
Type:
Talk
Event:
GTC Europe
Year:
2018
Session ID:
E8128
 
Abstract:

CUDA is NVIDIA's parallel computing platform and programming model. You'll learn about new programming model enhancements and performance improvements in the latest release of CUDA, preview upcoming GPU programming technology, and gain insight into the philosophy driving the development of CUDA and how it will take advantage of current and future GPUs. You'll also learn about NVIDIA's vision for CUDA and the challenges for the future of parallel software development.

 
Topics:
Programming Languages, Developer Tools
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8278
 
Abstract:
Systems with multiple GPUs in a single node are almost universal in the cloud and high-performance computing worlds, and are increasingly common in power-user desktop systems such as NVIDIA's DGX Station. Effective use of these GPUs is critical to scaling programs, but developers have typically treated them as independent machines. Targeting multiple GPUs from a single process offers the potential for far greater performance, especially with the advent of NVLink, which transforms the way these GPUs can cooperate. We will cover a number of techniques and pitfalls for direct multi-GPU programming in CUDA, then look in depth at one novel method of using NVLink to scale some programs with minimal effort.
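
The sketch below is one hypothetical example of the single-process approach, not code from the talk: one host thread launches work on every visible GPU and then moves a buffer between neighbouring devices with cudaMemcpyPeerAsync, which routes GPU-to-GPU over NVLink where it is available.

// Illustrative sketch (not from the talk): one process drives several GPUs,
// launching work on each and exchanging a buffer between neighbouring devices.
#include <vector>
#include <cuda_runtime.h>

__global__ void step(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;   // placeholder computation
}

int main() {
    int ngpu = 0;
    cudaGetDeviceCount(&ngpu);
    const int n = 1 << 20;

    std::vector<float*> buf(ngpu);
    std::vector<cudaStream_t> stream(ngpu);
    for (int g = 0; g < ngpu; ++g) {
        cudaSetDevice(g);
        cudaMalloc((void**)&buf[g], n * sizeof(float));
        cudaStreamCreate(&stream[g]);
    }

    // Launch a kernel on every GPU from the same host thread...
    for (int g = 0; g < ngpu; ++g) {
        cudaSetDevice(g);
        step<<<(n + 255) / 256, 256, 0, stream[g]>>>(buf[g], n);
    }
    // ...then copy each GPU's result to its neighbour; the runtime uses a direct
    // GPU-to-GPU path where possible and falls back to host staging otherwise.
    for (int g = 0; g + 1 < ngpu; ++g)
        cudaMemcpyPeerAsync(buf[g + 1], g + 1, buf[g], g, n * sizeof(float), stream[g]);

    for (int g = 0; g < ngpu; ++g) {
        cudaSetDevice(g);
        cudaStreamSynchronize(stream[g]);
    }
    return 0;
}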
 
Topics:
Performance Optimization, Programming Languages
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8670
 
Abstract:
Optimizing your code can be one of the most challenging tasks in GPU programming, but also one of the most rewarding: the performance difference between an initial version and well-tuned code can be a factor of 10 or more. Some optimizations can be quite straightforward while others require care and deep understanding of how the code is executing. A particular focus will be on optimization of the CPU part of your code, which is frequently overlooked even though it is often easier to tune and just as effective. Sometimes the biggest obstacle is just knowing what to look for, so we'll cover a range of techniques that everyone from beginners to CUDA ninjas might not have thought of before.
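
As one hypothetical example of the kind of host-side tuning the abstract alludes to (not necessarily a technique covered in the talk itself), replacing pageable host buffers with pinned memory lets transfers run truly asynchronously and overlap with kernel execution:

// Illustrative example of one host-side optimization (an assumption, not a
// technique confirmed by the talk): pinned host memory plus a stream so that
// the copy is asynchronous and the CPU is free while the GPU works.
#include <cuda_runtime.h>

__global__ void work(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];   // placeholder computation
}

int main() {
    const int n = 1 << 22;
    float *h_in, *d_in, *d_out;

    // cudaMallocHost gives page-locked (pinned) memory; unlike malloc'd buffers,
    // transfers from it can be truly asynchronous and reach full bus bandwidth.
    cudaMallocHost((void**)&h_in, n * sizeof(float));
    cudaMalloc((void**)&d_in, n * sizeof(float));
    cudaMalloc((void**)&d_out, n * sizeof(float));
    for (int i = 0; i < n; ++i) h_in[i] = float(i);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // The copy returns immediately and the kernel is queued behind it,
    // leaving the CPU free to prepare the next batch of work.
    cudaMemcpyAsync(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice, stream);
    work<<<(n + 255) / 256, 256, 0, stream>>>(d_in, d_out, n);
    cudaStreamSynchronize(stream);

    cudaFreeHost(h_in);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}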
 
Topics:
Developer Tools, Accelerated Data Science, HPC and Supercomputing
Type:
Talk
Event:
GTC Washington D.C.
Year:
2017
Session ID:
DC7112
 
Abstract:
The NVIDIA Volta architecture powers the world's most advanced data center GPU for AI, HPC, and graphics. Features like Independent Thread Scheduling and game-changing Tensor Cores enable Volta to simultaneously deliver the fastest and most accessible performance of any comparable processor. Join two lead hardware and software architects for Volta on a tour of the features that will make Volta the platform for your next innovation in AI and HPC supercomputing.
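
For readers who want to see what Tensor Core programming looks like at the CUDA level, here is a minimal sketch using the WMMA API (illustrative only, not material from the session): one warp multiplies a single 16x16x16 tile of FP16 inputs into an FP32 accumulator.

// Minimal illustrative use of Volta Tensor Cores via the CUDA WMMA API
// (compile for sm_70 or later). One warp computes one 16x16 output tile.
#include <cuda_runtime.h>
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

__global__ void tile_mma(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);           // C = 0
    wmma::load_matrix_sync(a_frag, a, 16);       // load a 16x16 FP16 tile of A
    wmma::load_matrix_sync(b_frag, b, 16);       // load a 16x16 FP16 tile of B
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);   // C += A * B on Tensor Cores
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

int main() {
    half *dA, *dB;
    float *dC;
    cudaMalloc((void**)&dA, 16 * 16 * sizeof(half));
    cudaMalloc((void**)&dB, 16 * 16 * sizeof(half));
    cudaMalloc((void**)&dC, 16 * 16 * sizeof(float));
    // ... fill dA and dB with FP16 data ...
    tile_mma<<<1, 32>>>(dA, dB, dC);   // one warp per WMMA tile
    cudaDeviceSynchronize();
    return 0;
}

A real kernel tiles whole matrices across many warps; libraries such as cuBLAS and cuDNN do this automatically, so hand-written WMMA is mainly for custom kernels.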
 
Topics:
Accelerated Data Science
Type:
Talk
Event:
SIGGRAPH
Year:
2017
Session ID:
SC1739
 
Abstract:

Optimizing your code can be one of the most challenging tasks in GPU programming, but also one of the most rewarding: the performance difference between an initial version and well-tuned code can be a factor of 10 or more. Some optimizations can be quite straightforward while others require care and deep understanding of how the code is executing. A particular focus will be on optimization of the CPU part of your code, which is frequently overlooked even though it is often easier to tune and just as effective. Sometimes the biggest obstacle is just knowing what to look for, so we'll cover a range of techniques that everyone from beginners to CUDA ninjas might not have thought of before.

 
Topics:
Performance Optimization, Accelerated Data Science, Algorithms & Numerical Techniques, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7122
 
Abstract:

Deep Learning is delivering the future today, enabling computers to perform tasks once thought possible only in science fiction. Innovations such as autonomous vehicles, speech recognition and advances in medical imaging will transform the world as we know it. GPUs are at the core of this transformation, providing the engines that power Deep Learning. In this session, we'll discuss the software tools NVIDIA provides to unlock the power of Deep Learning on GPUs. We'll provide an overview of NVIDIA's Deep Learning Software, including cuDNN and DIGITS, and pointers to maximize your experience with Deep Learning at GTC.

 
Topics:
Artificial Intelligence and Deep Learning
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6847
 
Abstract:
SpaceX is designing a new, methane-fueled engine powerful enough to lift the equipment and personnel needed to colonize Mars. A vital aspect of this effort involves the creation of a multi-physics code to accurately model a running rocket engine. The scale and complexity of turbulent non-premixed combustion has so far made it impractical to simulate, even on today's largest supercomputers. We present a novel approach using wavelets on GPUs, capable of capturing physics down to the finest turbulent scales.
 
Topics:
AEC & Manufacturing, Developer - Algorithms, Computational Physics, HPC and Supercomputing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2015
Session ID:
S5398
 
Abstract:
GPUs can push teraflops of mathematical power, but feeding the SMs with data can often be harder than optimizing your algorithm. A well-designed program must take into account both data access from within the GPU and the allocation and transfer of data between CPU and GPU. This talk will cover techniques including sub-allocation, shared memory management, and parallel memory structures such as stacks, queues, and ring buffers, which can greatly improve the throughput of your algorithms. 75% of programs are limited by memory bandwidth rather than compute power, so careful memory management is critical to a high-performance program.
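
As a minimal sketch of one such structure (the names and layout here are made up for illustration, not taken from the talk), the queue below lets many threads append results without locks by reserving slots with atomicAdd:

// Illustrative sketch of a lock-free output queue: each thread reserves a slot
// with atomicAdd (fetch-and-add) and writes its result there. Names are hypothetical.
#include <cuda_runtime.h>

struct Queue {
    float *items;
    unsigned int *count;     // number of items appended so far
    unsigned int capacity;
};

__device__ bool push(Queue q, float value) {
    unsigned int slot = atomicAdd(q.count, 1u);   // reserve a unique slot index
    if (slot >= q.capacity) return false;         // queue full; value is dropped
    q.items[slot] = value;
    return true;
}

__global__ void collect_positive(const float *in, int n, Queue q) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && in[i] > 0.0f)
        push(q, in[i]);       // only threads with a result touch the queue
}

int main() {
    const int n = 1 << 20;
    float *d_in;
    Queue q;
    q.capacity = n;
    cudaMalloc((void**)&d_in, n * sizeof(float));
    cudaMalloc((void**)&q.items, q.capacity * sizeof(float));
    cudaMalloc((void**)&q.count, sizeof(unsigned int));
    cudaMemset(q.count, 0, sizeof(unsigned int));
    // ... fill d_in with data ...
    collect_positive<<<(n + 255) / 256, 256>>>(d_in, n, q);
    cudaDeviceSynchronize();
    return 0;
}

Note that the counter can run past capacity under heavy load, so the host should clamp it to capacity when reading results back.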
 
Topics:
Performance Optimization, Programming Languages, Developer - Algorithms
Type:
Talk
Event:
GTC Silicon Valley
Year:
2015
Session ID:
S5530
 
Abstract:

NVIDIA provides tools that help you get the most out of your Android application. Come learn how to minimize your time to market while maximizing stability and performance. This session will cover native Android GPU and CPU debugging and profiling tools, including Nsight Tegra, the premier Android development environment for Microsoft Visual Studio.

 
Topics:
Mobile Summit
Type:
Talk
Event:
GTC Silicon Valley
Year:
2013
Session ID:
S3489
 
Abstract:

Atomic memory operations provide powerful communication and coordination capabilities for parallel programs, including the well-known operations compare-and-swap and fetch-and-add. Atomic operations enable the creation of parallel algorithms and data structures that would be very difficult (or impossible) to express without them: for example, shared parallel data structures, parallel data aggregation, and control primitives such as semaphores and mutexes. In this talk we will use examples to describe atomic operations, explain how they work, and discuss performance considerations and pitfalls when using them.

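To make the compare-and-swap primitive concrete, the following sketch uses the well-known CAS-loop pattern (a standard idiom, not code from the talk) to synthesize an atomic add for double, a type for which GPUs of that era had no native atomic add:

// Illustrative CAS-loop (the standard pattern, not code from the talk): build an
// atomic add for double out of atomicCAS, useful on GPUs without a native one.
#include <cuda_runtime.h>

__device__ double atomicAddDouble(double *address, double val) {
    unsigned long long *addr_as_ull = (unsigned long long *)address;
    unsigned long long old = *addr_as_ull, assumed;
    do {
        assumed = old;
        // Try to swap in (assumed + val); the swap succeeds only if nobody changed
        // the value in the meantime, otherwise 'old' holds the new value and we retry.
        old = atomicCAS(addr_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);
    return __longlong_as_double(old);
}

__global__ void sum(const double *in, int n, double *total) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAddDouble(total, in[i]);   // every thread folds its element into *total
}

int main() {
    const int n = 1 << 20;
    double *d_in, *d_total;
    cudaMalloc((void**)&d_in, n * sizeof(double));
    cudaMalloc((void**)&d_total, sizeof(double));
    cudaMemset(d_total, 0, sizeof(double));
    // ... fill d_in with data ...
    sum<<<(n + 255) / 256, 256>>>(d_in, n, d_total);
    cudaDeviceSynchronize();
    return 0;
}

Each retry costs another trip through the memory system, so heavily contended CAS loops can become exactly the kind of performance pitfall the abstract mentions.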
 
Topics:
Programming Languages, Developer - Algorithms
Type:
Talk
Event:
GTC Silicon Valley
Year:
2013
Session ID:
S2313
 
Abstract:

This presentation looks into the features of NVIDIA's latest Kepler GPU architecture. Join us as one of CUDA's language architects explains what's new, why it's exciting, and demonstrates the power of Kepler GPU accelerators with a real-time cosmology simulation in full 3D.

 
Topics:
HPC and AI
Type:
Talk
Event:
Supercomputing
Year:
2012
Session ID:
SC2032
 
Abstract:

The continuing evolution of the GPU brings with it new hardware capabilities and new functionality. Simultaneously, ongoing development of CUDA and its tools, libraries, and ecosystem brings new features to the software stack as well. Come and learn from one of CUDA's programming model architects about what's new in the GPU, what's coming in the next release of CUDA, how it works, and how it all fits together.

 
Topics:
Programming Languages
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2338
 
 
Topics:
Tools & Libraries
Type:
Webinar
Event:
GTC Webinars
Year:
2012
Session ID:
GTCE020
 
 
Topics:
Developer Tools
Type:
Webinar
Event:
GTC Webinars
Year:
2012
Session ID:
GTCE026
 
Speakers:
Stephen Jones
- NVIDIA
 
Topics:
Tools & Libraries
Type:
Talk
Event:
Supercomputing
Year:
2009
Session ID:
SC0902
 
 