GTC ON-DEMAND

 
Abstract:
Automatic speech recognition (ASR) algorithms allow us to interact with devices, appliances, and services using spoken language. Used in cloud services like Siri, Google Voice, and Amazon Echo, speech recognition is growing in popularity, which substantially increases the computational demand on the data center. We'll discuss the latest work by NVIDIA to accelerate the ASR pipeline, which includes a lattice-generating language model decoder, and explain how we're enabling online speech decoding across a range of NVIDIA GPUs.
 
Topics:
Speech & Language Processing, AI Application, Deployment & Inference
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9672
 
Abstract:
NVIDIA and Johns Hopkins University have partnered to accelerate speech recognition within the popular Kaldi framework. This framework is the de facto standard for transcribing recorded audio into written text. Early results have shown that NVIDIA GPUs can provide substantial speedups over pure CPU implementations. This talk will focus on the progress of this effort and the value that GPU acceleration adds to speech recognition.
 
Topics:
Artificial Intelligence and Deep Learning, Accelerated Data Science
Type:
Talk
Event:
GTC Washington D.C.
Year:
2018
Session ID:
DC8189
 
Abstract:
Using streams in CUDA is a fundamental optimization that many programmers overlook. This tutorial will teach you how to use streams in your application and cover the many mistakes that people make when using streams. After attending this talk you will be equipped with the necessary knowledge to use streams within your application. This talk is appropriate for all skill levels, whether you have never heard of streams or use them regularly.
 
Topics:
Programming Languages, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2014
Session ID:
S4158
 
Abstract:

The Mantevo performance project is a collection of self-contained proxy applications that illustrate the main performance characteristics of important algorithms. miniFE is intended to be an approximation to an unstructured implicit finite element or finite volume application. Our work investigated algorithms for assembling a matrix on the GPU. Parallelization algorithms using both 1 thread and 8 threads per element were investigated. Using these approaches, we achieved a significant speedup (over 60x for double precision) compared to the serial algorithm.
 
Topics:
Programming Languages
Type:
Talk
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2302
 
Abstract:

Starting with a background in C or C++, learn everything you need to know in order to start programming in CUDA C. Beginning with a "Hello, World" CUDA C program, explore parallel programming with CUDA through a number of hands-on code examples. Examine more deeply the various APIs available to CUDA applications and learn the best (and worst) ways in which to employ them in applications.

 
Topics:
Programming Languages
Type:
Tutorial
Event:
GTC Silicon Valley
Year:
2012
Session ID:
S2624
 
 