GTC ON-DEMAND

 
Abstract:
We'll describe how large data scale (over two millennia of speech data per year) and low-latency requirements have enabled and required novel approaches to several speech and language models. Our talk will cover the GPU speech recognition training pipeline, continuous feedback-based training, optimizations for training, and inference on TensorRT for ultra-low-latency text-to-speech models for call centers. We will discuss accuracy and latency benchmarks for speech recognition on conversational speech, speech synthesis, data-driven dialogue systems, emotion recognition, and speech act classification. We'll also demonstrate our system running on a scaled simulated call center and show live speech recognition, synthesis, and language processing.
 
Topics:
Data Center & Cloud Infrastructure, AI in Healthcare, Medical Imaging & Radiology
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9776
 
Abstract:
Gridspace uses GPU-accelerated deep learning to analyze conversational speech on phone calls. We'll outline our DNN-based approach as well as several commercial applications of call grading. Our GPU-based software stack provides a novel way to process large-scale speech data. Results from a recent case study show automated call grading to be as accurate as human call grading and highly scalable in production. Deep call analysis with 100% coverage has never been achieved before. We'll also discuss how this system can be improved by training continuously without expert supervision.
 
Topics:
Finance, AI Startup, Artificial Intelligence and Deep Learning, Signal and Audio Processing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2017
Session ID:
S7360
 
Abstract:

Learn how to develop GPU-accelerated model combination for robust speech recognition and keyword search, built on (1) GPU-accelerated acoustic score computation for DNN and GMM models, (2) acoustic score-level combination using different combination techniques, and (3) efficient rescoring of hypotheses over hybrid architectures of GPUs and multicore CPUs. Results will be presented on the 2013 OpenKWS evaluation task, a challenging corpus for seeing how combination helps both the speech recognition task and the keyword search task.
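To make step (2) concrete, here is a minimal CPU-only sketch of two common ways to combine per-frame acoustic log scores from two models. The abstract does not say which combination techniques the talk uses, so the two shown here (log-linear and linear interpolation), the function names, and the `dnn_weight` parameter are illustrative assumptions, not the presenters' method.

```python
import math


def combine_log_linear(dnn_scores, gmm_scores, dnn_weight=0.5):
    """Weighted sum of log scores (log-linear interpolation).

    Each list holds one log score per acoustic state for a single frame.
    The names and the default weight are hypothetical.
    """
    if len(dnn_scores) != len(gmm_scores):
        raise ValueError("score lists must have the same length")
    w = dnn_weight
    return [w * d + (1.0 - w) * g for d, g in zip(dnn_scores, gmm_scores)]


def combine_linear(dnn_scores, gmm_scores, dnn_weight=0.5):
    """Weighted average in the probability domain, returned as log scores.

    Uses the log-sum-exp trick so very negative log scores do not underflow.
    """
    if len(dnn_scores) != len(gmm_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 < dnn_weight < 1.0:
        raise ValueError("dnn_weight must be strictly between 0 and 1")
    log_w = math.log(dnn_weight)
    log_1mw = math.log(1.0 - dnn_weight)
    combined = []
    for d, g in zip(dnn_scores, gmm_scores):
        a = log_w + d
        b = log_1mw + g
        m = max(a, b)
        combined.append(m + math.log(math.exp(a - m) + math.exp(b - m)))
    return combined
```

In a real system the weight would typically be tuned on held-out data, and the per-state scores would be computed in batches on the GPU rather than in Python lists.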

 
Topics:
Signal and Audio Processing, Defense, Artificial Intelligence and Deep Learning, Mobile Applications
Type:
Talk
Event:
GTC Silicon Valley
Year:
2014
Session ID:
S4533
 
 