GTC ON-DEMAND

 

Abstract:

Learn about the plans of market leaders in streaming VR and AR content from the cloud in this panel discussion. From enterprise use cases to streaming VR to the 5G edge, panelists will describe the state of the art and the challenges of making XR truly mobile.

Topics:
Virtual Reality & Augmented Reality
Type:
Panel
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9914
 
Abstract:

We'll examine the challenges telecommunications companies face in harvesting the considerable computational capacity of modern GPU architectures. One issue is that low-latency inference requires small batch sizes, which are inherently detrimental to Tensor Core performance. Another is efficient coefficient reuse, which demands very large matrix-matrix multiplications, whereas the feedforward DNNs typically used for telecommunications ML perform relatively small vector-matrix multiplications. We'll discuss our approach, which aims to deliver low latency with significantly higher performance by making better use of the computational capacity available in Tensor Cores.

Topics:
AI Application, Deployment & Inference, 5G & Edge, Performance Optimization
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9769
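The batching argument in the S9769 abstract can be made concrete with a small sketch. Each individual inference request is a vector-matrix product that reloads the entire weight matrix for only a couple of floating-point operations per coefficient, whereas grouping concurrent requests into one matrix-matrix product reuses every coefficient once per request, which is the shape of work that half-precision Tensor Core GEMMs are built for. The NumPy snippet below illustrates only that grouping step under assumed layer sizes; it is not the speakers' implementation, and a real deployment would run the batched product through cuBLAS or a similar Tensor Core path rather than NumPy.

```python
import numpy as np

# Hypothetical layer sizes for a small telecom feedforward DNN (illustrative only).
D_IN, D_OUT = 256, 256
BATCH = 64  # number of concurrent low-latency requests grouped together

rng = np.random.default_rng(0)
W = rng.standard_normal((D_OUT, D_IN)).astype(np.float32)   # weights, reused across requests
requests = [rng.standard_normal(D_IN).astype(np.float32) for _ in range(BATCH)]

# Per-request inference: BATCH separate vector-matrix products.
# Each one reads all D_OUT * D_IN weights to perform only ~2 flops per weight
# loaded, so the work is memory bound and leaves Tensor Cores underused.
per_request = [W @ x for x in requests]

# Batched inference: one matrix-matrix product over the whole group.
# The same weights now amortize over BATCH requests (~2 * BATCH flops per weight),
# which is the kind of large GEMM that half-precision Tensor Cores accelerate.
X = np.stack(requests, axis=1)           # shape (D_IN, BATCH)
batched = W @ X                          # shape (D_OUT, BATCH)

# Both paths compute the same results; only the grouping differs.
assert all(np.allclose(batched[:, i], per_request[i], atol=1e-4) for i in range(BATCH))
print("per-request and batched outputs match for", BATCH, "requests")
```

The tension the abstract highlights is that waiting to fill a large batch adds latency, so the practical question is how much grouping can be done while still meeting the latency budget.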
 
Abstract:
We describe the design of a scalable, CUDA-based service framework for ML model inference tasks that efficiently distributes such workloads across a cluster of dedicated GPU-based servers. These servers can also be easily integrated with existing telecom cloud infrastructure. In telecom data centres, ML models are increasingly being deployed for use cases such as automation, analytics and anomaly detection. Handling diverse data types and request intervals ranging from hours down to milliseconds can become a challenge in a legacy, CPU-dominated cloud environment.
 
Topics:
Artificial Intelligence and Deep Learning
Type:
Talk
Event:
GTC Europe
Year:
2018
Session ID:
E8421
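As a rough illustration of the dispatch pattern the E8421 abstract describes, the Python sketch below fans inference requests out to per-server queues and gathers the results. The server names, round-robin policy, and placeholder "model" are assumptions made for the example, not the framework presented in the session.

```python
import itertools
import queue
import threading

# Hypothetical pool of GPU-backed inference workers (names and scheduling policy
# are illustrative assumptions, not the framework described in the session).
GPU_SERVERS = ["gpu-node-0", "gpu-node-1", "gpu-node-2"]

def gpu_worker(name, in_q, out_q):
    """Drain one server's queue; a real worker would run a CUDA inference engine."""
    while True:
        item = in_q.get()
        if item is None:                      # shutdown sentinel
            break
        req_id, payload = item
        out_q.put((req_id, f"{name} scored {payload!r}"))  # placeholder for model output
        in_q.task_done()

in_queues = {name: queue.Queue() for name in GPU_SERVERS}
results = queue.Queue()
workers = [threading.Thread(target=gpu_worker, args=(n, q, results), daemon=True)
           for n, q in in_queues.items()]
for w in workers:
    w.start()

# Round-robin dispatch: requests may arrive milliseconds or hours apart,
# but the dispatcher only decides which server's queue receives each one.
rr = itertools.cycle(GPU_SERVERS)
incoming = [("kpi-anomaly", [0.1, 0.9]), ("traffic-forecast", [1.2, 3.4]),
            ("alarm-classify", [0.0, 0.7]), ("kpi-anomaly", [0.5, 0.5])]
for req_id, payload in incoming:
    in_queues[next(rr)].put((req_id, payload))

for q in in_queues.values():
    q.join()                                  # wait for that server's backlog
    q.put(None)                               # then stop its worker

while not results.empty():
    print(results.get())
```

In the framework the talk describes, the workers would be dedicated GPU servers reachable from the existing telecom cloud rather than local threads, but the queue-per-server fan-out and result collection shown here capture the basic shape of the design.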
 
 
  • Amazon Web Services
  • IBM
  • Cisco
  • Dell EMC
  • Hewlett Packard Enterprise
  • Inspur
  • Lenovo
  • SenseTime
  • Supermicro Computers
  • Synnex
  • Autodesk
  • HP
  • Linear Technology
  • MSI Computer Corp.
  • OPTIS
  • PNY
  • SK Hynix
  • vmware
  • Abaco Systems
  • Acceleware Ltd.
  • ASUSTeK COMPUTER INC
  • Cray Inc.
  • Exxact Corporation
  • Flanders - Belgium
  • Google Cloud
  • HTC VIVE
  • Liqid
  • MapD
  • Penguin Computing
  • SAP
  • Sugon
  • Twitter