GTC ON-DEMAND

Abstract:
Learn how to scale distributed training of TensorFlow and PyTorch models with Horovod, a library designed to make distributed training fast and easy to use. Although frameworks like TensorFlow and PyTorch simplify the design and training of deep learning models, difficulties usually arise when scaling models to multiple GPUs in a server or multiple servers in a cluster. We'll explain the role of Horovod in taking a model designed on a single GPU and training it on a cluster of GPU servers.
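As a rough sketch of the pattern the abstract describes, the snippet below adapts a toy single-GPU PyTorch training loop to multiple GPUs using Horovod's public API. The model, data, and hyperparameters are placeholders for illustration, not material from the talk itself.

import torch
import torch.nn as nn
import horovod.torch as hvd

hvd.init()                                   # one process per GPU
if torch.cuda.is_available():
    torch.cuda.set_device(hvd.local_rank()) # pin each process to its GPU

model = nn.Linear(10, 1)                     # placeholder model
if torch.cuda.is_available():
    model.cuda()

# Scale the learning rate by the number of workers, a common Horovod convention.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged across workers with allreduce.
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

# Start every worker from identical weights and optimizer state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)

# The training loop itself is unchanged from the single-GPU version.
loss_fn = nn.MSELoss()
for step in range(10):
    x, y = torch.randn(32, 10), torch.randn(32, 1)  # placeholder data
    if torch.cuda.is_available():
        x, y = x.cuda(), y.cuda()
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

A script like this would be launched with Horovod's runner, e.g. horovodrun -np 4 python train.py, which starts one process per GPU; the same command scales to multiple servers by listing hosts.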
 
Topics: Deep Learning & AI Frameworks, AI & Deep Learning Research, HPC and AI
Type: Talk
Event: GTC Silicon Valley
Year: 2019
Session ID: S9321
 
 