GTC On-Demand

Deep Learning and AI
Opening Keynote
Jensen Huang (NVIDIA)
The 2018 GTC opening keynote is delivered by NVIDIA founder and CEO Jensen Huang, speaking on the future of computing.

Keywords:
Deep Learning and AI, GTC Silicon Valley 2018 - ID S8885
AI Application Deployment and Inference
Deep Learning Implementers Panel: Field Insights for Accelerating Deep Learning Performance, Productivity and Scale
Tony Paikeday (NVIDIA), Scott Stephenson (Deepgram), Arun Subramaniyan (Baker Hughes, a GE Company), Neil Tenenholtz (MGH and BWH Center for Clinical Data Science)
This customer panel brings together AI implementers who have deployed deep learning at scale using NVIDIA DGX Systems. We'll focus on the specific technical challenges we faced, solution design considerations, and best practices learned from implementing our respective solutions. Attendees will gain insights including: 1) how to set up your deep learning project for success by matching the right hardware and software platform options to your use case and operational needs; 2) how to design your architecture to overcome unnecessary bottlenecks that inhibit scalable training performance; and 3) how to build an end-to-end deep learning workflow that enables productive experimentation, training at scale, and model refinement.

 
Keywords:
AI Application Deployment and Inference, AI and DL Business Track (high level), Data Center and Cloud Infrastructure, AI for Business, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8194
 
Deploying Deep Neural Networks as a Service Using TensorRT and NVIDIA-Docker
Alec Gunny (NVIDIA), Prethvi Kashinkunti (NVIDIA)
Learn how you can use TensorRT and NVIDIA Docker to quickly configure and deploy a GPU-accelerated inference server and start gaining insights from your trained deep neural network (DNN) models. TensorRT is a high-performance tool for low-latency, high-throughput DNN inference. The latest release of TensorRT introduces a novel, framework-agnostic network definition format called the Universal Framework Format (UFF), which allows TensorRT to support and optimize DNN models trained in multiple deep learning frameworks. We'll leverage the TensorRT Python API to create a lightweight Python Flask application capable of serving multiple DNN models trained using TensorFlow, PyTorch, and Caffe, and also discuss how to containerize this inference service using NVIDIA Docker for ease of deployment at scale. This session will consist of a lecture, live demos, and detailed instructions.
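As a rough illustration of the serving pattern described above, here is a minimal sketch of a Flask application fronting GPU inference. The `TRTModel` wrapper is hypothetical; it stands in for whatever TensorRT engine loading and execution code the session demonstrates.

```python
# Minimal sketch of a Flask inference service. `TRTModel` is a
# hypothetical wrapper holding a TensorRT engine and exposing
# predict(ndarray) -> ndarray; swap in your own engine code.
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)
models = {}  # e.g. {"resnet50": TRTModel("resnet50.engine")}

@app.route("/v1/models/<name>:predict", methods=["POST"])
def predict(name):
    if name not in models:
        return jsonify(error="unknown model"), 404
    batch = np.asarray(request.get_json()["inputs"], dtype=np.float32)
    outputs = models[name].predict(batch)  # TensorRT execution happens here
    return jsonify(outputs=outputs.tolist())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```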
 
Keywords:
AI Application Deployment and Inference, Tools and Libraries, Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8495
 
Monte Carlo Methods and Neural Networks
Noah Gamboa (Stanford University)
The average human brain has about 100 billion nerve cells. We therefore investigate the question whether there are algorithms for artificial neural networks that are linear in the number of neurons, while the number of connections incident to a neuro ...Read More
The average human brain has about 100 billion nerve cells. We therefore investigate the question whether there are algorithms for artificial neural networks that are linear in the number of neurons, while the number of connections incident to a neuron is bounded by a constant. We offer two approaches to answer this question: First, we derive an algorithm that quantizes a trained artificial neural network such that the resulting complexity is linear. Second, we demonstrate that training networks, whose connections are determined by uniform sampling can achieve a similar precision as compared to using fully connected layers. Due to sparsity upfront, these networks can be trained much faster. Both approaches are made plausible by relating artificial neural units to Monte Carlo integration. We'll demonstrate the results for classic test datasets.  Back
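To make the second approach concrete, here is a small sketch (our illustration, not the speaker's code) of a linear layer whose incoming connections are chosen by uniform sampling, giving each output neuron a constant fan-in:

```python
# Sketch: a linear layer whose connections are fixed by uniform random
# sampling, so every output neuron keeps only `fan_in` incoming weights.
import torch
import torch.nn as nn

class UniformlySampledLinear(nn.Module):
    def __init__(self, in_features, out_features, fan_in=32):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        mask = torch.zeros(out_features, in_features)
        for row in range(out_features):
            keep = torch.randperm(in_features)[:fan_in]  # uniform sampling
            mask[row, keep] = 1.0
        self.register_buffer("mask", mask)  # connections stay fixed

    def forward(self, x):
        return nn.functional.linear(
            x, self.linear.weight * self.mask, self.linear.bias)
```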
 
Keywords:
AI Application Deployment and Inference, AI and DL Research, GTC Silicon Valley 2018 - ID S8780
 
AI Solutions and Use Cases Up Close (Presented by Inspur Systems)
Dolly Wu (Inspur)
Inspur has been deploying AI solutions with customers such as Microsoft, Alibaba, Baidu, and BMW for many years. We'll share AI use cases showing how we deploy AI at scale and take a close look at the technologies that enable these deployments.
 
Keywords:
AI Application Deployment and Inference, AI and DL Research, HPC and AI, GTC Silicon Valley 2018 - ID S8996
 
Putting AI to Work in an Enterprise: Deep Learning as a Service (Presented by IBM)
Nick Werstiuk (IBM)
Now that deep learning has moved out of the lab and into production, how do you provide training environments to all your internal customers working across business units with different requirements, without provisioning separate clusters? IBM has applied decades of HPC experience to build a production-ready deep learning stack, including servers accelerated with NVIDIA GPUs, workload and resource management software, and ready-to-use open source frameworks, all covered by IBM support. The solution provides a secure multi-tenant environment so multiple data scientists can share a common set of resources, eliminating silos, while running multiple instances of the same or different applications. The deep learning effort is enhanced with end-to-end pipeline support, from data ingestion and preparation through model training and tuning to inference. In this session, we'll explore what an enterprise deep learning environment looks like and provide insights into the unique IBM value for accelerating the use of deep learning across a wide variety of industries.
 
Keywords:
AI Application Deployment and Inference, GTC Silicon Valley 2018 - ID S81049
 
GPU-Powered Megacity Scale Transport Management, Municipal Services and Public Safety Solutions
Anton Nazarkin (VisionLabs)
Learn how VisionLabs GPU-powered solutions contribute to creating a safer, smarter megacity: a metropolitan area with a population in excess of ten million people. We'll dive deep into three implemented and ongoing large-scale smart-city projects, looking at the challenges, the technical specifics, and how GPU computing impacts each case: a face authentication-based immobilizer and driver monitoring system for municipal service vehicles, powered by the NVIDIA Jetson TX2 embedded platform; megacity-scale vehicle traffic analysis and anomaly detection, powered by NVIDIA Tesla P40 GPUs and handling over 80 million daily recognition requests; and a national-scale face identification platform for financial services with over 110 million faces in its database. The foundation of all these projects is VisionLabs LUNA, cross-platform object recognition software based on a proprietary deep neural network (DNN) inference framework. To build cost-effective solutions, VisionLabs uses its know-how in DNN quantization and acceleration. In terms of accuracy, VisionLabs is ranked among the top three in the world in the National Institute of Standards and Technology's Face Recognition Vendor Test and the University of Massachusetts LFW challenge.
 
Keywords:
AI Application Deployment and Inference, NVIDIA Inception Program, Intelligent Video Analytics and Smart Cities, Deep Learning and AI Frameworks, Computer Vision, GTC Silicon Valley 2018 - ID S8584
 
VACnet: Using Deep Learning to Combat Cheating in 'Counter-Strike: Global Offensive'
John McDonald (Valve)
We'll delve into the nuts and bolts of how Valve has utilized deep learning to combat cheating in "Counter-Strike: Global Offensive." We'll cover total system details, from the high-level server architecture to the low-level features fed into the AI. Deep learning has proven to be very effective at identifying cheating behavior without any client-side instrumentation, making it robust against malicious attack by cheaters and cheat vendors. By retraining regularly, the network continues to evolve, picking up new cheating behaviors within hours of their appearance. As a result of this approach, certain types of cheats have been reduced by a factor of 100.
 
Keywords:
AI Application Deployment and Inference, AI for Gaming, GTC Silicon Valley 2018 - ID S8732
 
Autoregressive Wavenet Inference on Volta GPUs
Brian Pharris (NVIDIA)
Autoregressive wavenets have demonstrated extremely high-quality real-time speech synthesis. However, their compute requirements and tight latency bounds have made them impractical to deploy on traditional CPU-only systems. In this talk we demonstrate that Volta GPUs provide excellent real-time inference performance on these networks, making practical deployments possible. We discuss several alternative implementation techniques and demonstrate their achieved performance on a V100 GPU.
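Why is this hard on CPUs? Generation is inherently serial: each audio sample requires a full network evaluation that depends on the previous samples. A schematic sketch (hypothetical `model` API, purely illustrative):

```python
# Schematic autoregressive generation loop. At 24 kHz output, each
# network evaluation has a budget of ~40 microseconds, which is what
# makes tight-latency deployment so demanding.
import numpy as np

def generate(model, seed, n_samples=24000):
    samples = list(seed)
    for _ in range(n_samples):  # one full forward pass per sample
        window = np.array(samples[-model.receptive_field:])
        samples.append(model.sample_next(window))  # hypothetical API
    return np.array(samples)
```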
 
Keywords:
AI Application Deployment and Inference, Speech and Language Processing, GTC Silicon Valley 2018 - ID S8968
 
Adopting Artificial Intelligence Technologies in Networking (Presented by Cisco)
Hugo Latapie (Cisco)
This talk will provide an overview of what is happening in the world of artificial intelligence as it relates to networking, IT infrastructure, and IoT technologies. We'll broadly cover AI topics ranging from machine learning and deep learning to symbolic AI. Applied AI, general AI, and their hybrids are all critical in solving many of today's complex long-tail problems in real time. Just as the capabilities, business opportunities, and positive benefits of AI are growing at a seemingly exponential rate, so are the security vulnerabilities, failure modes, and potential adverse business impacts. We'll discuss new hybrid neural-symbolic approaches that promise to address these issues while simultaneously opening the door to powerful systems that dynamically learn and reason at multiple levels of abstraction, from raw data to high-level symbolic reasoning. We'll cover use cases and solutions ranging from smart cities, transportation, and manufacturing to security and networking.
 
Keywords:
AI Application Deployment and Inference, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8971
 
Accelerating AI Adoption and Impact (Presented by Dell EMC)
Jay Boisseau (Dell)
Attendees will learn why AI techniques are so powerful, why developing and deploying optimal AI solutions is complex, why using AI techniques effectively is still difficult--and what Dell Technologies is doing to remove these difficulties and bring easier, effective AI to everyone. Dell Technologies includes seven companies with a comprehensive portfolio of technology products, services, and solutions for global industry, government, and education markets, and aims to be the leader in designing and delivering the best AI solutions for every customer, of every type and scale. From Dell Precision workstations for developers and Gateways for edge sensors, to Dell EMC GPU-optimized PowerEdge servers, Ready Solutions for Deep Learning, and hybrid cloud offerings, Dell is leveraging its leadership in technology and in enterprise relationships to design a world-class portfolio of AI solutions for diverse customer workloads, requirements, and objectives. This presentation will cover AI and deep learning in an enterprise context, including customer challenges and needs, and then discuss Dell's AI solutions and strategy to empower people to use AI rapidly and effectively.
 
Keywords:
AI Application Deployment and Inference, GTC Silicon Valley 2018 - ID S81046
 
space.ml: Artificial Intelligence Meets Data-Driven Astrophysics
Kevin Schawinski (ETH Zurich), Ce Zhang (ETH Zurich)
We'll present a suite of artificial intelligence applications and computation geared toward increasing our understanding of the universe. The intensive collaboration between astrophysics and computer science dates back to Jim Gray and Alex Szalay. Today, astrophysics continues to offer rich datasets that are ideal for exploration with the latest in AI and computer science in general. We'll present successful projects in our space.ml initiative that try to answer a range of fascinating astrophysics questions. We'll show how we can use generative adversarial networks to go slightly beyond the Nyquist resolution limit in images and to study the host galaxies of powerful quasars, how transfer learning can identify rare galaxy mergers, and how variational autoencoders can forward-model the processes in cosmology and galaxy evolution. We'll also illustrate how we can use GPUs for compressive sensing to better analyze data from radio arrays and to model the evolution of black holes over the age of the universe. Attendees will not only get our current answers to these questions but also get a taste of how AI is reshaping science today.
 
Keywords:
AI Application Deployment and Inference, Astronomy and Astrophysics, GTC Silicon Valley 2018 - ID S8667
 
Accelerate TensorFlow Inference with New TensorRT Integration
Julie Bernauer (NVIDIA)
TensorFlow is an open source software library for numerical computation using data flow graphs. NVIDIA TensorRT is an inference optimizer and runtime engine for production deployment. TensorRT provides optimizations for deep neural networks and uses reduced precision to increase throughput and reduce latency while maintaining accuracy. We've announced tighter TensorRT integration in TensorFlow, with new TensorFlow APIs, subgraph optimizations, and INT8 calibration that automatically leverage Tensor Cores on Volta GPUs. TensorRT delivers 2.5x faster inference throughput compared to inference without TensorRT. In this session, NVIDIA developers will use an example-based workflow to show how to use this new capability.
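For orientation, the integration announced at this GTC was exposed through TensorFlow 1.7's contrib module roughly as follows (the module has since moved between releases, so treat this as a sketch; the graph file and output names are placeholders):

```python
# Sketch of the TF-TRT workflow circa TensorFlow 1.7: load a frozen
# graph, then let TensorRT replace supported subgraphs with
# optimized ops. Names "model.pb" and "logits" are placeholders.
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

with tf.gfile.GFile("model.pb", "rb") as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["logits"],                 # output node names
    max_batch_size=8,
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")              # INT8 needs a calibration pass
```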
 
Keywords:
AI Application Deployment and Inference, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S81009
 
Deep Learning of Railway Track Faults using GPUs
Nathalie Rauschmayr (CSEM (Swiss Center for Electronics and Microtechnology))
Swiss Federal Railways (SBB) operates a diagnostic train fitted with multiple high-resolution cameras that capture images of the tracks, all while traveling at 75 mph. The current data processing software, running in real time on the train, produces too high a rate of false positives and negatives, to the extent that railway experts still need to go onto the track to physically inspect anomalies. This is not only dangerous, but sometimes impossible, and it requires a great deal of human labor. We describe how deep learning technologies have been developed to massively improve the automatic detection and classification of railway faults. This is not just a nice-to-have, but a must-have to ensure the safety of future rail transportation.
 
Keywords:
AI Application Deployment and Inference, Industrial Inspection, GTC Silicon Valley 2018 - ID S8944
 
IBM PowerAI: Realizing Business Value with Machine Learning (Presented by IBM)
Adel El-Hallak (IBM)
There is no shortage of hype around AI, but realizing value through machine and deep learning comes with its challenges. IBM PowerAI removes the inhibitors across each stage of a workflow, allowing enterprises to rapidly realize business value with AI.
 
Keywords:
AI Application Deployment and Inference, GTC Silicon Valley 2018 - ID S81048
 
NVIDIA GPU Video Technologies and Video Codec SDK: Updates and Roadmap
Abhijit Patait (NVIDIA)
NVIDIA's Video Codec SDK is a set of APIs for hardware-accelerated video encoding and decoding using NVIDIA GPUs. We'll provide an overview of the APIs, with particular emphasis on the latest features, such as FFmpeg support for NVIDIA-accelerated transcoding and quality and performance enhancements. We'll discuss strategies for efficient use of GPU video hardware acceleration in use cases such as video inferencing, transcoding, and media archiving.
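As a taste of the FFmpeg integration mentioned above, a transcode can be routed through the GPU's decoder (NVDEC/CUVID) and encoder (NVENC) with flags like the following; this assumes an FFmpeg build configured with NVIDIA codec support and is a sketch, not material from the session:

```python
# Invoke FFmpeg's GPU-accelerated decode (h264_cuvid) and encode
# (h264_nvenc) paths from Python; file names are placeholders.
import subprocess

subprocess.run([
    "ffmpeg",
    "-hwaccel", "cuvid", "-c:v", "h264_cuvid",  # decode on the GPU
    "-i", "input.mp4",
    "-c:v", "h264_nvenc", "-b:v", "5M",         # encode on the GPU
    "output.mp4",
], check=True)
```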
 
Keywords:
AI Application Deployment and Inference, Video and Image Processing, GTC Silicon Valley 2018 - ID S8601
 
Monitoring Honey Bee Health Using TensorRT and Microsoft Cognitive Toolkit
Jacqueline Cenci-McGrody (NVIDIA), Anusua Trivedi (Microsoft)
We'll take a deep dive into honey bee hive health monitoring with NVIDIA's Jetson TX2, TensorRT (a high-performance deep learning inference optimizer), Kinetica's insight engine running on DGX-1/DGX Station, and Microsoft Cognitive Toolkit to rapidly optimize, validate, and deploy trained neural networks for inference. In recent years, the media has reported that bees seem to be dying at an unprecedented rate. We'll explore how new accelerated analytics technologies and their corresponding compute platforms can deliver game-changing possibilities for innovation as we follow a honey bee farm scientist in California who agreed to field test this real-time monitoring solution with her beehives. See first-hand how adaptable and accessible these complex, cutting-edge technologies have become, and how we can use intelligent monitoring to help rescue the honey bee in this real-world environmental analytics opportunity.
 
Keywords:
AI Application Deployment and Inference, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8508
 
Practical Application of Deep Learning in Smart Factory: Visual Inspection System of Semiconductor Laser
Hiroyuki Kusaka (Fujikura Ltd)
Fujikura is pushing forward with implementation of the smart factory, using AI and IoT to improve productivity and production quality. We'll present a visual inspection system incorporating deep learning in the production process of semiconductor lasers. The system performs not only OK/NG classification but also classification of multiple NG modes. Inspection accuracy of 95%, equivalent to that of skilled workers, was achieved by optimizing the dataset and the hyperparameters of a CNN model, and the activation map was used for reliability and validity assurance. We'll present the difficulties of practical application in the manufacturing industry, such as small sample counts for some categories and a small defect-to-chip size ratio, and introduce our countermeasures.
 
Keywords:
AI Application Deployment and Inference, Industrial Inspection, GTC Silicon Valley 2018 - ID S8911
 
Deep Learning for Heliophysics
Mark Cheung (Lockheed Martin Solar & Astrophysics Laboratory)
NASA's heliophysics division operates a fleet of spacecraft, the so-called Heliophysics System Observatory, to monitor the Sun's activity and how its changes drive space weather in interplanetary space and in the near-Earth environment. We'll present case studies of how challenging problems in heliophysics can be tackled using deep learning: spectropolarimetric inversions for measuring the magnetic field on the solar surface, and mega-Kelvin thermometry of the Sun's corona by using a deep neural network to solve a compressed sensing problem. These low-cost solutions make possible new concepts for deep space missions for space weather monitoring. Some of the work in this presentation was made possible by NASA's Frontier Development Lab, a public-private partnership between the agency and industry partners (including the SETI Institute, NVIDIA, IBM, Intel, kx, and Lockheed Martin), whose mission is to use artificial intelligence to tackle problems related to planetary defense and heliophysics.
 
Keywords:
AI Application Deployment and Inference, Accelerated Analytics, Astronomy and Astrophysics, GTC Silicon Valley 2018 - ID S8222
 
Distributed and Scalable Video Analytics on Tegra X1/X2 Based Embedded Computer Cluster
Toygar Akgun (ASELSAN), Mehmet Fatih Karagoz (MIST ELEKTRONIK)
We'll present a wide-area city surveillance solution for running real-time video analytics on thousands of 1080p video streams. The system hardware is an embedded computer cluster based on NVIDIA Jetson TX1/TX2 and NXP i.MX6 modules. Custom system software manages job distribution, result collection, and system-wide diagnostics, including instantaneous voltage, power, and temperature readings. The system is fully integrated with custom video management software, IP cameras, and network video recorders. Instead of drawing algorithm results on the processed video frames, re-encoding them, and streaming them back to the operator computer for display, only the resulting metadata is sent to the operator computer. The video management software streams the video sources independently and synchronizes decoded video frames with the corresponding metadata locally before presenting the processed frames to the operator.

 
Keywords:
AI Application Deployment and Inference, Intelligent Video Analytics and Smart Cities, GTC Silicon Valley 2018 - ID S8409
 
How AI Technology Lifts the Ads Business in JD.com
Juyan Song (NVIDIA), YAN YAN (JD.com)
Deep learning and reinforcement learning are widely used in JD.com's ads products, e.g., the ranking model in recommender systems, the bidding model in the ad exchange business, and automatic ad review systems. These technologies have brought great benefits to JD.com, and all of them are built on NVIDIA GPUs.
 
Keywords:
AI Application Deployment and Inference, Consumer Engagement and Personalization, GTC Silicon Valley 2018 - ID S81016
 
A Map of Knowledge: Using Behavioral Data in Higher-Ed to Surface Novel Semantic Structure and Personalized Guidance
Zachary Pardos (UC Berkeley)
Personalized learning has been a promising but often elusive ideal in education. We'll demonstrate the progress made with two concrete examples of personalized learning supports implemented at scale: in a massive open online course (MOOC) and on the UC Berkeley campus in a collaboration with the Office of the Registrar. Both approaches employ long short-term memory networks to leverage a collaborative signal from millions of historic learner actions. In the case of the MOOC, the next page a learner is expected to spend considerable time on is predicted and offered as a real-time suggestion. At the university, we consider sequences of millions of historic enrollments over the past eight years. These sequences of course identifiers, when modeled with representation learning approaches most commonly applied to natural language, reveal a tremendous degree of semantic relational information about the courses, which can be visualized, reasoned about, and surfaced to students. Our course information platform uses this automatically inferred semantic information to help students navigate the university's offerings and provides personalized course suggestions based on topic preference.
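The "natural language" analogy is concrete enough to sketch: treat each student's chronological course history as a sentence of course IDs and train a word2vec-style model over it. A minimal illustration (our example, with made-up course IDs), using gensim 4:

```python
# Treat per-student enrollment sequences as "sentences" of course IDs
# and learn course embeddings with skip-gram word2vec (gensim >= 4).
from gensim.models import Word2Vec

enrollments = [                      # made-up illustrative sequences
    ["MATH1A", "CS61A", "CS61B", "CS70"],
    ["MATH1A", "STAT20", "DATA8", "CS61A"],
    ["DATA8", "STAT20", "STAT133"],
]
model = Word2Vec(enrollments, vector_size=64, window=5, min_count=1, sg=1)
print(model.wv.most_similar("CS61A"))  # semantically related courses
```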
 
Keywords:
AI Application Deployment and Inference, Consumer Engagement and Personalization, AI and DL Research, GTC Silicon Valley 2018 - ID S8597
 
Pioneering AI for All
Danny Lange (Unity)
Businesses of all sizes are increasingly recognizing the potential value of AI, but few are sure how to prepare for the transformational change it is sure to bring to their organizations. Danny Lange rolled out company-wide AI platforms at Uber and Amazon; now, through Unity Technologies, he's making AI available to the rest of us. He'll also share his thoughts on the most exciting advances that AI will bring over the next year. His insights will help you understand the true potential of AI, regardless of your role or industry.

 
Keywords:
AI Application Deployment and Inference, Advanced AI Learning Techniques (incl. GANs and NTMs), AI and DL Business Track (high level), AI for Business, GTC Silicon Valley 2018 - ID S8729
 
Low-Latency GPU Accelerated Inferencing with TensorRT
Prethvi Kashinkunti (NVIDIA)
Come learn how you can optimize the deployment of your trained neural networks using TensorRT, the GPU-accelerated inferencing library. TensorRT is a high-performance tool for low-latency, high-throughput deep neural network (DNN) inference that runs on NVIDIA GPUs. The latest release of TensorRT introduces a novel, framework-agnostic network definition format called the Universal Framework Format (UFF), allowing TensorRT to support and optimize DNN models trained in multiple deep learning frameworks like Caffe and TensorFlow. It also provides the capability to run inference at reduced precision, giving developers the ability to take advantage of new GPU hardware features like the Volta Tensor Core architecture. This session will be a combination of lecture and live demos.
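For flavor, importing a UFF model and building a reduced-precision engine looks roughly like this in later releases of the TensorRT Python API (exact calls vary by version; input/output names and shapes below are placeholders):

```python
# Sketch: parse a UFF model and build an FP16 TensorRT engine.
# "model.uff", "input", and "logits" are placeholder names.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network()
parser = trt.UffParser()
parser.register_input("input", (3, 224, 224))
parser.register_output("logits")
parser.parse("model.uff", network)

builder.max_batch_size = 8
builder.fp16_mode = True            # use Volta Tensor Cores via FP16
engine = builder.build_cuda_engine(network)
```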
 
Keywords:
AI Application Deployment and Inference, Tools and Libraries, Performance Optimization, Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8496
 
Intelligent Talent Management - AI Drives Transformation
Arjun Pratap (AVR EdGE Networks Pvt. Ltd.)
Artificial intelligence helps you hire faster and smarter. It also helps you determine your career path, learning, and development. Wondering how? AI platforms have a brain that reads, understands, and analyzes just as human beings do. They can read millions of resumes, job descriptions, career progressions, and pieces of learning content in a matter of seconds. This equips them with a network of skills, demographics, industries, occupations, and courses/certifications that acts as the central intelligence powering search-and-match algorithms to find accurate matches to job demands in a few seconds. The NLP layer helps understand intent; for example, it differentiates between 'worked with a PM' and 'worked as a PM' to determine that the former could work collaboratively and the latter could drive projects. AI platforms mimic a recruiter's or hiring manager's brain to find the right match: what takes HR 20 to 30 days is done in a few seconds. They also help HR leaders in workforce planning by forecasting which skills and domains to invest in, maintain, or upgrade, which could be a game changer, especially for people-centric organizations.
 
Keywords:
AI Application Deployment and Inference, Accelerated Analytics, AI and DL Research, AI and DL Business Track (high level), GTC Silicon Valley 2018 - ID S8303
 
Deploying, Profiling, and Optimizing Distributed TensorFlow in Production with GPUs
Chris Fregly (PipelineAI)
Using the latest advancements in TensorFlow, including the Accelerated Linear Algebra (XLA) framework, the JIT/AOT compiler, and the Graph Transform Tool, we'll demonstrate how to optimize, profile, and deploy TensorFlow models in GPU-based production environments. We'll cover many demos based on open source tools; you can completely reproduce all of them through Docker on your own GPU cluster. See http://pipeline.ai for links to the GitHub repo.
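As one example of the pieces named above, XLA JIT compilation could be switched on for a whole TF 1.x session like this (a minimal sketch, not PipelineAI's code):

```python
# Enable XLA just-in-time compilation globally for a TF 1.x session;
# XLA fuses ops into larger GPU kernels where it can.
import tensorflow as tf

config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = (
    tf.OptimizerOptions.ON_1)

with tf.Session(config=config) as sess:
    pass  # build and run the model under this session
```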
 
Keywords:
AI Application Deployment and Inference, NVIDIA Inception Program, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8621
 
Latest Tools and Techniques for Training and Deploying Deep Neural Networks in Educational Environments
Joseph Bungo (NVIDIA), Dmytro Lituiev (UC Berkeley and UCSF), Craig Morioka (UCLA)
Craig Morioka, UCLA adjunct associate professor of radiological sciences, and Dima Lituiev, postdoctoral scholar at the University of California, San Francisco, Institute for Computational Health Sciences, will discuss how they empower their fellow faculty, staff, and students with the latest techniques in training and deploying deep neural networks through NVIDIA's Deep Learning Institute (DLI) University Ambassador Program, a new AI and deep learning education enablement program for universities. This will include a dive into the benefits of an online learning platform that uses GPUs in the cloud, stepping through the DLI's online Image Segmentation and Radiomics labs. The Image Segmentation lab leverages an example from medical image analysis, where it is often important to separate pixels corresponding to different types of tissue or cells for the purposes of diagnostics and treatment planning. Dima uses image segmentation in his research to facilitate diagnostics of kidney rejection by analyzing histological slides from patients with kidney transplants. We'll explore how the TensorFlow code is structured and how the TensorBoard tool can be used to visualize the structure and training dynamics of segmentation models. The focus of the Radiomics lab is detection of the 1p19q co-deletion biomarker using deep learning, specifically convolutional neural networks built with the Keras and TensorFlow frameworks. Attendees will also learn how they can apply to become a DLI University Ambassador and bring the latest in deep learning and AI education to their academic communities.
 
Keywords:
AI Application Deployment and Inference, Deep Learning and AI Frameworks, AI and DL Business Track (high level), GTC Silicon Valley 2018 - ID S8823
 
Protecting Pulsed High-Power Lasers with Real-Time Image Classification
Jeffrey Kelling (Helmholtz-Zentrum Dresden - Rossendorf)
Learn how to combine computer vision techniques and deep learning to improve the sensitivity of a real-time, GPU-powered safety system. In petawatt laser systems firing at 10 Hz, suddenly appearing scatterers can damage components; the spread of damage can be avoided by suspending operation immediately when such an event occurs. We'll present our approach to automatically detecting critical failure states from intensity profiles of the laser beam. By combining quick feature detection with learned heuristics for feature classification, we accommodate both real-time constraints and the limited available training data. Localizing the triggering feature is crucial for cases where the problem lies in non-sensitive sections and will not be removed from the beam in production.
 
Keywords:
AI Application Deployment and Inference, Advanced AI Learning Techniques (incl. GANs and NTMs), Computer Vision, GTC Silicon Valley 2018 - ID S8330
 
Driver Drowsiness Detection for ADAS
Sidharth Varier (NVIDIA)
We'll present an in-car ADAS technology to detect drowsy driving. This technique can be used to alert and awaken the driver, or to take corrective actions if required. We employ a CNN-based approach, trained on a mix of synthetic and real images. We'll cover the details of the detection pipeline and the synthetic dataset generation, and show a demonstration of the detection system in action.
 
Keywords:
AI Application Deployment and Inference, Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8399
 
Deep Learning Demystified
William Ramey (NVIDIA)
What is deep learning? In what fields is it useful, and how does it relate to artificial intelligence? We'll discuss deep learning and why this powerful new technology is getting so much attention, learn how deep neural networks are trained to perform tasks with superhuman accuracy, and examine the challenges organizations face in adopting this new approach. We'll also cover some of the best practices, software, hardware, and training resources that many organizations are using to overcome these challenges and deliver breakthrough results.

 
Keywords:
AI Application Deployment and Inference, Deep Learning and AI Frameworks, Deep Learning and AI, GTC Silicon Valley 2018 - ID S8669
Streaming:
 
CatBoost: Fast Open-Source Gradient Boosting Library For GPU
Vasily Ershov (Yandex)
Learn how to use GPUs to accelerate gradient boosting on decision trees. We'll discuss the CUDA implementation of CatBoost, an open-source library that successfully handles categorical features and shows better quality than other open-source gradient boosted decision tree libraries. We'll give a brief overview of the problems that can be solved with CatBoost, then discuss the challenges and key optimizations in the most significant computation blocks: how to efficiently build histograms in shared memory to construct decision trees, and how to avoid atomic operations during this step. We'll present benchmarks showing that our GPU implementation is five to 40 times faster than Intel server CPUs, along with performance comparisons against GPU implementations of gradient boosting in other open-source libraries.
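Trying the GPU implementation from Python is a one-parameter change in recent CatBoost releases (a minimal sketch with placeholder data; the categorical column indices are illustrative):

```python
# Train a CatBoost classifier on GPU; cat_features marks which
# columns hold categorical values (indices here are illustrative).
from catboost import CatBoostClassifier

X_train = [["a", 1.0, "x"], ["b", 2.5, "y"], ["a", 0.3, "y"], ["b", 1.7, "x"]]
y_train = [0, 1, 0, 1]

model = CatBoostClassifier(iterations=100, task_type="GPU", devices="0",
                           verbose=False)
model.fit(X_train, y_train, cat_features=[0, 2])
print(model.predict([["a", 0.9, "x"]]))
```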
 
Keywords:
AI Application Deployment and Inference, Tools and Libraries, HPC and AI, GTC Silicon Valley 2018 - ID S8393
 
Leveraging GPUs for Bayesian Inference
Alec Gunny (NVIDIA), Alex Kozlov (NVIDIA)
We'll present results on speeding up Bayesian inference on the NVIDIA DGX-1 server for medical diagnostics. Bayesian inference is an AI technique for reasoning under uncertainty that is computationally and data intensive. We'll discuss the implications for both inference and training of Bayesian networks.
 
Keywords:
AI Application Deployment and Inference, Accelerated Analytics, GTC Silicon Valley 2018 - ID S8488
 
Prototyping Vision-Based Classifiers in Constrained Environments
Ted Hromadka (Integrity Applications Incorporated)
SOFWERX developed a vision-based classifier using commodity hardware and machine learning libraries to satisfy an urgent high-level requirement: tracking the usage of tank ammunition. The team had to address challenges involving unavailable training data, varying spatial orientations, and limited power budgets. To resolve them, SOFWERX generated an augmented dataset from synthetic models, implemented spatial transformers, and experimented with different hardware/software optimizations.
 
Keywords:
AI Application Deployment and Inference, Performance Optimization, GTC Silicon Valley 2018 - ID S8193
 
Enabling Deep Learning Applications in Radio Frequency Systems
John Ferguson (Deepwave Digital)
Artificial intelligence has made great strides in many technology sectors; however, it has yet to impact the design and applications of radio frequency (RF) and wireless systems. This is primarily due to the industry's preference for field programmable gate array (FPGA) systems. Conversely, the deep learning revolution has been fueled by GPUs and the ease with which they can be programmed for highly parallel computation. The next generation of RF and wireless technology will require wide-band systems capable of real-time machine learning with GPUs. Working with strategic partners, we've designed a software-configurable wide-band RF transceiver system capable of performing real-time signal processing and machine learning with a Jetson TX2. We'll discuss system performance, collection of RF training data, and the software the community can use to create custom applications. We'll also present data demonstrating applications in RF machine learning and deep learning.
 
Keywords:
AI Application Deployment and Inference, NVIDIA Inception Program, Cyber Security, IoT, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8375
 
Performance Optimization for Deep Image Matting in Photoshop
Christopher Hebert (NVIDIA), Betty Leong (Adobe Systems), Salil Tambe (Adobe Systems)
Learn how a research paper from Adobe Research Labs makes it into a real customer product like Photoshop. We tackled a number of challenging issues in applying the technology to real-world use cases, including large model size, heavy memory consumption, and slow runtime performance.
 
Keywords:
AI Application Deployment and Inference, GTC Silicon Valley 2018 - ID S8550
 
Optimizing NMT with TensorRT
Micah Villmow (NVIDIA)
OpenNMT is an open source neural machine translation and sequence modeling framework. Using Volta Tensor Cores and TensorRT, we're able to improve performance by 100 times over a CPU implementation. We'll discuss OpenNMT and how we implement it via TensorRT, and show how, by using our plugin interface and new TensorRT features, we're able to run this network at high performance.
 
Keywords:
AI Application Deployment and Inference, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8822
 
Breaking the Barriers to AI-Scale in the Enterprise
Charles Boyle (NVIDIA)
Organizations everywhere want to infuse AI into every aspect of their business, but they need a platform that delivers the scale and flexibility to fit both IT operational constraints and the workload performance demanded by data scientists. Attend this session to see the latest advancements in GPU server scaling and deep learning software, and hear how the latest solutions from NVIDIA solve your biggest AI platform challenges.
 
Keywords:
AI Application Deployment and Inference, Data Center and Cloud Infrastructure, AI and DL Research, GTC Silicon Valley 2018 - ID S8196
 
Continuous Delivery of AI Applications
Asif Khan (Amazon)
Deep learning systems are usually developed by data scientists, who are good at mathematics and computer science. But to deploy and operationalize these models for broader use, you need the devops mindset and tools. We'll show you how to connect the workflow between data scientists and devops, and explore basic continuous integration and delivery concepts and how they can be applied to deep learning models. Using a number of AWS services, we'll showcase how you can take the output of a deep learning model and deploy it to perform predictions in real time with low latency and high availability. In particular, we'll showcase the ease of deploying DL prediction functions using Apache MXNet (a deep learning library), Amazon ECS, Amazon S3, Amazon ECR, Amazon developer tools, and AWS CloudFormation.
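The serving half of such a pipeline can be quite small. A minimal sketch of an MXNet prediction function of the kind that would sit behind the deployed service (checkpoint prefix, epoch, and input shape are placeholders):

```python
# Load an MXNet checkpoint ("model-symbol.json" / "model-0000.params")
# and expose a predict function; shape and names are placeholders.
import mxnet as mx

sym, arg_params, aux_params = mx.model.load_checkpoint("model", 0)
mod = mx.mod.Module(sym, label_names=None, context=mx.gpu(0))
mod.bind(data_shapes=[("data", (1, 3, 224, 224))], for_training=False)
mod.set_params(arg_params, aux_params)

def predict(batch):
    """batch: numpy array shaped like the bound data_shapes."""
    mod.forward(mx.io.DataBatch([mx.nd.array(batch)]))
    return mod.get_outputs()[0].asnumpy()
```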
 
Keywords:
AI Application Deployment and Inference, GTC Silicon Valley 2018 - ID S8173
 
Defect Inspection from Scratch to Production
Kuan-Liang (Andrew) Liu (NVIDIA), Sheng-Ting Shen (NVIDIA)
To fulfill customers' requirements, companies have to guarantee the quality of delivered products, which can often be achieved only by manual inspection of the finished product. Since human-based defect inspection and classification are time-consuming and the results vary between individuals, automatic defect detection and classification has the potential to reduce the cost of quality assurance significantly. In this talk, we'll demonstrate how to use deep learning algorithms, i.e., fully convolutional networks, to build a general defect inspection and classification model. We'll also share experience on how to effectively collect labeled data, deal with imbalanced data, and optimize the model for latency and throughput with TensorRT before deploying it to the production line.
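On the imbalanced-data point: one standard remedy (our illustration; the talk does not prescribe a specific one) is to weight the loss inversely to class frequency, so rare defect modes are not drowned out by the dominant OK class:

```python
# Weight the cross-entropy loss by inverse class frequency; the
# class counts below are made up for illustration.
import torch
import torch.nn as nn

counts = torch.tensor([9000.0, 600.0, 250.0, 150.0])  # OK + 3 defect modes
weights = counts.sum() / (len(counts) * counts)        # inverse frequency
criterion = nn.CrossEntropyLoss(weight=weights)
```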
 
Keywords:
AI Application Deployment and Inference, Industrial Inspection, IoT, Robotics & Drones, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8682
 
Identifying Defect Patterns in Hard Disk Drive Magnetic Media Manufacturing Processes Using Real and Synthetic Data
Nicholas Propes (Seagate Technology)
Learn how synthetic data can be used to develop traditional and convolutional neural network (CNN) image segmentation models when labeled training data is limited. We'll describe hard drive media defect patterns and how they relate to problems in the manufacturing line, show why CNN models were chosen for some defect patterns, and explain how the CNN models were trained using both synthetic and real data. Different CNN architectures were explored, and their benefits and drawbacks are presented.
 
Keywords:
AI Application Deployment and Inference, Industrial Inspection, IoT, Robotics & Drones, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8415
 
Using AI for Interactive Applications
Ahmed Zakaria (Microsoft)
Machine learning has revolutionized many important fields, ranging from computer vision and natural language processing to healthcare and robotics. In this session, we'll discuss how developers can embrace machine learning methods for graphics and gaming. We'll cover both gaming use cases and general applications of machine learning, as well as how to best leverage recent GPU hardware for machine learning workloads.
 
Keywords:
AI Application Deployment and Inference, Graphics and AI, AI for Gaming, Rendering and Ray Tracing, GTC Silicon Valley 2018 - ID S8957
 
Anomaly Detection on Vehicle CAN BUS
Gorkem Batmaz (NVIDIA), Ildiko Pete (NVIDIA)
We'll discuss anomaly detection on the vehicle CAN bus. We developed a novel neural network solution for detecting anomalies in CAN data. Due to the inherent characteristics of controller area networks (CAN), such as the lack of authentication and a broadcast routing scheme, devices connected to a CAN network are exposed to a broad range of cyberattacks. Our work aims to alleviate this problem by providing an anomaly detection mechanism, that is, identifying deviations from normal network traffic, to enhance the security of CAN networks. This invention is leveraged as one of the intrusion detection methods in a broader NVIDIA embedded software security system deployed on automotive platforms. In this application, the embedded system is a car computer deployed in modern vehicles, such as an infotainment system, ADAS unit, dashboard, or head unit. The vulnerable endpoints are the peripherals connected to that computer: sensors, cameras, media devices, local- and wide-area communication interfaces (for example, WiFi, Bluetooth, cellular), and car-specific network interfaces and devices.
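The talk doesn't disclose its exact network, but a common shape for this kind of detector is an autoencoder over CAN frames that flags high reconstruction error; a minimal sketch under that assumption:

```python
# Sketch: autoencoder-style anomaly detector over fixed-size CAN
# frame features; architecture and threshold are assumptions.
import torch
import torch.nn as nn

class CanAutoencoder(nn.Module):
    def __init__(self, frame_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(frame_dim, 8), nn.ReLU(),
                                     nn.Linear(8, 4))
        self.decoder = nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                                     nn.Linear(8, frame_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def is_anomalous(model, frame, threshold=0.05):
    # Frames the model reconstructs poorly deviate from normal traffic.
    with torch.no_grad():
        err = nn.functional.mse_loss(model(frame), frame)
    return err.item() > threshold
```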
 
Keywords:
AI Application Deployment and Inference, Deep Learning and AI Frameworks, Cyber Security, Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8347
 
Highly-Efficient Caching with Tiling & Chaining in CNN
Yao Yao (NVIDIA)
Learn how to achieve a 100% read/write cache hit rate for most intermediate tensors in a CNN and save over 80% of typical DRAM traffic, even with a limited cache size and large tensors. The high-throughput NVIDIA Tensor Cores and DLA demand high memory traffic. Chaining consecutive CNN layers can save DRAM traffic by reusing intermediate tensors in cache, but this strategy is effective only with small tensors and a large cache. In this work, we slice tensors into small tiles (with halos) and chain these tiles so that the requirement for perfect caching can always be fulfilled. Our implementation of this approach has proven very effective at saving DRAM traffic, allowing us to solve the memory bandwidth problem of CNNs with a relatively small but high-bandwidth cache.
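A back-of-envelope calculation shows where a headline number of this magnitude plausibly comes from (illustrative values, not figures from the talk):

```python
# Without chaining, every intermediate tensor is written to DRAM and
# read back by the next layer; with tiled chaining only the chain's
# input and output touch DRAM, plus some halo re-reads.
layers = 8        # layers chained together (illustrative)
tensor_mb = 64    # size of each activation tensor (illustrative)
halo = 0.10       # assumed 10% extra traffic from tile halos

unchained = 2 * layers * tensor_mb       # write + read per boundary
chained = 2 * tensor_mb * (1 + halo)     # only input/output + halos
print(f"DRAM traffic saved: {1 - chained / unchained:.0%}")  # ~86%
```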
 
Keywords:
AI Application Deployment and Inference, Performance Optimization, GTC Silicon Valley 2018 - ID S8299
 
Scalable, Responsive, and Cost-Effective Object Detection Service for Web-Scale Images
Yan Wang (Microsoft)
We'll introduce how Bing built a scalable, responsive, and economical object detection API based on NVIDIA GPUs and the Azure cloud platform. Object detection is an important image understanding technique, serving as an entry point or dispatcher that guides users to more specific scenarios. However, it is very challenging to provide object detection services on web-scale images, because detection is intrinsically a compute-intensive, resource-demanding task. We'll also show how we used NVIDIA's CUDA profiling toolchain and cuDNN to make the system even more cost-effective. The system currently supports billion-level traffic, covering Bing's entire index.
 
Keywords:
AI Application Deployment and Inference, Performance Optimization, GTC Silicon Valley 2018 - ID S8620
 
Revisiting the TurboCharged Test Toolbox: VR, Robotics, and More DL
Martina Sourada (NVIDIA)
Last year, we began to see promising results from applying deep learning in an unexpected space: hardware QA. Fast-forward 365 days, and our efforts have been to expand on what we've learned and push the technology into other areas that will ultimately aid in our greatest challenge: testing at scale. In this session we'll highlight a new piece of the problem we are tackling: VR. We'll introduce methodologies for addressing the unique problems that VR testing presents, and showcase other test process areas where we are applying deep learning models to gain efficiency in our overall production pipeline. From using DL in our bug mining to create a quicker path from tester to developer and back, to analyzing end-user issues as a method for task automation, explore how we are enabling speed, accuracy, and efficiency.
 
Keywords:
AI Application Deployment and Inference, Virtual Reality and Augmented Reality, Tools and Libraries, Graphics and AI, AI for Gaming, GTC Silicon Valley 2018 - ID S8262
 
Building Seeing AI : The Talking Camera App for the Blind
Anirudh Koul (Microsoft)
We'll detail the journey of building Seeing AI, an app from Microsoft AI & Research that narrates the world around you. Designed for the blind and low-vision community, this research project harnesses the power of AI to describe people, text, and objects. Seeing AI leverages object classification, detection, image captioning, and more, with several models running on the device in real time at more than 15 frames per second. We'll go over the learnings, challenges, hits, and misses we encountered while developing the application.
 
Keywords:
AI Application Deployment and Inference, Computer Vision, GTC Silicon Valley 2018 - ID S8598
Streaming:
 
Deep Learning Infrastructure for Autonomous Vehicles
Pradeep Gupta (NVIDIA)
We'll introduce deep learning infrastructure for building and maintaining autonomous vehicles, including techniques for managing the lifecycle of deep learning models, from definition, training, and deployment to reloading and lifelong learning. The pipeline auto-curates and pre-labels data in the loop. Given data, it finds the best run-time-optimized deep learning models, and training scales with data size across multiple nodes. With these methodologies, one needs only data from the application to produce DL predictors for it. This infrastructure is divided into multiple tiers and is modular, with each module containerized for deployment on underlying infrastructure such as GPU-based clouds.
 
Keywords:
AI Application Deployment and Inference, Data Center and Cloud Infrastructure, Autonomous Vehicles, Autonomous Machines, GTC Silicon Valley 2018 - ID S8531
Streaming:
Download:
 
Deploying Machine Learning on the Oilfield: From the Labs to the Edge
Loryne Bissuel-Beauvais (Schneider Electric), Bartosz Boguslawski (Schneider Electric), Matthieu Boujonnier (Schneider Electric)
Deploying machine learning-based predictive models to the oil field is quite challenging: sites are remote, hazardous, and have spotty connectivity to the cloud. The world of operationalizing a model is very different from the perfect lab environment where models are born. We'll detail the requirements of our oil and gas customers and how we met those requirements so that we could deploy a new generation of analytics, with a complete software engineering discipline and mentality around it, by taking advantage of the Microsoft IoT Edge platform. This pilot project is currently under way and, thanks to the engineering principles in place, we are able to complete a loop from the field to the lab and back again.
 
Keywords:
AI Application Deployment and Inference, IoT, Robotics & Drones, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8714
Streaming:
Download:
 
Digital Twin for the Railway Network
Dattaraj Rao (General Electric)
We describe the concept of a digital twin for the railway network. Railroad customers across the world manage thousands of miles of track infrastructure, consisting of rails, ballast, ties, bridges, tunnels, wayside equipment, and more. This talk demonstrates a new approach to track infrastructure monitoring that GE is piloting for customers using the concept of a digital twin of the network. Using offline GPU infrastructure, deep learning models are created and trained on large volumes of video data to learn the state of healthy track and predict anomalies. During the talk, we'll show real customer use-case videos that demonstrate analytics on video from locomotive-mounted cameras, using deep learning models to calculate a health index and display it on a map to drive maintenance decisions.
 
Keywords:
AI Application Deployment and Inference, Computer Vision, GTC Silicon Valley 2018 - ID S8614
Streaming:
Download:
 
How Deep Learning Could Predict Weather Events
Sa-Kwang Song (Korea Institute of Science and Technology)
How do meteorologists predict weather or weather events such as hurricanes, typhoons, and heavy rain? Predicting weather events has traditionally been done with supercomputer (HPC) simulations using numerical models such as WRF, UM, and MPAS. Recently, however, deep learning-based research has been showing outstanding results of various kinds. We'll introduce several case studies related to meteorological research, describe how meteorological tasks differ from general deep learning tasks, and detail the approaches and input data involved, such as weather radar images and satellite images. We'll also cover typhoon detection and tracking, rainfall amount prediction, forecasting future cloud imagery, and more.
 
Keywords:
AI Application Deployment and Inference, Climate, Weather, Ocean Modeling, Computer Vision, HPC and AI, GTC Silicon Valley 2018 - ID S8816
Streaming:
Download:
 
Visual Search at eBay
Fan Yang (eBay)
We'll share information and lessons learned from developing a scalable visual search engine to handle a massive, volatile inventory like eBay's. We'll describe how eBay data is challenging for visual search, how to leverage a single deep neural network to perform multiple tasks efficiently, how to deploy our solution in a distributed cloud infrastructure, and which optimizations we have made to trade off relevance and latency. We'll give examples and insights to benefit computer vision practitioners in the industry who intend to build visual search engines from scratch.
 
Keywords:
AI Application Deployment and Inference, Data Center and Cloud Infrastructure, Computer Vision, GTC Silicon Valley 2018 - ID S8766
Streaming:
 
The Long Road to Model Deployment, or How to Make a Good Model Great!
Gregory Heinrich (NVIDIA)
In this talk, we'll cover the essential building blocks of the AI platform NVIDIA engineers are using to build a world-class automotive perception stack. Through a computer vision application example, we'll see how to improve a baseline model to produce better, faster predictions. The talk will focus on hyper-parameter optimization, model complexity reduction (pruning), target platform optimizations (TensorRT integration), and automation of complex workflows.
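A minimal sketch of magnitude-based weight pruning, one common instance of the model complexity reduction technique the talk names; the 50% sparsity target and layer shape are illustrative assumptions, not the speaker's actual recipe.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    threshold = np.partition(flat, k)[k]   # k-th smallest magnitude
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

w = np.random.randn(256, 128).astype(np.float32)
pruned, mask = prune_by_magnitude(w, sparsity=0.5)
print("sparsity:", 1.0 - mask.mean())  # ~0.5; fine-tune afterwards to recover accuracy
```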
 
Keywords:
AI Application Deployment and Inference, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8633
Streaming:
Download:
 
Containerizing Deep Learning with Singularity
Nishanth Dandapanthula (Dell EMC)
We'll talk about how to use Singularity to containerize deep learning applications, and provide compelling reasons to choose Singularity over Docker. We'll cover deep learning frameworks, including TensorFlow, NV-Caffe, MXNet, and others; present the current challenges and workarounds when using Singularity in an HPC cluster; and compare the performance of Singularity to bare-metal systems.
 
Keywords:
AI Application Deployment and Inference, HPC and AI, GTC Silicon Valley 2018 - ID S8368
Streaming:
Download:
 
ANI-AL: Universal Deep Learning Potentials for Organic Molecules and Materials
Justin Smith (University of Florida)
We'll introduce ANI-AL molecular potentials: deep learning-based potential energy functions for the fast and accurate prediction of quantum mechanical energies and forces of molecular systems. Thanks to GPU acceleration of training and inference, we successfully implement an automated sampling method that borrows techniques from active learning to automatically drive the systematic improvement of ANI-AL potentials. We'll also present results from applications of the ANI-AL potential to various problems in computational chemistry, such as molecular structure optimization, reaction path prediction, vibrational frequency calculation, and molecular dynamics simulations.
 
Keywords:
AI Application Deployment and Inference, Computational Biology and Chemistry, GTC Silicon Valley 2018 - ID S8827
Streaming:
 
Designing Large-Scale Machine Learning Systems with NVIDIA GPUs and Mellanox Interconnect
Gil Bloch (Mellanox Technologies)
Come join us and learn how to build a data-centric GPU cluster for artificial intelligence. Mellanox is a leader in high-performance, scalable, low-latency network interconnects for both InfiniBand and Ethernet. We'll present state-of-the-art techniques for distributed machine learning and discuss the special requirements they impose on the system. We'll follow with an overview of the interconnect technologies used to scale and accelerate distributed machine learning, including RDMA and NVIDIA's GPUDirect technology, with a special focus on the in-network computing SHARP technology used to accelerate large-scale deployments in artificial intelligence and high performance computing.
 
Keywords:
AI Application Deployment and Inference, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8635
Streaming:
 
Accelerate Your Kaldi Speech Pipeline on the GPU
Hugo Braun (NVIDIA)
Voice commands, and the advancements in automatic speech recognition algorithms that help us interact conversationally with devices, appliances, and services, are increasingly part of our everyday environment. We'll share highlights and results from work on scheduling optimizations in the Kaldi framework. The first part of the talk will describe results focused primarily on optimizing the DNN components of the speech pipeline. We'll then show results from a GPU-optimized fast lattice decoding algorithm that achieves high end-to-end throughput across the whole ASR pipeline, from the acoustic model to the language model.
 
Keywords:
AI Application Deployment and Inference, AI and DL Research, GTC Silicon Valley 2018 - ID S81034
Streaming:
Download:
 
Accelerating Large-Scale Video Surveillance for Smart Cities with TensorRT
Shounan An (SK Telecom)
We'll discuss a detailed scale-up method for accelerating a deep learning-based object detection inference engine with INT8 using NVIDIA's TensorRT. Converting convolutional neural networks (CNNs) from 32-bit floating-point arithmetic (FP32) to 8-bit integer (INT8) has previously been researched for classification tasks, but there is no solid work on accelerating CNN-based object detection tasks. We'll explain how to accelerate YOLO-v2, a state-of-the-art CNN-based object detector, with TensorRT using INT8. We improved the YOLO-v2 network to make it faster and more accurate for surveillance, and named our network SIDNet. We verified SIDNet on several benchmark object detection and intrusion detection datasets, confirming that SIDNet with INT8 has only a 1% accuracy drop compared with FP32 mode and is 5x faster than the original YOLO-v2 on an NVIDIA Tesla P40.
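For readers new to INT8 inference, here is a standalone numpy sketch of symmetric quantization with max-abs calibration, a conceptual stand-in only: TensorRT itself uses an entropy-based calibrator and fused INT8 kernels rather than this simple scheme.

```python
import numpy as np

def calibrate_scale(activations):
    """Pick a scale so the observed activation range maps onto [-127, 127]."""
    return np.abs(activations).max() / 127.0

def quantize(x, scale):
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

calib = np.random.randn(10000).astype(np.float32)   # calibration batch
scale = calibrate_scale(calib)
x = np.random.randn(8).astype(np.float32)
x_q = quantize(x, scale)                            # 4x smaller than FP32
print("max abs error:", np.abs(x - dequantize(x_q, scale)).max())
```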
 
Keywords:
AI Application Deployment and Inference, Telecom Industry Solutions, Deep Learning and AI Frameworks, Computer Vision, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8296
Streaming:
Download:
 
Applying AI to Simplify Support - Lessons Learnt
Satish Mandalika (Drishyam.ai)
We'll provide insights into how customer support built on the foundation of AI can help streamline support for large enterprises, especially manufacturers. With AI technologies like image recognition and natural language processing maturing, enterprises should strongly consider building an AI-based support platform, especially those with an omni-channel strategy. Delivering an amazing and differentiated user experience will lead to higher net promoter and customer satisfaction scores. By employing AI-based technologies, enterprises can reduce their contact volume, and consequently their costs. It will also help them sell more replacement parts online.
 
Keywords:
AI Application Deployment and Inference, NVIDIA Inception Program, Video and Image Processing, GTC Silicon Valley 2018 - ID S8517
Streaming:
Download:
 
Simulate and Validate your DNN Inference with CATIA before ADAS Industrial Deployment
Simon Berard (Dassault Systèmes), Cecile Doan (Dassault Systèmes)
One of the toughest aspects of a deep neural network is validating its behavior. Although actual driving with physical cars is needed to train the neural network, there is today no tool to appropriately prepare data acquisition campaigns or run stress validation before further on-road testing and industrial deployment. This talk will show how hardware- and software-in-the-loop simulation on 3DEXPERIENCE CATIA can now be extended to AI-in-the-loop, with the ability to run the full system engineering simulation with the actual neural network meant to run in the autonomous vehicle, accurately reproducing the neural network inference and checking overall vehicle behavior under various conditions. Every stage, from full 3D synthetic data ingest and real-time software simulation through actual hardware-in-the-loop validation, both leveraging TensorRT GPU inference, can now be consistently proofed for an appropriate in-depth understanding of the network's reactions before it drives on the road. A proof of concept showing TensorRT and DNN behavior validation will be presented in detail, opening new opportunities to validate GPU inference and also to compare the actual performance impact versus CPU.
 
Keywords:
AI Application Deployment and Inference, Product & Building Design, GTC Silicon Valley 2018 - ID S8748
Streaming:
Download:
 
Deep Learning for Industrial Inspection Analysis
Paul Baines (Wise.io from GE Digital)
We'll show how GE combines extensive domain knowledge with modern deep learning techniques to build intelligent pipeline inspection systems. GE builds a variety of industrial inspection equipment, from ultrasonic pipeline inspection gauges to large-scale CT scanners. As a historical producer of hardware, GE is now leading the transformation of the industrial space by building intelligent ecosystems around industrial equipment and processes. Challenges in this space include the esoteric, domain-specific nature of the data, as well as the risk-averse nature of the industry. However, by leveraging deep learning on large amounts of inspection data, we have built a production system that enhances the reliability and consistency of the inspection process.
 
Keywords:
AI Application Deployment and Inference, Industrial Inspection, GTC Silicon Valley 2018 - ID S8657
Streaming:
 
GBM Inferencing on GPU
Vinay Deshpande (NVIDIA), Shankara Rao Thejasw Nanditale (NVIDIA)
We'll present a novel GPU implementation for batched GBM inferencing, along with a detailed performance comparison of our implementation against state-of-the-art libraries such as XGBoost and Treelite on various real-world datasets.
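A minimal sketch of the operation being accelerated, batched inference over a GBM's trees; the flat array-of-nodes tree encoding here is an illustrative assumption, not the speakers' actual data layout.

```python
import numpy as np

class Tree:
    """One decision tree encoded as parallel arrays; left == -1 marks a leaf."""
    def __init__(self, feature, threshold, left, right, value):
        self.feature, self.threshold = feature, threshold
        self.left, self.right, self.value = left, right, value

    def predict(self, X):
        node = np.zeros(len(X), dtype=np.int64)
        active = self.left[node] != -1
        while active.any():  # walk all rows down the tree in lockstep
            f = self.feature[node[active]]
            go_left = X[active, f] < self.threshold[node[active]]
            node[active] = np.where(go_left,
                                    self.left[node[active]],
                                    self.right[node[active]])
            active = self.left[node] != -1
        return self.value[node]

def gbm_predict(trees, X, base=0.0):
    """Sum the per-tree contributions for a whole batch at once."""
    return base + sum(t.predict(X) for t in trees)

# A single depth-1 tree: predict 1.0 if feature 0 < 0.5, else 2.0.
t = Tree(feature=np.array([0, 0, 0]), threshold=np.array([0.5, 0.0, 0.0]),
         left=np.array([1, -1, -1]), right=np.array([2, -1, -1]),
         value=np.array([0.0, 1.0, 2.0]))
X = np.array([[0.2], [0.9]], dtype=np.float32)
print(gbm_predict([t], X))  # [1. 2.]
```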
 
Keywords:
AI Application Deployment and Inference, Accelerated Analytics, AI and DL Research, GTC Silicon Valley 2018 - ID S8873
Streaming:
Download:
 
How Microservices and Serverless Computing Enable the Next Generation of Machine Intelligence
Diego Oppenheimer (Algorithmia)
We'll discuss why AI and machine learning are a natural fit for serverless computing, and present a general architecture for scalable, serverless machine learning in production. We'll cover issues we encountered while implementing our own on-demand scaling over GPU clusters, show how these apply to more general solutions, and present one possible vision for the future of cloud-based machine learning.
 
Keywords:
AI Application Deployment and Inference, NVIDIA Inception Program, Accelerated Analytics, GTC Silicon Valley 2018 - ID S8900
Streaming:
Download:
AI and DL Business Track (high level)
Presentation
Media
AI for Social Good as an Innovation Driver
Ben Hamner (Kaggle), Catherine Ordun (Booz Allen Hamilton), Josh Sullivan (Booz Allen Hamilton), Richard Wender (American Cancer Society)
Innovation can take many forms and can be led by varying stakeholders across an organization. One successful model is using AI for social good to drive a proof of concept that advances a critical strategic goal. The Data Science Bowl (DSB), launched by Booz Allen Hamilton in 2014, is an ideal example: it galvanizes thousands of data scientists to participate in competitions that have far-reaching impact across key industries such as healthcare. This session will explore the DSB model, as well as other ways organizations are using AI for social good to create business and industry transformation.
 
Keywords:
AI and DL Business Track (high level), AI for Business, GTC Silicon Valley 2018 - ID S8953
Streaming:
Download:
 
Success in the Age of AI
Michael Sutcliff (Accenture)
From healthcare to financial services to retail, businesses are seeing unprecedented levels of efficiency and productivity, which will only continue to rise and transform how companies operate. This session will look at how Accenture as an enterprise is optimizing itself in the age of AI, as well as how it guides its customers to success, with a look at best practices, insights, and measurement to help the audience inform their AI roadmap and journey.
 
Keywords:
AI and DL Business Track (high level), AI for Business, GTC Silicon Valley 2018 - ID S8984
Streaming:
Download:
 
From Dark Matter Detection to Deep Learning in Enterprise
Scott Stephenson (Deepgram)
Advancements in deep learning are enabling enterprise companies to make meaningful impacts on bottom-line profits. Enterprises capture thousands of hours of customer phone call recordings per day. This voice data is extremely valuable because it contains insights the business can use to improve customer experience and operations. We'll follow Deepgram CEO Dr. Scott Stephenson's path from working in a particle physics lab two miles underground to founding a deep learning company for voice understanding, and describe applications of cutting-edge AI techniques that make enterprise voice datasets mineable for valuable business insights. Companies today use these insights to drive the bottom line.
 
Keywords:
AI and DL Business Track (high level), Telecom Industry Solutions, Speech and Language Processing, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8274
Streaming:
Download:
 
The Face Will State The Case
Lisa Hammitt (Beseeq), Rebecca Krauthamer (Beseeq)
We have all heard about Facial Expression and Recognition Systems (FERS) and emotion capture, but curiosity looms large. Is it training sets born of generative adversarial networks (GANs), along with GPU architectures, that will catapult this technology forward? To be sure; but something much deeper, a revolution within computer science programs in the schools, will accelerate its arrival in consumer platforms. It's called Social Signal Processing, and women technologists have a competitive advantage in inventing and enhancing the deep learning algorithms that will fuel it. Come listen to an industry veteran with 28 years in artificial intelligence, who drove Watson into consumer platforms, and a Stanford University graduate with solid research in Symbolic Systems, discuss their patent-pending technology in the exciting area of Social Signal Processing and FERS. Both are frequent speakers on the ethics of AI usage and will offer their thoughts on how this new class of technology offers a new deal for women to shape the future of AI.
 
Keywords:
AI and DL Business Track (high level), AI and DL Research, GTC Silicon Valley 2018 - ID S8939
Streaming:
 
Matching DS Organizational Maturity to DS Skills to Optimally Grow Your Team
Jesse Spencer-Smith (HCA Healthcare)
An organization's data science needs change dramatically as it moves through the stages of data science maturity: its ability to consume, adopt, and deploy advanced analytics solutions. Understanding your organization's maturity stage will help you choose projects that bring value, grow your ability to derive greater value in the future, and make good decisions when growing your data science team. A data scientist might be a journeyman model builder, a data scientist consultant, a software engineer, or a developer of new deep learning algorithms, and the data scientist who would be successful in a mature organization may well fail in an organization new to data science. Hiring and growing data scientists with skill sets in line with your data science maturity stage maximizes your probability of success. We'll discuss a framework to determine your level of data science readiness, explore a tool to assess the skill sets of data scientists, and find which skills can maximize your organization's probability of success at each stage.
 
Keywords:
AI and DL Business Track (high level), GTC Silicon Valley 2018 - ID S8954
Streaming:
Download:
 
Rapid Pace of Change and Industry Progress
John Abbott (451 Research), Nick Patience (451 Research)
We are still in the early stages of AI, yet its impact on industries is already significant, from healthcare to financial services to retail. Businesses are seeing unprecedented levels of efficiency and productivity, which will only continue to rise and transform how companies operate. This session will explore the progress of AI adoption over the last year, the industries that are leaping ahead, new AI innovations that will serve cross-industry concerns, and what businesses should expect in terms of adoption maturity in 2018.
 
Keywords:
AI and DL Business Track (high level), GTC Silicon Valley 2018 - ID S8952
Streaming:
Download:
 
Scaling AI POCs Across the Enterprise
Omar Dhalla (Element AI)
Has your team developed an AI proof of concept with promising metrics? The next step is to broaden its scope to impact larger areas of the enterprise. With its unique challenges and complexities, scaling POCs across multiple business units is a significant part of any company's AI roadmap. This session will look at best practices, insights, and successes, rooted in Element AI's experience with enterprise customers.
 
Keywords:
AI and DL Business Track (high level), NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8989
Streaming:
Download:
 
Real-Time Genetic Analysis Enabled by GPU
Wayne Thompson (SAS)
For enterprises daunted by the prospect of AI and investing in a new technology platform, the reality is that AI can leverage already-in-place big data and cloud strategies. This session will explore AI and deep learning use cases that are designed for ROI, and look at how success is being measured and optimized.
 
Keywords:
AI and DL Business Track (high level), AI for Business, GTC Silicon Valley 2018 - ID S8983
Streaming:
 
The Extreme Data Economy: How Businesses Thrive in the Post Big Data Era (Presented by Kinetica)
Daniel Raskin (Kinetica)
Get the latest information on how the proliferation of mobile, cloud, and IoT devices has brought us into a new era: the extreme data economy. There's a greater variety of data than ever before, and exponentially more of it, streaming in real time. Across industries, companies are turning data into an asset, above and beyond any product or service they offer. But unprecedented agility is required to keep business in motion and succeed in this post-big-data era. To enable this level of agility, companies are turning to instant insight engines, powered by thousands of advanced GPU cores, that bring unparalleled speed, streaming data analysis, visual foresight, and machine learning to break through old bottlenecks. Learn about new data-powered use cases you'll need to address, as well as advances in computing technology, particularly accelerated parallel computing, that will translate data into instant insight to power business in motion.
 
Keywords:
AI and DL Business Track (high level), NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8997
Streaming:
Download:
 
Create Customer Value with Google Cloud AI (Presented by Google)
Chris Kleban (Google Inc.)
In this session, you'll learn how Google Cloud helps enterprises make the most of data and deliver customer value. We'll provide an in-depth overview of the Cloud AI and data analytics offerings that help enterprises manage their ML lifecycle, from data ingestion to insights and prediction. We'll also demonstrate breakthrough solutions, like AutoML, that are making ML accessible to everyone.
 
Keywords:
AI and DL Business Track (high level), GTC Silicon Valley 2018 - ID S8976
Streaming:
 
Trends and Opportunities for ML and AI in Consumer Insights Industries
Paul Hendricks (NVIDIA), Eric Thorsen (NVIDIA)
We'll examine business value drivers for artificial intelligence and machine learning in the retail and consumer goods industries. Traditionally, traction in AI and ML has been in deep research, scientific, and technical communities. Retailers and consumer products companies are now finding great success applying AI and ML technology to distinct use cases and business challenges. Join us to hear project descriptions and customer examples where AI and ML can impact the business by increasing revenue, protecting margin, and improving consumer satisfaction.
 
Keywords:
AI and DL Business Track (high level), Virtual Reality and Augmented Reality, Consumer Engagement and Personalization, GTC Silicon Valley 2018 - ID S8131
Streaming:
Download:
 
Practical Use Cases Of AI and Deep Learning On GPUs In The Cloud For Marketing And Retail
Alexander Tsyplikhin (Data Monsters)
We'll review three practical use cases of applying AI and deep learning in the marketing and retail industries. For each use case, we'll cover the business situation, discuss potential approaches, and describe the final solution from both the AI and infrastructure points of view. Attendees will learn about applications of AI and deep learning in marketing and advertising; AI readiness criteria; selecting the right AI and deep learning methods, infrastructure, and GPUs for specific use cases; and avoiding potential risks.
 
Keywords:
AI and DL Business Track (high level), Predictive Analytics for Retail, Consumer Engagement and Personalization, GTC Silicon Valley 2018 - ID S8265
Streaming:
Download:
 
Earth Observation From Space: Deep Learning based Satellite Image Analysis
Patrick Helber (German Research Center for Artificial Intelligence)
Learn how recent advances in Earth observation are opening up an exciting new area for exploring satellite image data with deep learning. Focusing on real-world scenarios, we'll teach you how to analyze this exciting remote sensing data source with deep neural networks. Automated satellite image understanding is of high interest to various research fields and industry sectors, such as the insurance, agriculture, and investment industries. You'll learn how to apply deep neural networks in natural disaster situations and for the classification of land use, land cover, and building types.
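A minimal transfer-learning sketch for land-use classification of satellite image patches, in the spirit of this session; the ResNet-18 backbone, the 10-class setup (as in the EuroSAT benchmark), and all hyperparameters are illustrative assumptions, not the speaker's exact pipeline.

```python
import torch
import torch.nn as nn
import torchvision.models as models

num_classes = 10  # e.g., land-use/land-cover classes
model = models.resnet18(pretrained=True)          # ImageNet features transfer well
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch of RGB patches resized to 224x224.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print("loss:", loss.item())
```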
 
Keywords:
AI and DL Business Track (high level), GIS, AI and DL Research, GTC Silicon Valley 2018 - ID S81028
Download:
AI and DL Research
Presentation
Media
Training Neural Networks with Mixed Precision: Real Examples
Benjamin Barsdell (NVIDIA), Michael O'Connor (NVIDIA), Christian M. Sarofeen (NVIDIA)
We'll cover the techniques for training DNNs with Tensor Cores described in "S8923 - Training Neural Networks with Mixed Precision: Theory and Practice". These methods were introduced for AI processing with the Volta GPU architecture. Tensor Cores provide up to 120 teraflops of throughput, mixing operations on IEEE half- and single-precision floats. Techniques covered will include loss scaling, keeping a master copy of the weights in FP32, and choosing the proper precision for a given operation. For each of TensorFlow and PyTorch, we'll describe an FP32 network definition and then demonstrate the same network using mixed precision techniques.
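A minimal PyTorch sketch of two of the techniques named above, loss scaling and an FP32 master copy of the weights; the fixed scale of 1024, the toy model, and the manual update loop are illustrative assumptions (production code would use dynamic loss scaling), and it requires a CUDA GPU.

```python
import torch

model = torch.nn.Linear(512, 512).cuda().half()   # FP16 compute weights
master = [p.detach().clone().float() for p in model.parameters()]  # FP32 master copy
opt = torch.optim.SGD(master, lr=0.01)
scale = 1024.0

x = torch.randn(64, 512, device="cuda", dtype=torch.float16)
y = torch.randn(64, 512, device="cuda", dtype=torch.float16)

loss = torch.nn.functional.mse_loss(model(x).float(), y.float())
(loss * scale).backward()   # scale up so small FP16 gradients don't flush to zero

for p, m in zip(model.parameters(), master):
    m.grad = p.grad.float() / scale   # unscale in FP32
    p.grad = None
opt.step()                            # update the FP32 master weights
for p, m in zip(model.parameters(), master):
    p.data.copy_(m.data)              # copy back to FP16 for the next step

print("loss:", loss.item())
```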
 
Keywords:
AI and DL Research, Algorithms and Numerical Techniques, GTC Silicon Valley 2018 - ID S81012
Streaming:
Download:
 
Supporting DGX in Air-Gapped Production Environments
Sumit Kumar (NVIDIA), Jeffrey Weiss (NVIDIA)
This tutorial will cover the issues encountered when deploying NVIDIA DGX-1/DGX Station into secure environments. For security reasons, some installations require that systems be isolated from the internet and outside networks. Since most DGX-1 software updates are accomplished through an over-the-network process with NVIDIA servers, this session will walk participants through how updates can be made by maintaining an intermediary server. The session will combine lecture and live demos with detailed instructions.
 
Keywords:
AI and DL Research, Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8568
Streaming:
 
Scaling Machine Learning through Decentralization, Quantization, and Structured Sparsity
Dan Alistarh (IST Austria), Ce Zhang (ETH Zurich)
In this session, participants will get a taste of state-of-the-art techniques for scaling deep learning on GPU clusters. We present SuperML, a general and efficient communication layer for machine learning that can scale neural network training to hundreds of GPU nodes. SuperML builds on three main ideas: decentralization, which allows algorithms to converge without a centralized coordinator (parameter server) or all-to-all communication; communication quantization, which significantly speeds up point-to-point messaging; and structured sparsity, by which SuperML induces model updates with only a limited number of non-zero entries. From the technical perspective, SuperML provides a new implementation of the classic MPI standard, redesigned and reimplemented to provide efficient support for quantization and sparsity. We illustrate the performance characteristics of SuperML on CSCS Piz Daint, Europe's most powerful supercomputer, and on Amazon EC2, improving upon other highly optimized implementations such as Cray MPI and NVIDIA NCCL.
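A minimal numpy sketch of stochastic gradient quantization in the spirit of QSGD (Alistarh et al.), one plausible instance of the communication quantization idea; SuperML's actual scheme may differ.

```python
import numpy as np

def quantize_qsgd(g, levels=4, rng=np.random):
    """Stochastically quantize gradient g to `levels` uniform levels per sign."""
    norm = np.linalg.norm(g)
    if norm == 0:
        return np.zeros_like(g, dtype=np.int8), 0.0
    scaled = np.abs(g) / norm * levels        # in [0, levels]
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part (unbiased).
    q = lower + (rng.random(g.shape) < (scaled - lower))
    return (np.sign(g) * q).astype(np.int8), norm

def dequantize_qsgd(q, norm, levels=4):
    return q.astype(np.float32) * (norm / levels)

g = np.random.randn(1000).astype(np.float32)
q, norm = quantize_qsgd(g)             # send int8 values + one float over the wire
g_hat = dequantize_qsgd(q, norm)
print("relative error:", np.linalg.norm(g - g_hat) / np.linalg.norm(g))
```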
 
Keywords:
AI and DL Research, Accelerated Analytics, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8668
Streaming:
Download:
 
Designing Wireless Systems with Deep Learning - An Autoencoder-Based Approach to PHY Layer Design
Ben Hilburn (DeepSig Inc.), Tim O'Shea (DeepSig Inc.)
The field of wireless engineering is on the cusp of a revolution, driven by deep learning, that will define the next paradigm in wireless system design. While wireless communications technology has advanced considerably since its invention in the 1890s, the fundamental design methodology has remained unchanged throughout its history: expert engineers hand-designing radio systems for specific applications. Deep learning enables a new, radically different approach, where systems are learned from wireless channel data. As the world becomes more connected and the Internet of Things becomes a reality, it is difficult to overstate the enormity of the impact on both commercial and military systems. This talk will provide a high-level overview of deep learning applied to wireless communications, discuss the current state of the technology and research, and present a vision for the future of wireless engineering.
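A minimal PyTorch sketch of the autoencoder-over-a-channel idea behind learned PHY design: an encoder maps messages to channel symbols, an AWGN channel corrupts them, and a decoder recovers the message. The 16-message alphabet, 8 channel uses, and noise level are illustrative assumptions, not DeepSig's actual system.

```python
import torch
import torch.nn as nn

M, n = 16, 8  # 16 possible messages, 8 real-valued channel uses
encoder = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, n))
decoder = nn.Sequential(nn.Linear(n, 32), nn.ReLU(), nn.Linear(32, M))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(2000):
    msgs = torch.randint(0, M, (256,))
    x = encoder(nn.functional.one_hot(msgs, M).float())
    x = x / x.norm(dim=1, keepdim=True) * n**0.5   # average power constraint
    y = x + 0.3 * torch.randn_like(x)              # AWGN channel
    loss = nn.functional.cross_entropy(decoder(y), msgs)
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    msgs = torch.randint(0, M, (10000,))
    y = encoder(nn.functional.one_hot(msgs, M).float())
    y = y / y.norm(dim=1, keepdim=True) * n**0.5 + 0.3 * torch.randn_like(y)
    err = (decoder(y).argmax(dim=1) != msgs).float().mean()
print("block error rate:", err.item())
```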
 
Keywords:
AI and DL Research, Telecom Industry Solutions, GTC Silicon Valley 2018 - ID S8791
Streaming:
 
Domain Adaptation Using Adversarial Training for Semantic Segmentation and Caption Style Transfer
Min Sun (National Tsing Hua University)
We'll introduce the basic concept of domain adaptation and how to use adversarial training to achieve unsupervised domain adaptation. We'll then describe how the technique is used in two tasks: improving semantic segmentation across cities, and transferring language style for image captioning. In particular, we combine domain adaptation with a policy gradient-based reinforcement learning approach to transfer language style. The details and results of both tasks were published at ICCV 2017.
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8200
Streaming:
 
Deep Learning Applications for Radio Frequency (RF) Data
Adam Thompson (NVIDIA)
We'll discuss applications of deep learning to radio frequency (RF) data, including classification of specific signals and digital modulation schemes, identification of nefarious activities, and a general overview of the unique challenges and solutions for AI in this domain. With the ubiquity of RF communication signals in our lives, deep learning can be leveraged to ensure accurate signal transmission and safer communities.
 
Keywords:
AI and DL Research, Computational Physics, GTC Silicon Valley 2018 - ID S8826
Streaming:
Download:
 
Simultaneous Pixel-Localization and Feature Extraction for Multiple Instances in a Scene
Timothy Klein (Arete Associates)
We'll introduce attendees to a new deep learning approach to object localization. Instead of bounding boxes, our network estimates the center pixel locations for a variable number of targets in a scene while simultaneously extracting a characteristic feature set. We'll outline the overall approach and describe the underlying network architecture and training. We'll also present the results of our network as applied to the Cars Overhead With Context dataset and discuss the current and future possibilities of this approach.
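A minimal sketch of recovering a variable number of center-pixel detections from a predicted heatmap, one common way to realize bounding-box-free localization of this kind; the threshold and 3x3 non-maximum suppression are assumptions about the output head, not the authors' exact method.

```python
import numpy as np

def extract_centers(heatmap, threshold=0.5):
    """Return (row, col) peaks that are 3x3-local maxima above threshold."""
    H, W = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Stack the 8 neighbors of every pixel and keep strict local maxima.
    neighbors = np.stack([padded[1 + di:H + 1 + di, 1 + dj:W + 1 + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0)])
    is_peak = (heatmap > neighbors.max(axis=0)) & (heatmap > threshold)
    return list(zip(*np.nonzero(is_peak)))

hm = np.zeros((64, 64))
hm[10, 20] = 0.9
hm[40, 40] = 0.8
print(extract_centers(hm))  # peaks at (10, 20) and (40, 40)
```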
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8191
Streaming:
 
Inside NVIDIA GPU Cloud Deep Learning Framework Containers
John Barco (NVIDIA), Christopher Lamb (NVIDIA)
In this technical deep dive, get an in-depth look at the deep learning containers on NVIDIA GPU Cloud (NGC) and learn how they can simplify your AI projects. NVIDIA pre-integrates and optimizes the top deep learning frameworks, such as TensorFlow, PyTorch, and MXNet, and makes them available on NVIDIA GPU Cloud, removing time-consuming do-it-yourself software integration. We'll look at the NVIDIA framework optimizations, such as reducing GPU memory overhead, improving multi-GPU scaling, and reducing latency. We'll also talk about the integration of runtimes and drivers in the containers to ensure the correct versions of software work together for peak performance. You'll leave with an understanding of what makes an NVIDIA GPU-optimized deep learning container tick.
 
Keywords:
AI and DL Research, Deep Learning and AI Frameworks, Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8497
Streaming:
Download:
 
Matchbox: Automatic Batching for Dynamic Deep Learning
James Bradbury (Salesforce)
Matchbox is an open source PyTorch-based tool that lets users implement their deep learning models as imperative code that applies to individual data samples, then efficiently train and validate them on batched data using GPUs. By automatically keeping track of batch-level masking and padding and rewriting data-dependent control flow, Matchbox simplifies model code, eliminates a class of implementation bugs, and allows programmers to work directly at a more natural level of abstraction.
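A minimal PyTorch sketch of the manual padding and masking bookkeeping that Matchbox automates; the mean-pooling model is an illustrative assumption.

```python
import torch

seqs = [torch.randn(n, 8) for n in (3, 5, 2)]  # variable-length samples
T = max(s.size(0) for s in seqs)
batch = torch.zeros(len(seqs), T, 8)
mask = torch.zeros(len(seqs), T)
for i, s in enumerate(seqs):                   # pad to a common length
    batch[i, :s.size(0)] = s
    mask[i, :s.size(0)] = 1.0

# Masked mean over time: padding positions must not affect the result.
summed = (batch * mask.unsqueeze(-1)).sum(dim=1)
pooled = summed / mask.sum(dim=1, keepdim=True)
print(pooled.shape)  # torch.Size([3, 8])
```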
 
Keywords:
AI and DL Research, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8977
Streaming:
Download:
 
Tackling the Crowded Radio Frequency Spectrum Using Deep Learning
Krishna Karra (KickView Corporation)
We'll introduce new concepts and algorithms that apply deep learning to radio frequency (RF) data to advance the state of the art in signal processing and digital communications. With the ubiquity of wireless devices, the crowded RF spectrum poses challenges for cognitive radio and spectral monitoring applications. Furthermore, the RF modality presents unique processing challenges due to its complex-valued data representation, large data rates, and unique temporal structure. We'll present innovative deep learning architectures that address these challenges, informed by the latest academic research and our extensive experience building RF processing solutions. We'll also outline various strategies for pre-processing RF data to create feature-rich representations that can significantly improve the performance of deep learning approaches in this domain, and discuss use cases for RF processing engines powered by deep learning with direct applications to telecommunications, spectral monitoring, and the Internet of Things.
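A minimal sketch of one common RF pre-processing step of the kind alluded to above: turning complex IQ samples into a 2-channel real tensor a CNN can consume; the windowing into fixed-length frames is an illustrative assumption.

```python
import numpy as np

iq = (np.random.randn(4096) + 1j * np.random.randn(4096)).astype(np.complex64)
frame_len = 128
n = len(iq) // frame_len
frames = iq[:n * frame_len].reshape(n, frame_len)
# Shape (n, 2, frame_len): channel 0 = in-phase, channel 1 = quadrature.
x = np.stack([frames.real, frames.imag], axis=1).astype(np.float32)
print(x.shape)  # (32, 2, 128)
```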
 
Keywords:
AI and DL Research, Telecom Industry Solutions, Federal, GTC Silicon Valley 2018 - ID S8267
Streaming:
Download:
 
Point Cloud Deep Learning
Innfarn Yoo (NVIDIA)
This presentation shows in-depth comparisons of several neural network models for 3D object classification. Object classification from 2D images has been studied thoroughly and widely adopted over the last few years, following the advances of deep neural networks; 3D object classification methods have since been actively studied, yet are not completely mature. The point cloud is the most basic format for 3D objects. In this work, we present several neural network models that can learn from 3D point clouds, including models that learn directly from the point cloud, from projected 2D pixels, and from voxelated volumes. This work uses the Princeton ModelNet datasets and the ShapeNetCore.v2 dataset, and provides comparisons of those neural network models.
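A minimal sketch of voxelating a point cloud into an occupancy grid, one of the input representations compared above; the 32^3 resolution and the random stand-in cloud are illustrative assumptions.

```python
import numpy as np

def voxelize(points, resolution=32):
    """Map an (N, 3) point cloud to a binary occupancy grid."""
    mins = points.min(axis=0)
    extent = points.max(axis=0) - mins
    scale = (resolution - 1) / max(extent.max(), 1e-9)  # preserve aspect ratio
    idx = np.floor((points - mins) * scale).astype(np.int64)
    grid = np.zeros((resolution,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

cloud = np.random.rand(2048, 3).astype(np.float32)  # stand-in for a ModelNet shape
grid = voxelize(cloud)
print(grid.shape, int(grid.sum()), "occupied voxels")
```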
 
Keywords:
AI and DL Research, Graphics and AI, Rendering and Ray Tracing, Real-Time Graphics, GTC Silicon Valley 2018 - ID S8453
Streaming:
Download:
 
GUNREAL: GPU-Accelerated Unsupervised Reinforcement and Auxiliary Learning
Koichi Shirahata (Fujitsu Laboratories Ltd.)
We'll introduce a GPU-accelerated unsupervised reinforcement and auxiliary learning (UNREAL) algorithm. Recent state-of-the-art deep reinforcement learning algorithms, such as A3C and UNREAL, are designed to train on a single device with only CPUs. Using GPU acceleration for these algorithms results in low GPU utilization, which means the full performance of the GPU is not reached. Motivated by the architecture changes made by the GA3C algorithm, which gave A3C better GPU acceleration, together with the high learning efficiency of the UNREAL algorithm, we extend GA3C with the auxiliary tasks from UNREAL to create GUNREAL. We show that our GUNREAL system finishes training faster than UNREAL and reaches higher scores than GA3C.
 
Keywords:
AI and DL Research, Performance Optimization, GTC Silicon Valley 2018 - ID S8219
Streaming:
Download:
 
Large-Scale Self-Supervised Robot Learning with GPU-Enabled Video-Prediction Models
Frederik Ebert (UC Berkeley)
To acquire rich repertoires of skills, robots must be able to learn from their own autonomously collected data. We'll describe a video-prediction model that predicts what a robot will see next, and show how this model can be used to solve complex manipulation tasks in real-world settings. Our model was trained on 44,000 video sequences in which the manipulator autonomously pushes various objects. Using the model, the robot is capable of moving objects that were not seen during training to desired locations, handling multiple objects, and pushing objects around obstructions. Unlike other methods in robotic learning, video prediction does not require any human labels. Our experiments show that the method achieves a significant advance in the range and complexity of skills that can be performed entirely with self-supervised robotic learning. This session is for attendees who possess a basic understanding of convolutional and recurrent neural networks.
 
Keywords:
AI and DL Research, IoT, Robotics & Drones, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8629
Streaming:
Download:
 
Deep Generative Modeling for Speech Synthesis and Sensor Data Augmentation
Praveen Narayanan (Ford Motor Company)
We'll discuss how deep generative modeling can be used in two application domains: speech synthesis and sensor data modeling. We'll give an overview of what generative modeling is and how it can be used for practical AI tasks through these examples. We'll also give a flavor of latent space methods, which we can use to learn more about our data so as to transform it in meaningful ways, with uses in both reconstruction and generation.
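A minimal sketch of a variational autoencoder's latent space machinery (reparameterization and an ELBO-style loss), one standard deep generative modeling approach of the kind surveyed here; the toy sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, d_in=64, d_z=8):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_z)   # outputs mean and log-variance
        self.dec = nn.Linear(d_z, d_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return self.dec(z), mu, logvar

vae = TinyVAE()
x = torch.randn(32, 64)
recon, mu, logvar = vae(x)
recon_loss = nn.functional.mse_loss(recon, x)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon_loss + kl   # reconstruction + KL; sample z from the prior to generate
print("loss:", loss.item())
```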
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8617
Streaming:
Download:
 
New Applications of Deep Learning in Dialogue Generation and Question Answering
Mithun Das Gupta (Microsoft)
Current-generation AI systems are mostly moving towards dialogue generation and question answering. Human-like conversation and dialogue-based interaction have been proposed as the interface for tomorrow, which would obliterate keyboards and trackpads from computers as we know them. We present two important current developments in these fields. First, we'll talk about a neural dialogue generation system that can be deployed to engage humans in a multi-turn conversation. Next, we'll talk about a segmented question answering module that can find answers from the web. The combination of these two techniques has the potential to unlock numerous new verticals, such as travel and retail. We'll cover the technical details as well as the higher-level design choices.
 
Keywords:
AI and DL Research, Speech and Language Processing, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8151
Streaming:
 
Object-Level Deep Reinforcement Learning
William Agnew (University of Washington)
We'll show how deep reinforcement learning can be greatly sped up by separating perception and action, with a reward function specified in terms of objects and their motions, which are supplied by the perceptual system. In the past five years, reinforcement learners have become vastly more powerful by incorporating deep learning techniques, playing Atari, Mario, Go, and other games with superhuman skill. However, these learners require vast amounts of training data to become skilled. For example, to master Pong, state-of-the-art reinforcement learners require tens of millions of game frames, equivalent to months of play time at human speed. We show that endowing the learner with a minimal perceptual system, capable of detecting and tracking objects, greatly reduces the number of frames needed for learning. This shifts the learning bottleneck from the amount of training data available to computations easily accelerated with GPUs.
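A minimal sketch of the object-level idea: instead of learning from raw pixels, the agent sees object positions and velocities from a tracker and learns over that compact state. The Pong-like paddle features and the tabular Q-update are illustrative assumptions, not the authors' system.

```python
import numpy as np

def object_state(ball_y, ball_vy, paddle_y, bins=12):
    """Discretize tracked object features into a small table index."""
    b = lambda v: int(np.clip((v + 1) / 2 * (bins - 1), 0, bins - 1))
    return b(ball_y) * bins * 3 + b(paddle_y) * 3 + int(np.sign(ball_vy)) + 1

Q = np.zeros((12 * 12 * 3, 3))  # states x actions {down, stay, up}

def q_update(s, a, r, s_next, lr=0.1, gamma=0.99):
    Q[s, a] += lr * (r + gamma * Q[s_next].max() - Q[s, a])

# One illustrative transition built from tracker output.
s = object_state(ball_y=0.2, ball_vy=0.05, paddle_y=-0.4)
s2 = object_state(ball_y=0.25, ball_vy=0.05, paddle_y=-0.3)
q_update(s, a=2, r=0.0, s_next=s2)
```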
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8581
Streaming:
Download:
 
Recent Advances in Neural Machine Translation: Multilingual, Non-Parametric to Unsupervised Neural Machine Translation
Kyunghyun Cho (New York University)
We'll describe the latest advances in neural machine translation from three different perspectives. We'll start with character-level, multilingual neural machine translation, which aims at harnessing positive language transfer among multiple languages to improve translation quality and the robustness of such a multilingual translation model to intra-sentence code-switching and typos. We'll then discuss recent research on exploiting data besides the oft-used parallel corpora: how another modality, such as vision, can be used to enable zero-resource machine translation, and how purely unsupervised neural machine translation can be done by exploiting the similarity between the language distributions of two languages. Finally, we'll discuss the recent trend of retrieval-based approaches to deep learning, with non-parametric neural machine translation as a specific example.
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8609
Streaming:
Download:
 
Deep Active Learning
Adam Lesnikowski (NVIDIA)
We'll discuss ongoing work at NVIDIA on deep active learning. Attendees can expect to learn what active learning is and some of the challenges of applying it to deep neural network training.
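A minimal sketch of pool-based active learning with uncertainty sampling, the classic acquisition strategy; treating the model as a black-box predict_proba function and the dummy scores are illustrative assumptions, not NVIDIA's method.

```python
import numpy as np

def entropy(probs):
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def select_batch(predict_proba, unlabeled_pool, batch_size=16):
    """Pick the pool samples the model is most uncertain about."""
    probs = predict_proba(unlabeled_pool)            # (N, num_classes)
    return np.argsort(entropy(probs))[-batch_size:]  # highest-entropy indices

# Dummy model: random softmax scores over 10 classes.
rng = np.random.default_rng(0)
fake_proba = lambda X: rng.dirichlet(np.ones(10), size=len(X))
pool = np.zeros((1000, 3, 32, 32), dtype=np.float32)
to_label = select_batch(fake_proba, pool)
print("send these indices to annotators:", to_label)
```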
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8692
Streaming:
 
Unsupervised Image-to-Image Translation Networks
Ming-Yu Liu (NVIDIA)
We'll introduce a GAN-based framework for unsupervised image-to-image translation. It leverages a shared latent space assumption to learn to translate an image in one domain to a corresponding image in another domain without requiring any pairs of corresponding images from the two domains in the training dataset. We'll show examples of translating street scene images from sunny to rainy or from daytime to nighttime, as well as image translation results on dog breed conversion, cat species conversion, and attribute-based human face translation.
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8114
Streaming:
Download:
 
Towards Lifelong Reinforcement Learning
Pulkit Agrawal (UC Berkeley)
Reinforcement learning aims to determine a mapping from observations to actions that maximizes a reward criterion. The agent starts off exploring the environment for rewards with random search, which is likely to succeed in only the simplest of se ...Read More
Reinforcement learning aims to determine a mapping from observations to actions that maximizes a reward criterion. The agent starts off exploring the environment for rewards with random search, which is likely to succeed in only the simplest of settings. Furthermore, measuring and designing reward functions for real-world tasks is non-trivial. Inspired by research in developmental psychology, in this talk I will discuss how reinforcement learning agents might use curiosity and knowledge accumulated from experience for efficient exploration. I will present results illustrating an agent learning to play the game of Mario and learning to navigate without rewards; a study quantifying the kinds of prior knowledge humans use for efficient exploration; and robotic manipulation experiments, including the use of an anthropomorphic hand for grasping objects.   Back
 
Keywords:
AI and DL Research, IoT, Robotics & Drones, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8217
Streaming:
 
How We Can Analyze Profiles from Real-Time Conversation Using Unsupervised Learning
Shigehisa Omatsu (dAIgnosis,Inc.)
We are developing a system that converts telephone conversations and meeting responses into text in real time, labels the text by unsupervised learning with a computational model trained on DGX-1, and clusters the ...Read More
We are developing a system that converts telephone conversations and meeting responses into text in real time, labels the text by unsupervised learning with a computational model trained on DGX-1, and clusters the results in order to compare topics and analyze the meaning of a conversation and the profiles of its interlocutors. With this technology, customers can receive appropriate responses at the beginning of a conversation with a help desk, and patients can receive guidance during a remote diagnosis with a doctor based solely on their dialogue and examination results. By using TensorFlow as a platform and running k-means, Word2vec, Doc2Vec, and similar methods in a clustered DGX-1 environment, the arithmetic processing keeps pace with the conversation. Even as the amount of text increases, the learning effect scales linearly, demonstrating that validity can be improved without taking into account the grammar of languages other than English (e.g., Japanese).  Back
 
Keywords:
AI and DL Research, Speech and Language Processing, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8371
Streaming:
Download:
 
Embodied Question Answering
Abhishek Das (Georgia Tech)
Building intelligent agents that possess the ability to perceive the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and execute actions in a physical environment has been a long-term ...Read More
Building intelligent agents that possess the ability to perceive the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and execute actions in a physical environment has been a long-term goal of Artificial Intelligence. In this talk, I will present my recent work on an instantiation of this goal -- Embodied Question Answering (EQA) -- where an agent that is spawned at a random location in an environment (a house or building) is asked a natural language question ("What color is the car?"). The agent perceives its environment through first-person vision and can perform a few 'atomic' actions: move-{forward, backward, right, left} and turn-{right, left}. The objective of the agent is to explore the environment and gather the visual information necessary to answer the question ("orange"). I'll introduce our OpenGL-based environments, a large-scale dataset of expert demonstrations for this task, and deep models, trained end-to-end using reinforcement learning, from raw pixels to multi-step navigation control to visual question answering.  Back
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8582
Streaming:
Download:
 
Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow
Alexander Sergeev (Uber)
Horovod makes it easy to train a single-GPU TensorFlow model on many GPUs, both on a single server and across multiple servers. We'll cover Uber's explorations of distributed deep learning, how to use Horovod, and what kind of performance you ...Read More
Horovod makes it easy to train a single-GPU TensorFlow model on many GPUs, both on a single server and across multiple servers. We'll cover Uber's explorations of distributed deep learning, how to use Horovod, and what kind of performance you can get on standard models, such as Inception V3 and ResNet-101. Learn how to speed up training of your TensorFlow model with Horovod.  Back
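For context, the core Horovod usage pattern is only a few lines; this sketch follows the TensorFlow API documented for Horovod around the time of the talk (pin one GPU per process, scale the learning rate, wrap the optimizer, broadcast initial state):

    import tensorflow as tf
    import horovod.tensorflow as hvd

    hvd.init()  # one training process per GPU

    # Pin each process to its own local GPU.
    config = tf.ConfigProto()
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    # Scale the learning rate by worker count, then wrap the optimizer so
    # gradients are averaged across workers via ring-allreduce.
    opt = hvd.DistributedOptimizer(tf.train.AdamOptimizer(1e-3 * hvd.size()))

    # Start all workers from identical weights.
    hooks = [hvd.BroadcastGlobalVariablesHook(0)]

The rest of the single-GPU training script is unchanged; launching is then a matter of, for example, mpirun -np 4 python train.py.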
 
Keywords:
AI and DL Research, Deep Learning and AI Frameworks, HPC and AI, GTC Silicon Valley 2018 - ID S8152
Streaming:
 
Instance-Aware Image and Sentence Matching with Selective Multimodal LSTM
Yan Huang (Institute of Automation, Chinese Academy of Sciences)
We'll present a unique framework for cross-modal image and sentence matching: selective multimodal long short-term memory (sm-LSTM), which incorporates a new deep learning module, a multimodal context-modulated attention network, to selectively att ...Read More
We'll present a unique framework for cross-modal image and sentence matching: selective multimodal long short-term memory (sm-LSTM), which incorporates a new deep learning module, a multimodal context-modulated attention network, to selectively attend to pairwise semantic concepts. In detail, effective image and sentence matching depends on measuring their global visual-semantic similarity. Based on the observation that such a global similarity arises from a complex aggregation of multiple local similarities between pairwise instances of an image (objects) and a sentence (words), we propose the sm-LSTM network for instance-aware image and sentence matching. The sm-LSTM includes a multimodal context-modulated attention scheme at each timestep that can selectively attend to a pair of instances from the image and sentence by predicting pairwise instance-aware saliency maps. For the selected pairwise instances, representations are obtained from the predicted saliency maps and then compared to measure their local similarity. By measuring multiple local similarities over a few timesteps in this way, the sm-LSTM sequentially aggregates them into the global visual-semantic similarity.  Back
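As a rough illustration of the local-similarity idea (a simplification, not the authors' code), one sm-LSTM-style timestep can be sketched in Python as attention-weighted selection of an image instance and a sentence instance, followed by a comparison:

    import torch
    import torch.nn.functional as F

    def local_similarity_step(regions, words, query):
        """One simplified timestep: attend over image regions (R, D) and
        word features (W, D) using the LSTM state `query` (D,), select a
        pairwise instance from each modality, and compare them. The full
        model aggregates such local similarities over several timesteps."""
        region_att = F.softmax(regions @ query, dim=0)   # saliency over regions
        word_att = F.softmax(words @ query, dim=0)       # saliency over words
        image_instance = region_att @ regions            # selected image instance
        word_instance = word_att @ words                 # selected sentence instance
        return F.cosine_similarity(image_instance, word_instance, dim=0)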
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8281
Streaming:
Download:
 
Towards AI Agents That Can See, Talk, and Act
Dhruv Batra (Georgia Tech and Facebook AI Research)
We are witnessing unprecedented advances in computer vision and AI. What lies next for AI? We believe that the next generation of intelligent systems (say the next generation of Google's Assistant, Facebook's M, Apple's Siri, Amazon's Alexa) will ...Read More
We are witnessing unprecedented advances in computer vision and AI. What lies next for AI? We believe that the next generation of intelligent systems (say the next generation of Google's Assistant, Facebook's M, Apple's Siri, Amazon's Alexa) will need to possess the ability to perceive their environment (through vision, audition, or other sensors), communicate (i.e., hold a natural language dialog with humans and other agents), and act (e.g., aid humans by executing API calls or commands in a virtual or embodied environment), for tasks such as aiding visually impaired users in understanding their surroundings; interacting with an AI assistant (Human: 'Alexa, can you see the baby in the baby monitor?', AI: 'Yes, I can', Human: 'Is he sleeping or playing?'); and robotics applications (e.g., search and rescue missions) where the operator may be situationally blind and operating via language. We'll present work from our lab on a range of projects on such visually grounded conversational agents.  Back
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8571
Streaming:
Download:
 
Scaling Convolutional Neural Networks with Kubernetes and TensorFlow on AWS GPUs
Reza Zadeh (Matroid)
In this session we present a Kubernetes deployment on Amazon AWS GPUs that provides customized computer vision to a large number of users. Reza offers an overview of Matroid's pipeline and demonstrates how to customize computer vision neural network ...Read More
In this session we present a Kubernetes deployment on Amazon AWS GPUs that provides customized computer vision to a large number of users. Reza offers an overview of Matroid's pipeline and demonstrates how to customize computer vision neural network models in the browser, followed by building, training, and visualizing TensorFlow models, which are served at scale to monitor video streams.  Back
 
Keywords:
AI and DL Research, Data Center and Cloud Infrastructure, Computer Vision, GTC Silicon Valley 2018 - ID S8610
Streaming:
Download:
 
Audio Recognition, Context-Awareness, and its Applications
Yoonchang Han (cochlear.ai)
We'll explain the concept and the importance of audio recognition, which aims to understand literally all the information contained in the audio, not limiting its scope to speech recognition. It includes the introduction of various types of non ...Read More
We'll explain the concept and the importance of audio recognition, which aims to understand literally all the information contained in the audio, not limiting its scope to speech recognition. It includes the introduction of various types of non-verbal information contained in the audio, such as acoustic scenes/events, speech, and music. This session will help people who are not familiar with audio processing but are interested in context-aware systems. It might also inspire developers of AI applications such as home assistants, humanoid robots, and self-driving cars. It also covers potential use cases and creative applications, including a video demonstration of an audio context-aware system applied to a media-art performance for real-time music generation.  Back
 
Keywords:
AI and DL Research, Speech and Language Processing, NVIDIA Inception Program, GIS, GTC Silicon Valley 2018 - ID S8696
Streaming:
Download:
 
Trade and Manage Wealth with Deep Reinforcement Learning and Memory
Daniel Egloff (Flink AI)
We'll present how deep reinforcement learning (DRL) and memory-extended networks can be used to train agents that optimize asset allocations or propose trading actions. The memory component is crucial for improved mini-batch parallelization and he ...Read More
We'll present how deep reinforcement learning (DRL) and memory-extended networks can be used to train agents that optimize asset allocations or propose trading actions. The memory component is crucial for improved mini-batch parallelization and helps mitigate catastrophic forgetting. We'll also address how concepts from risk-sensitive and safe reinforcement learning can be applied to improve the robustness of the learned policies. The DRL approach has several advantages over the industry-standard approach, which is still based on mean-variance portfolio optimization. The most significant benefit is that the information bottleneck between the statistical return model and the portfolio optimizer is removed, and available market data and trade history are used much more efficiently.  Back
 
Keywords:
AI and DL Research, Algorithms and Numerical Techniques, Advanced AI Learning Techniques (incl. GANs and NTMs), Finance, GTC Silicon Valley 2018 - ID S8679
Streaming:
Download:
 
(Deep) Learning to Grasp with a Closed-Loop DNN Controller
Iuri Frosio (NVIDIA), Mengyuan Yan (Stanford University)
The paradigm for robot programming is changing with the adoption of the deep learning approach in the field of robotics. Instead of hard coding a complex sequence of actions, tasks are acquired by the robot through an active learning procedure. This ...Read More
The paradigm for robot programming is changing with the adoption of the deep learning approach in the field of robotics. Instead of hard coding a complex sequence of actions, tasks are acquired by the robot through an active learning procedure. This introduces new challenges that have to be solved to achieve effective training. We'll show several issues that can be encountered while learning a closed-loop DNN controller aimed at a fundamental task like grasping, and their practical solutions. First, we'll illustrate the advantages of training using a simulator, as well as the effects of choosing different learning algorithms in the reinforcement learning and imitation learning domains. We'll then show how separating the control and vision modules in the DNN can simplify and speed up the learning procedure in the simulator, although the learned controller hardly generalizes to the real-world environment. Finally, we'll demonstrate how to use domain transfer to train a DNN controller in a simulator that can be effectively employed to control a robot in the real world.  Back
 
Keywords:
AI and DL Research, IoT, Robotics & Drones, Computer Vision, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8132
Streaming:
Download:
 
Affective Categorization Using Contactless-Based Accelerometers
Refael Shamir (Letos)
We'll cover the four known methods for emotion detection: vision, speech, sentiment analysis, and wearable technology. We'll provide a quick dive into each presented solution, and then introduce a novel approach aimed at the future of autonomou ...Read More
We'll cover the four known methods for emotion detection: vision, speech, sentiment analysis, and wearable technology. We'll provide a quick dive into each presented solution, and then introduce a novel approach aimed at the future of autonomous vehicles.  Back
 
Keywords:
AI and DL Research, Consumer Engagement and Personalization, GTC Silicon Valley 2018 - ID S8352
Streaming:
Download:
 
Graduate Fellowship FastForward Talks
Robin Betz (Stanford University), Awni Hannun (Stanford University), Robert Konrad (Stanford University), Deepak Pathak (UC Berkeley), Fereshteh Sadeghi (University of Washington), Abigail See (Stanford University), Anna Shcherbina (Stanford University), Caroline Trippel (Princeton University)
Join a special presentation from our 2017-2018 Graduate Fellowship recipients to learn "what's next" out of the world of research and academia. Sponsored projects involve a variety of technical challenges, including distributed systems for ...Read More
Join a special presentation from our 2017-2018 Graduate Fellowship recipients to learn "what's next" out of the world of research and academia. Sponsored projects involve a variety of technical challenges, including distributed systems for large-scale deep learning; dynamic data structures for massively parallel machine learning; machine learning techniques for biomedical image analysis; visual dynamics; and compilation frameworks for high-performance graphics systems. We believe that these minds lead the future of our industry, and we're proud to support the 2017-2018 NVIDIA Graduate Fellows. We'll also announce the 2018-2019 Graduate Fellows at this session. For more information on the NVIDIA Graduate Fellowship program, visit www.nvidia.com/fellowship.  Back
 
Keywords:
AI and DL Research, Virtual Reality and Augmented Reality, Graphics and AI, Computational Biology and Chemistry, Computer Vision, GTC Silicon Valley 2018 - ID S8793
Streaming:
 
Learning Rigidity in Dynamic Scenes for Scene Flow Estimation
Kihwan Kim (NVIDIA)
Estimation of 3D motion in a dynamic scene from a pair of images is a core task in many scene understanding problems. In real-world applications, a dynamic scene is commonly captured by a moving camera (i.e., panning, tilting, or hand-held), increasin ...Read More
Estimation of 3D motion in a dynamic scene from a pair of images is a core task in many scene understanding problems. In real-world applications, a dynamic scene is commonly captured by a moving camera (i.e., panning, tilting, or hand-held), increasing the task complexity because the scene is observed from different viewpoints. The main challenge is the disambiguation of the camera motion from scene motions, which becomes more difficult as the amount of rigid parts observed decreases. In this talk, we introduce a method to learn the rigidity of a scene from a large collection of dynamic scene data, and to directly infer a rigidity mask from two sequential RGB-D images in a supervised manner. With the learned network, we show how we can effectively estimate camera motion and projected scene flow using computed 2D optical flow and the inferred rigidity mask. Through evaluations, we show that our method makes scene flow estimation more robust and stable than state-of-the-art methods in challenging dynamic scenes. The expected audience includes people interested in computer vision algorithms, and more broadly anyone interested in AI and machine learning. We'll cover the motivation behind scene flow estimation, potential applications, how we train the two networks for scene flow estimation, and how we evaluate the algorithm on the popular benchmark dataset SINTEL. We'll also show a new semi-synthetic dataset and its generation method, in which we mix real video footage with virtually rendered foreground scenes.  Back
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8798
Streaming:
Download:
 
Deep Learning for Transportation: Fast Estimation of Travel Times Using Historical Routes
Dmitry Kudinov (Esri Inc.)
During this presentation we will review a deep neural network architecture and the training approaches used to produce a high volume of travel-time estimates on a road graph with historical routes and traffic. This includes initial and continu ...Read More
During this presentation we will review a deep neural network architecture and the training approaches used to produce a high volume of travel-time estimates on a road graph with historical routes and traffic. This includes initial and continuous online training, finding various sources to produce training data, challenges of quality control, and, of course, the invaluable role of GPUs for computation during both training and inference.  Back
 
Keywords:
AI and DL Research, Product & Building Design, Intelligent Video Analytics and Smart Cities, GIS, Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8156
Streaming:
Download:
 
Block-Sparse Recurrent Neural Networks
Sharan Narang (Baidu USA), Eric Undersander (Baidu USA)
Recurrent neural networks are used in state-of-the-art models in domains such as speech recognition, machine translation, and language modeling. Sparsity is a technique to reduce compute and memory requirements of deep learning models. Sparse RNNs ar ...Read More
Recurrent neural networks are used in state-of-the-art models in domains such as speech recognition, machine translation, and language modeling. Sparsity is a technique to reduce compute and memory requirements of deep learning models. Sparse RNNs are easier to deploy on devices and high-end server processors. Even though sparse operations need less compute and memory relative to their dense counterparts, the speed-up observed by using sparse operations is less than expected on different hardware platforms. To address this issue, we prune blocks of weights in a layer instead of individual weights. Using these techniques, we can create block-sparse RNNs with sparsity ranging from 80% to 90% with a small loss in accuracy. This technique allows us to reduce the model size by 10x. Additionally, we can prune a larger dense network to recover this loss in accuracy while maintaining high block sparsity and reducing the overall parameter count. Our technique works with a variety of block sizes up to 32x32. Block-sparse RNNs eliminate overheads related to data storage and irregular memory accesses while increasing hardware efficiency compared to unstructured sparsity.  Back
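The block-pruning step itself is simple to sketch. This one-shot Python illustration is my own simplification; the talk's method prunes gradually during training with an increasing threshold:

    import torch

    def block_prune(weight, block=32, sparsity=0.9):
        """Zero entire (block x block) tiles of a weight matrix, keeping the
        tiles with the largest mean magnitude; dimensions must divide
        evenly by `block`."""
        rows, cols = weight.shape
        tiles = weight.reshape(rows // block, block, cols // block, block)
        scores = tiles.abs().mean(dim=(1, 3))        # one score per tile
        k = max(1, int(scores.numel() * sparsity))   # number of tiles to drop
        cutoff = scores.flatten().kthvalue(k).values
        mask = (scores > cutoff).to(weight.dtype)[:, None, :, None]
        return (tiles * mask).reshape(rows, cols)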
 
Keywords:
AI and DL Research, HPC and AI, GTC Silicon Valley 2018 - ID S8924
Streaming:
Download:
 
Learning from Limited Data
Tatsuya Harada (University of Tokyo)
Constructing an accurate prediction model from limited data is one of the important tasks in machine learning. We'll introduce unsupervised domain adaptation and a learning method using interclass patterns as methods to construct accurate predictio ...Read More
Constructing an accurate prediction model from limited data is one of the important tasks in machine learning. We'll introduce unsupervised domain adaptation and a learning method using interclass patterns as methods to construct accurate prediction models from limited data. For unsupervised domain adaptation, we use three networks asymmetrically: two networks are used to label unlabeled target patterns, and one network is trained on the pseudo-labeled patterns to obtain target-discriminative representations. For the learning method using interclass patterns, we generate interclass patterns by mixing two patterns belonging to different classes with a random ratio, and train the model to output the mixing ratio from the mixed patterns. Although the algorithm is very simple, the proposed method significantly improves classification performance on sound recognition and image recognition. In addition, we'll briefly introduce various topics, including WebDNN, which our team is working on.  Back
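The interclass-pattern step is simple enough to state in a few lines; a sketch under my reading of the abstract (the published method also handles details such as energy-aware mixing for audio, omitted here):

    import numpy as np

    def make_interclass_pattern(x1, x2, rng=np.random):
        """Mix two patterns from different classes with a random ratio r and
        return r as the training target, so the model learns to output the
        mixing ratio from the mixed pattern."""
        r = rng.uniform(0.0, 1.0)
        mixed = r * x1 + (1.0 - r) * x2
        return mixed, r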
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8786
Streaming:
Download:
 
Deep Generative Models for Image and Video Creation
Vineeth N Balasubramanian (Indian Institute of Technology (IIT), Hyderabad, India)
We'll focus on recent developments in deep learning-based generative models for image and video creation. The last two to three years have seen an explosive growth in the development of generative adversarial networks, variational autoencoders, and ...Read More
We'll focus on recent developments in deep learning-based generative models for image and video creation. The last two to three years have seen an explosive growth in the development of generative adversarial networks, variational autoencoders, and related autoregressive methods that have made it possible to automatically generate images and videos by harnessing the power of GPUs and deep learning libraries. These methods present interesting possibilities in the automatic generation of datasets for training machine learning methods, as well as in real-world applications for image and video processing such as morphing, editing, advertising, design, and art. We'll present the technical details of these methods and recent results in various settings.  Back
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), Video and Image Processing, GTC Silicon Valley 2018 - ID S8784
Streaming:
 
Geometry-Aware Learning of Maps for Camera Localization
Jinwei Gu (NVIDIA)
Maps are a key component in image-based camera localization and visual SLAM systems: they are used to establish geometric constraints between images, correct drift in relative pose estimation, and relocalize cameras after lost tracking. The exact def ...Read More
Maps are a key component in image-based camera localization and visual SLAM systems: they are used to establish geometric constraints between images, correct drift in relative pose estimation, and relocalize cameras after lost tracking. The exact definitions of maps, however, are often application-specific and hand-crafted for different scenarios (e.g., 3D landmarks, lines, planes, bags of visual words). We propose to represent maps as a deep neural net called MapNet, which enables learning a data-driven map representation. Unlike prior work on learning maps, MapNet exploits cheap and ubiquitous sensory inputs like visual odometry and GPS in addition to images and fuses them together for camera localization. Geometric constraints expressed by these inputs, which have traditionally been used in bundle adjustment or pose-graph optimization, are formulated as loss terms in MapNet training and also used during inference. In addition to directly improving localization accuracy, this allows us to update the MapNet (i.e., maps) in a self-supervised manner using additional unlabeled video sequences from the scene.  Back
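Schematically, the training objective pairs an absolute-pose term with relative-pose terms between frames; a simplified Python sketch (the paper uses a log-quaternion rotation parameterization and learned loss weightings, both omitted, and poses here are plain 6-D vectors):

    import torch
    import torch.nn.functional as F

    def mapnet_style_loss(pred_poses, gt_poses, vo_rel, beta=1.0):
        """pred_poses, gt_poses: (T, 6) camera poses over a window of frames.
        vo_rel: (T-1, 6) relative poses from visual odometry. The second
        term encodes the geometric constraints between consecutive
        predictions, usable in both training and inference."""
        absolute = F.l1_loss(pred_poses, gt_poses)
        pred_rel = pred_poses[1:] - pred_poses[:-1]  # predicted relative motion
        relative = F.l1_loss(pred_rel, vo_rel)
        return absolute + beta * relative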
 
Keywords:
AI and DL Research, Autonomous Vehicles, Computer Vision, GTC Silicon Valley 2018 - ID S8792
Streaming:
 
Dense Connection Networks for Conversational Speech Recognition
Kyu Han (Capio Inc.), Ian Lane (Carnegie Mellon University)
Densely connected neural networks were originally introduced to avoid the problem of layer-wise vanishing gradients when CNNs are stacked in a very deep fashion, specifically for image recognition tasks. Inspired by these works, we've explored the u ...Read More
Densely connected neural networks were originally introduced to avoid the problem of layer-wise vanishing gradients when CNNs are stacked in a very deep fashion, specifically for image recognition tasks. Inspired by these works, we've explored the use of dense network connections within LSTM models for the task of automatic speech recognition. By introducing additional connections to connect (almost) every layer to at least one other layer, we mitigate the vanishing gradient effect between LSTM layers and enable error signals to propagate back to the very first layer during training. In this presentation, we'll present the fundamentals of speech recognition and introduce different neural network model structures that have been shown to be effective for this task. We'll then introduce identity, highway, and dense connections and demonstrate how they improve the performance of these models. We'll evaluate the performance of these models across different datasets, and show that, with a lattice-based system combination, densely connected LSTMs significantly contributed to reaching word error rates (WER) of 5.0% and 9.1% on the Switchboard and CallHome test sets.  Back
 
Keywords:
AI and DL Research, Speech and Language Processing, GTC Silicon Valley 2018 - ID S8903
Streaming:
 
Deep Learning Applications in E-Commerce
Krishnendu Chaudhury (Drishti Technologies)
In this talk we will present four applications of deep learning in e-commerce. 1) A deep neural net architecture which has been successfully deployed as a large scale Visual Search and Recommendation system for e-commerce. The deployment has been at ...Read More
In this talk we will present four applications of deep learning in e-commerce. 1) A deep neural net architecture which has been successfully deployed as a large scale Visual Search and Recommendation system for e-commerce. The deployment has been at Flipkart, India's largest e-commerce vendor, over a catalog of 50M products, supporting 2K queries per second. Our results beat the state of the art on the Exact Street2Shop dataset. 2) Visual semantic embedding of e-commerce products for enhanced searchability and product ranking. 3) Neural network-based click prediction. 4) A novel neural network architecture for demand prediction.  Back
 
Keywords:
AI and DL Research, Deep Learning and AI Frameworks, Consumer Engagement and Personalization, Computer Vision, GTC Silicon Valley 2018 - ID S8684
Streaming:
 
Model Architectures and Training Techniques for High-Precision Landmark Localization
Sina Honari (University of Montreal - MILA), Pavlo Molchanov (NVIDIA)
We'll discuss training techniques and deep learning architectures for high-precision landmark localization. In the first part of the session, we'll talk about ReCombinator Networks, which aims at maintaining pixel-level image information ...Read More

We'll discuss training techniques and deep learning architectures for high-precision landmark localization. In the first part of the session, we'll talk about ReCombinator Networks, which aims at maintaining pixel-level image information for high-accuracy landmark localization. This model combines coarse-to-fine features to first observe global (coarse) image information and then recombine local (fine) information. Using this model, we report state-of-the-art results on three facial landmark datasets. The model can be used for other tasks that require pixel-level accuracy (for example, image segmentation and image-to-image translation). In the second part, we'll talk about improving landmark localization in a semi-supervised setting, where less labeled data is provided. Specifically, we consider a scenario where few labeled landmarks are given during training, but many weaker labels (for example, face emotion or hand gesture) that are easier to obtain are provided. We'll describe training techniques and model architectures that can leverage weaker labels to improve landmark localization.

  Back
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8406
Streaming:
Download:
 
Learning Robotic Plans from Real-World Demonstrations Using only Randomized Simulated Images
Jonathan Tremblay (NVIDIA)
Using only randomized simulated images, we'll present a system that infers and then executes a human-readable robotic program after watching a real-world task demonstration. The system comprises a series of deep neural network modules, each lear ...Read More
Using only randomized simulated images, we'll present a system that infers and then executes a human-readable robotic program after watching a real-world task demonstration. The system comprises a series of deep neural network modules, each learned entirely in simulation. During training, images are generated in a gaming engine and made transferable to the real world by domain randomization. After training, the system is straightforwardly deployed on a real robot with no retuning of the neural networks, having never previously seen a real image. We demonstrate the system on a Baxter robot performing block tower construction tasks.  Back
 
Keywords:
AI and DL Research, IoT, Robotics & Drones, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8439
Streaming:
 
Can AI Generate Love Advice? Neural Conclusion-Supplement Answer Generation for Non-Factoid Questions
Makoto Nakatsuji (NTT Resonant)
Learn how to generate long answers for non-factoid questions on question-and-answer community sites by using the encoder-decoder framework. We'll present our novel extension of the encoder-decoder framework, called the ensemble network, that goes beyo ...Read More
Learn how to generate long answers for non-factoid questions on question-and-answer community sites by using the encoder-decoder framework. We'll present our novel extension of the encoder-decoder framework, called the ensemble network, that goes beyond a single short sentence. It handles several sentences (i.e., the two major sentence types that organize answers to non-factoid questions: conclusion statements and their supplements) to generate complex non-factoid answers.  Back
 
Keywords:
AI and DL Research, Speech and Language Processing, GTC Silicon Valley 2018 - ID S8301
Streaming:
 
Re3: Realtime Recurrent Regression Networks for Visual Tracking of Generic Objects
Daniel Gordon (University of Washington)
Robust object tracking requires knowledge and understanding of the object being tracked: its appearance, motion, and change over time. A tracker must be able to modify its underlying model and adapt to new observations. We present Re3, a real-ti ...Read More

Robust object tracking requires knowledge and understanding of the object being tracked: its appearance, motion, and change over time. A tracker must be able to modify its underlying model and adapt to new observations. We present Re3, a real-time deep object tracker capable of incorporating temporal information into its model. Rather than focusing on a limited set of objects or training a model at test-time to track a specific instance, we pretrain our generic tracker on a large variety of objects and efficiently update on the fly; Re3 simultaneously tracks and updates the appearance model with a single forward pass. This lightweight model is capable of tracking objects at 150 FPS, while attaining competitive results on challenging benchmarks. We also show that our method handles temporary occlusion better than other comparable trackers using experiments that directly measure performance on sequences with occlusion.

  Back
 
Keywords:
AI and DL Research, Intelligent Video Analytics and Smart Cities, Autonomous Machines, Computer Vision, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8298
Streaming:
 
Training ImageNet In 15 Minutes With ChainerMN: A Scalable Distributed DL Framework
Keisuke Fukuda (Preferred Networks, Inc.)
We'll present a multi-node distributed deep learning framework called ChainerMN. Even though GPUs are continuously gaining more computation throughput, it is still very time-consuming to train state-of-the-art deep neural network models. For better ...Read More
We'll present a multi-node distributed deep learning framework called ChainerMN. Even though GPUs are continuously gaining more computation throughput, it is still very time-consuming to train state-of-the-art deep neural network models. For better scalability and productivity, it is paramount to accelerate the training process by using multiple GPUs. To enable high-performance and flexible distributed training, ChainerMN was developed and built on top of Chainer. We'll first introduce the basic approaches to distributed deep learning and then explain the design choices, basic usage, and implementation details of Chainer and ChainerMN. To demonstrate the scalability and efficiency of ChainerMN, we'll discuss the remarkable results from training the ResNet-50 classification model on the ImageNet database using 1024 Tesla P100 GPUs on our in-house cluster, MN-1.  Back
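The ChainerMN programming model leaves the single-GPU training loop intact; a minimal sketch following the documented usage (each MPI process drives one GPU, and the wrapped optimizer all-reduces gradients):

    import chainer
    import chainermn

    comm = chainermn.create_communicator()   # MPI/NCCL-backed communicator
    device = comm.intra_rank                 # one process per local GPU

    model = chainer.links.Classifier(chainer.links.Linear(784, 10))
    chainer.cuda.get_device_from_id(device).use()
    model.to_gpu()

    # Wrapping the optimizer makes it all-reduce gradients across workers;
    # everything else is ordinary Chainer code.
    optimizer = chainermn.create_multi_node_optimizer(
        chainer.optimizers.MomentumSGD(lr=0.1), comm)
    optimizer.setup(model)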
 
Keywords:
AI and DL Research, NVIDIA Inception Program, Deep Learning and AI Frameworks, HPC and AI, GTC Silicon Valley 2018 - ID S8889
Streaming:
Download:
 
Towards Theory of AI's Mind
Devi Parikh (Georgia Tech and Facebook AI Research)
To effectively leverage the progress in Artificial Intelligence (AI) to make our lives more productive, it is important for humans and AI to work well together in a team. Traditionally, research has focused primarily on making AI more accurate, and ( ...Read More
To effectively leverage the progress in Artificial Intelligence (AI) to make our lives more productive, it is important for humans and AI to work well together in a team. Traditionally, research has focused primarily on making AI more accurate, and (to a lesser extent) on having it better understand human intentions, tendencies, beliefs, and contexts. The latter involves making AI more human-like and having it develop a theory of our minds. In this talk, I will argue that for human-AI teams to be effective, humans must also develop a Theory of AI's Mind: getting to know its strengths, weaknesses, beliefs, and quirks. I will present some (very) initial results in the context of visual question answering and visual dialog, where the AI agent is trained to answer natural language questions about images.  Back
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8560
Streaming:
Download:
 
Deep Learning for Computational Science
Yang Juntao (NVIDIA)
We'll review our study of the use of artificial intelligence to augment various domains of computational science in order to improve time to solution for various HPC problems. We'll discuss the current state-of-the-art approaches and performance ga ...Read More
We'll review our study of the use of artificial intelligence to augment various domains of computational science in order to improve time to solution for various HPC problems. We'll discuss the current state-of-the-art approaches and performance gains where applicable. We'll also investigate current barriers to adoption and consider possible solutions.  Back
 
Keywords:
AI and DL Research, HPC and AI, GTC Silicon Valley 2018 - ID S8242
Streaming:
Download:
 
Deep Learning for Driver State Sensing
Lex Fridman (MIT)
We'll explore how deep learning approaches can be used for perceiving and interpreting the driver's state and behavior during manual, semi-autonomous, and fully autonomous driving. We'll cover how convolutional, recurr ...Read More

We'll explore how deep learning approaches can be used for perceiving and interpreting the driver's state and behavior during manual, semi-autonomous, and fully autonomous driving. We'll cover how convolutional, recurrent, and generative neural networks can be used for applications of glance classification, face recognition, cognitive load estimation, emotion recognition, drowsiness detection, body pose estimation, natural language processing, and activity recognition in a mixture of audio and video data.

  Back
 
Keywords:
AI and DL Research, Autonomous Vehicles, Autonomous Driving, GTC Silicon Valley 2018 - ID S8626
Streaming:
 
Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image
Fangchang Ma (Massachusetts Institute of Technology)
Learn how to predict a dense depth image from a sparse set of depth measurements and a single RGB image. This approach can be applied as a plug-in module in simultaneous localization and mapping to convert sparse maps to dense maps, and as a ...Read More
Learn how to predict a dense depth image from a sparse set of depth measurements and a single RGB image. This approach can be applied as a plug-in module in simultaneous localization and mapping to convert sparse maps to dense maps, and as super-resolution for LiDAR depth data. We'll describe the performance of our prediction method, explain how to train the depth prediction network, and showcase examples of its applications. Code and a video demonstration are also publicly available. This session is for registrants who are already familiar with basic machine learning techniques.  Back
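The input convention is the key detail: the sparse depth samples enter as an extra channel (zero where no measurement exists) alongside RGB. A toy Python stand-in under that assumption; the actual model is a much deeper encoder-decoder:

    import torch
    import torch.nn as nn

    class SparseToDenseStub(nn.Module):
        """Regresses dense depth from a 4-channel input: RGB plus a sparse
        depth map with zeros at unmeasured pixels. Illustrates the input
        convention only, not the talk's architecture."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 1, 3, padding=1))

        def forward(self, rgb, sparse_depth):
            return self.net(torch.cat([rgb, sparse_depth], dim=1))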
 
Keywords:
AI and DL Research, Computer Vision, GTC Silicon Valley 2018 - ID S8216
Streaming:
 
Additive Learning Framework for Self-Evolving AI
Arpit Baheti (NVIDIA), Sagar Bhokre (NVIDIA)
We'll present a framework that can learn a compute-intensive deep neural network (DNN) task using multiple AI blocks and evolve better confidence by combining estimates. We'll consider the example of establishing the identity of a user using spee ...Read More
We'll present a framework that can learn a compute-intensive deep neural network (DNN) task using multiple AI blocks and evolve better confidence by combining estimates. We'll consider the example of establishing the identity of a user using speech and image data. The system consists of two blocks: the AI block and the Arbiter block. The AI block uses multiple DNNs (voice-based and image-based DNNs that initially generate low-confidence estimates). The AI blocks assist each other through the Arbiter block to build confidence, improve accuracy, and learn salient features over time. The Arbiter can store recent unfamiliar data at runtime in noisy and distorted environments and train the AI blocks periodically or on demand. This concept could potentially improve automatic speech recognition capabilities and allow detection of faces even when variable facial features change over time. The GPU is the ideal choice, as the task requires inferencing as well as training on the go.  Back
 
Keywords:
AI and DL Research, Intelligent Video Analytics and Smart Cities, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8331
Streaming:
Download:
 
Attention GAN for Fine-Grained Language-to-Image Generation
Pengchuan Zhang (Microsoft Research)
We have long envisioned that machines one day can perform human-like perception, reasoning, and expression across multiple modalities including vision and language, which will augment and transform the ways humans communicate with each other and with ...Read More
We have long envisioned that machines one day can perform human-like perception, reasoning, and expression across multiple modalities including vision and language, which will augment and transform the ways humans communicate with each other and with the real world. With this vision, we'll introduce the latest work on developing a deep attention GAN for fine-grained language-to-image synthesis. We'll discuss the open problems behind the task that we're thrilled to solve, including image and language understanding, joint reasoning across both modalities, and expanding abstract concepts into full imagery, which are of fundamental importance to reaching general intelligence.  Back
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8867
Streaming:
Download:
 
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Song Han (Stanford/Google/MIT)
We find 99.9 percent of the gradient exchange in distributed SGD is redundant, and we propose deep gradient compression (DGC) to greatly reduce the communication bandwidth and improve the scalability of distributed training. To preserve accuracy duri ...Read More
We find 99.9 percent of the gradient exchange in distributed SGD is redundant, and we propose deep gradient compression (DGC) to greatly reduce the communication bandwidth and improve the scalability of distributed training. To preserve accuracy during this compression, DGC employs four methods: momentum correction, local gradient clipping, momentum factor masking, and warm-up training. We have applied DGC to image classification, speech recognition, and language modeling with multiple datasets including Cifar10, ImageNet, Penn Treebank, and Librispeech Corpus. In all these scenarios, DGC achieves a gradient compression ratio from 270x to 600x without losing accuracy, cutting the gradient size of ResNet-50 from 97MB to 0.35MB, and for DeepSpeech from 488MB to 0.74MB. DGC enables large-scale distributed training on inexpensive commodity 1Gbps Ethernet and facilitates distributed training on mobile.  Back
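The heart of the method is local accumulation plus top-k selection; a Python sketch of one compression step based on the description above (local gradient clipping, momentum factor masking, and warm-up training are omitted):

    import torch

    def dgc_step(grad, velocity, ratio=0.001, momentum=0.9):
        """Accumulate the gradient into a local velocity buffer (momentum
        correction), transmit only the largest ~0.1% of entries, and keep
        the remainder as a residual for future steps."""
        velocity.mul_(momentum).add_(grad)
        flat = velocity.view(-1)
        k = max(1, int(flat.numel() * ratio))
        _, idx = flat.abs().topk(k)      # indices of the values to send
        values = flat[idx].clone()
        flat[idx] = 0.0                  # residual stays local
        return idx, values               # the only data exchanged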
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8607
Streaming:
 
Deep Learning for Recommender Systems
Justin Basilico (Netflix), Yves Raimond (Netflix)
In this talk, we will survey how Deep Learning methods can be applied to personalization and recommendations. We will cover why standard Deep Learning approaches don't perform better than typical collaborative filtering techniques. Then ...Read More

In this talk, we will survey how Deep Learning methods can be applied to personalization and recommendations. We will cover why standard Deep Learning approaches don't perform better than typical collaborative filtering techniques. Then we will go over recently published research at the intersection of Deep Learning and recommender systems, looking at how they integrate new types of data, explore new models, or change the recommendation problem statement. We will also highlight some of the ways that neural networks are used at Netflix and how we can use GPUs to train recommender systems. Finally, we will highlight promising new directions in this space.

  Back
 
Keywords:
AI and DL Research, Consumer Engagement and Personalization, Deep Learning and AI, GTC Silicon Valley 2018 - ID S81011
Streaming:
Download:
 
Efficient Communication Library for Large-Scale Deep-Learning
Minsik Cho (IBM Research)
We'll talk about the challenges in large-scale distributed, GPU-based deep learning, and propose an efficient communication algorithm to achieve state-of-the-art scalability. In detail, we'll explain various ways to speed up GPU-based deep learni ...Read More
We'll talk about the challenges in large-scale distributed, GPU-based deep learning, and propose an efficient communication algorithm to achieve state-of-the-art scalability. In detail, we'll explain various ways to speed up GPU-based deep learning, and motivate large-scale deep learning in the performance context. We'll then show that efficient communication is a grand challenge in large-scale deep learning, especially with more powerful upcoming GPUs such as the Volta-architecture Tesla V100. We'll present the technical details of the proposed communication algorithm, along with supporting data collected with more than 100 GPUs.  Back
 
Keywords:
AI and DL Research, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8479
Download:
 
Designing Human Centric Spaces with Holodeck and Machine Learning
Cobus Bothma (KPF), Xin Zhang (Kohn Pedersen Fox Associates)
The growth in density of housing in cities like London and New York has resulted in higher demand for efficient, smaller apartments. These designs challenge the use of space and function while trying to ensure the occupants have the perceptio ...Read More

The growth in density of housing in cities like London and New York has resulted in higher demand for efficient, smaller apartments. These designs challenge the use of space and function while trying to ensure the occupants have the perception of a larger space than provided. The process of designing these spaces has always been the responsibility and perception of a handful of designers using 2D and 3D static platforms as part of the overall building design and evaluation, typically constrained by a prescriptive program and functional requirements. A combination of human- and AI-based agents creating and testing these spaces through design and virtual immersive environments (NVIDIA Holodeck) will attempt to ensure the final results are efficient and best fit for human occupancy prior to construction.

  Back
 
Keywords:
AI and DL Research, Virtual Reality and Augmented Reality, GTC Silicon Valley 2018 - ID S8398
Streaming:
Download:
 
Learning Steering Bounds for Parallel Autonomy: Handling Ambiguity in End-to-End Driving
Alexander Amini (Massachusetts Institute of Technology)
End-to-end learning is a powerful new strategy for training neural networks from perception to control. While such systems have been shown to perform well for reactive control, the representation learned is not usable for higher-level decision mak ...Read More
End-to-end learning is a powerful new strategy for training neural networks from perception to control. While such systems have been shown to perform well for reactive control, the representation learned is not usable for higher-level decision making, such as navigation. We'll discuss the latest methodologies for training end-to-end systems for parallel autonomy, and demonstrate some of the shortcomings when such decision-making capability is needed.  Back
 
Keywords:
AI and DL Research, Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8605
Streaming:
Download:
 
Synthetic Data Generation for an All-in-One Driver Monitoring System
Sagar Bhokre (NVIDIA)
Driver monitoring systems are used to detect many driver attributes like gaze, head pose, eye openness, and other features pertaining to attention and assistance. We'll present a method of generating synthetic data for training DNNs that caters to ...Read More
Driver monitoring systems are used to detect many driver attributes like gaze, head pose, eye openness, and other features pertaining to attention and assistance. We'll present a method of generating synthetic data for training DNNs that caters to the above-mentioned features of the subject. We use Blender for generating synthetic images, powered by NVIDIA GPUs, which can be scaled to match training needs. Synthetic data generation allows precise control over data points that are difficult to control in a real environment, like pupil dilation. This approach avoids noisy measurements and results in high accuracy without the need for a high-precision 3D sensor.  Back
 
Keywords:
AI and DL Research, Autonomous Vehicles, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8324
Streaming:
Download:
 
Deep Learning For Intelligent Multi-Sensor Analytics
Kyle Muchmore (KickView), David Ohm (KickView)
Go beyond working with a single sensor and enter the realm of Intelligent Multi-Sensor Analytics (IMSA). We'll introduce concepts and methods for using deep learning with multi-sensor, or heterogeneous, data. There are many resources and ...Read More

Go beyond working with a single sensor and enter the realm of Intelligent Multi-Sensor Analytics (IMSA). We'll introduce concepts and methods for using deep learning with multi-sensor, or heterogeneous, data. There are many resources and examples available for learning how to leverage deep learning with public imagery datasets. However, few resources exist to demonstrate how to combine and use these techniques to process multi-sensor data. As an example, we'll introduce some basic methods for using deep learning to process radio frequency (RF) signals and make them part of your intelligent video analytics solutions. We'll also introduce methods for adapting existing deep learning frameworks to multiple sensor signal types (for example, RF, acoustic, and radar). We'll share multiple use cases and examples of leveraging IMSA in smart city, telecommunications, and security applications.

  Back
 
Keywords:
AI and DL Research, Intelligent Video Analytics and Smart Cities, Autonomous Machines, GTC Silicon Valley 2018 - ID S8260
Streaming:
Download:
 
Accelerating Scientific Simulation with Generative Adversarial Networks
Luke de Oliveira (Vai Technologies), Benjamin Nachman (Lawrence Berkeley National Laboratory), Michela Paganini (Yale University)
Many scientific and engineering fields increasingly rely on complex and time-consuming computational simulation as part of the modern scientific workflow. In many applications, such as High Energy Particle Physics, Cosmology, Geophysics, and others, ...Read More
Many scientific and engineering fields increasingly rely on complex and time-consuming computational simulation as part of the modern scientific workflow. In many applications, such as High Energy Particle Physics, Cosmology, Geophysics, and others, simulations are the computational bottleneck for producing and testing results. We introduce the use of Generative Adversarial Networks (GANs) as a potential tool for speeding up expensive theoretical models and simulations in scientific and engineering applications, ushering in a new era of deep learning-powered scientific discovery. We will show that using a GAN-based High Energy Physics fast simulator on GPUs can provide speedups of up to 100,000x when compared to traditional simulation software, while retaining high levels of precision. Finally, we will discuss modeling and architectural considerations in this domain, with the hope of directly empowering scientists and engineers in other fields to experiment with Generative Adversarial Networks in order to speed up simulation across scientific domains.  Back
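As a schematic of the setup (a toy, not the architecture from the talk), the generator learns to emit simulator-like outputs directly from noise, after which sampling is a single forward pass on the GPU:

    import torch
    import torch.nn as nn

    # Toy fast-simulator GAN: G maps noise to a flattened "shower image";
    # D scores real simulator output against generated output.
    G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                      nn.Linear(256, 28 * 28), nn.Tanh())
    D = nn.Sequential(nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
                      nn.Linear(256, 1))
    bce = nn.BCEWithLogitsLoss()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

    def train_step(real):
        ones = torch.ones(real.size(0), 1)
        zeros = torch.zeros(real.size(0), 1)
        fake = G(torch.randn(real.size(0), 64))
        # Discriminator: separate real simulation output from fakes.
        opt_d.zero_grad()
        (bce(D(real), ones) + bce(D(fake.detach()), zeros)).backward()
        opt_d.step()
        # Generator: make fakes indistinguishable from simulation.
        opt_g.zero_grad()
        bce(D(fake), ones).backward()
        opt_g.step()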
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), HPC and AI, GTC Silicon Valley 2018 - ID S81001
Streaming:
Download:
 
Deep Reinforcement Learning for Real-World Robotic Manipulation
Tuomas Haarnoja (UC Berkeley)
Deep reinforcement learning (deep RL) has emerged as a promising direction for autonomous acquisition of complex behaviors due to its ability to process complex sensory input and to acquire elaborate behavior skills, using general-purpose neural netw ...Read More
Deep reinforcement learning (deep RL) has emerged as a promising direction for autonomous acquisition of complex behaviors due to its ability to process complex sensory input and to acquire elaborate behavior skills, using general-purpose neural network representations. Since learning expressive function approximators requires large quantities of data, deep RL has been mostly applied to simulated domains, such as video games and simulated robotic locomotion and manipulation tasks, where the data collection can occur faster than real time and be trivially parallelized. We'll address techniques that have been proposed to enable deep RL for real-world robotics, and discuss how the maximum-entropy principle can be leveraged to reduce the required amount of real-world interaction.  Back
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8603
Streaming:
Download:
 
Generate Neural Network Automatically with High Accuracy and High Efficiency
Chao Xue (IBM)
Designing neural network architectures is critical for deep learning applications, but it is complex and depends on AI experts. We'll demonstrate how to construct neural networks automatically, without human intervention. Th ...Read More
Designing neural network architectures is critical for deep learning applications, but it is complex and depends on AI experts. We'll demonstrate how to construct neural networks automatically, without human intervention. There are two fundamental limiters on the performance of auto-generated neural networks: accuracy and efficiency, the latter constrained by search overhead. We'll explore new techniques to make auto-generated neural networks accurate and efficient, including: an end-to-end technology to construct neural networks within reinforcement learning; an adaptive random search and Bayesian optimization framework for different AI domains, such as computer vision, IoT acoustics, NLP, and finance; the use of historical knowledge bases to reduce search overhead; and the scheduling of search tasks over multiple NVIDIA GPUs to speed up the search process. We'll also give both theoretical analysis and experimental results, which show significant improvements in accuracy and substantial reductions in search time.  Back
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8234
Streaming:
Download:
 
GPU Accelerated Sequence Learning for Action Recognition
Yemin Shi (Peking University)
We'll introduce several approaches to modeling long-term sequence dependence to help improve action recognition performance. First, we'll introduce a fusion of deep and hand-crafted features to demonstrate their complementarity. ...Read More
We'll introduce several approaches to modeling long-term sequence dependence to help improve action recognition performance. First, we'll introduce a fusion of deep and hand-crafted features to demonstrate their complementarity. We'll also introduce an attention model to illustrate the effectiveness of the attention mechanism for action recognition. We'll then introduce shuttleNet, a biologically inspired neural network. Finally, we'll present further exploratory experiments on action recognition that suggest potential research directions.  Back
 
Keywords:
AI and DL Research, Computer Vision, Video and Image Processing, GTC Silicon Valley 2018 - ID S8229
Streaming:
Download:
 
The Future of the In-Car Experience
Abdelrahman Mahmoud (Affectiva), Ashutosh Sanan (Affectiva)
As the race to full autonomy accelerates, the in-cab transportation experience is also being redefined. Future vehicles will sense the passengers' identities and activities, as well as their cognitive and emotional states, to adapt and ...Read More

As the race to full autonomy accelerates, the in-cab transportation experience is also being redefined. Future vehicles will sense the passengers' identities and activities, as well as their cognitive and emotional states, to adapt and optimize their experience. AI capable of interpreting what we call "people analytics," captured through facial and vocal expressions together with aspects of the surrounding context, will power these advances. We'll give an overview of our Emotion AI solution, and describe how we employ techniques like deep learning-based spatio-temporal modeling. By combining these techniques with a large-scale dataset, we can develop AI capable of redefining the in-cab experience.

  Back
 
Keywords:
AI and DL Research, NVIDIA Inception Program, Deep Learning and AI Frameworks, Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8758
Streaming:
Download:
 
Smart City: Deep Learning Model for Car-Pedestrian Interaction
Zoran Kostic (Columbia University)
In this talk we'll discuss how Columbia University, in partnership with the NYC government, is using deep learning and GPUs to develop smart-city traffic management that supports the navigation and movement of multitudes of vehicles (including autonomous cars) in dense urban environments with many pedestrians. We'll describe our work on real-time tracking of cars and pedestrians and on predicting movement from historical observations of an intersection, backed by ultra-low-latency wireless communications and edge computing nodes.
 
Keywords:
AI and DL Research, Intelligent Video Analytics and Smart Cities, Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8201
Streaming:
 
Differentiable Tree Planning for Deep Reinforcement Learning
Gregory Farquhar (University of Oxford)
We'll discuss recent research in deep reinforcement learning (RL), with a focus on applying intuitions from planning to neural network architectures for deep RL. Planning in complex visual environments has thus far been held back by the difficulty of learning accurate predictive models. To address this, we embedded a model inside a differentiable, dynamically constructed tree-planning architecture, so that we learn a model that is effective when used within that planner. We'll share our work on developing these architectures, as well as our approaches to various technical obstacles associated with the efficient optimization of deep tree-structured models on GPUs.
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8787
Streaming:
Download:
 
Training Deep AutoEncoders for Collaborative Filtering
Oleksii Kuchaiev (NVIDIA)
This session will describe an approach to building personalized recommendations using (very) deep autoencoders. We'll explore the effects of different activation functions, network depth, and novel algorithmic approaches. The model is trained end-to-end without any layer-wise pre-training, and our PyTorch-based code is publicly available.
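As a rough sketch of the idea (sizes and activations here are illustrative assumptions, not the session's published code): the autoencoder takes a user's sparse rating vector as input and is trained to reconstruct it, so the reconstruction fills in predictions for unrated items.

import torch
import torch.nn as nn

n_items = 1000
model = nn.Sequential(                         # encoder-decoder, trained end-to-end
    nn.Linear(n_items, 512), nn.SELU(),
    nn.Linear(512, 128), nn.SELU(),            # bottleneck
    nn.Linear(128, 512), nn.SELU(),
    nn.Linear(512, n_items),
)

ratings = torch.zeros(64, n_items)             # toy batch: mostly-unrated users
observed = torch.rand(64, n_items) < 0.05
ratings[observed] = torch.randint(1, 6, (int(observed.sum()),)).float()

pred = model(ratings)
# Masked MSE: compute the loss only on items the user actually rated,
# so unobserved entries never penalize the reconstruction.
loss = ((pred - ratings)[observed] ** 2).mean()
loss.backward()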
 
Keywords:
AI and DL Research, Consumer Engagement and Personalization, GTC Silicon Valley 2018 - ID S8212
Streaming:
Download:
 
GAN Fashion Photo Shoot: Garment to Model Images Using Conditional GANs
Costa Colbert (MAD Street Den, Inc./ VUE.ai)
Learn how VUE.ai's model generator uses conditional GANs to produce product-specific images suitable for replacing photographs in catalogs. We'll present networks that generate images of fashion models wearing specific garments, using an image of the garment as a conditioning variable. We'll address network architecture variants, training, and the manipulation of latent variables to control attributes such as model pose, build, or skin color.
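The core conditioning mechanism can be sketched as follows (a minimal illustration with invented layer sizes; the speakers' architecture is their own): a generator that concatenates a latent noise vector with an embedding of the garment image.

import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    def __init__(self, z_dim=64, cond_dim=128):
        super().__init__()
        self.garment_enc = nn.Sequential(       # embeds the garment photo
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, cond_dim),
        )
        self.net = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * 32 * 32), nn.Tanh(),  # toy 32x32 output image
        )

    def forward(self, z, garment):
        cond = self.garment_enc(garment)        # conditioning variable
        out = self.net(torch.cat([z, cond], dim=1))
        return out.view(-1, 3, 32, 32)

g = CondGenerator()
img = g(torch.randn(4, 64), torch.randn(4, 3, 64, 64))
print(img.shape)  # torch.Size([4, 3, 32, 32])

Varying z while holding the garment embedding fixed is what allows attributes like pose or build to be manipulated independently of the garment.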
 
Keywords:
AI and DL Research, NVIDIA Inception Program, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8776
Streaming:
Download:
 
Learning Affinity via Spatial Propagation Networks
Sifei Liu (NVIDIA)
We present a unified framework for learning affinity in a purely data-driven fashion using a linear propagation structure. This is a GPU- and deep-learning-friendly pairwise learning module that does not require solving linear equations, iterative inference, or manually defined kernels. Specifically, we develop a three-way connection for the linear propagation model, which formulates a sparse transformation matrix, where all elements can be output from a deep CNN, but which results in a dense affinity matrix that effectively models any task-specific pairwise similarity matrix. The spatial propagation network can be applied to many affinity-related tasks, such as image matting, segmentation, and colorization, to name a few. Essentially, the model can learn semantically aware affinity relations for high-level vision tasks thanks to the powerful learning capability of deep CNNs. We validate the framework on the task of refining image segmentation boundaries. Experiments on face parsing and semantic segmentation tasks show that the spatial propagation network provides a general, effective, and efficient solution for generating high-quality segmentation results.
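A toy 1D version of the linear propagation structure (an assumption for exposition; the paper's scheme is a three-way 2D scan) shows why no linear-system solve is needed: each element blends the input with the previous hidden state through a learned, per-position gate.

import torch

x = torch.randn(8)                   # input row (e.g., coarse segmentation scores)
w = torch.sigmoid(torch.randn(8))    # per-position gates, normally produced by a CNN
h = torch.zeros(8)
prev = torch.tensor(0.0)
for t in range(8):
    # h[t] = (1 - w[t]) * x[t] + w[t] * h[t-1]  -- a row-wise linear recurrence
    prev = (1 - w[t]) * x[t] + w[t] * prev
    h[t] = prev
print(h)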
 
Keywords:
AI and DL Research, Computer Vision, Video and Image Processing, GTC Silicon Valley 2018 - ID S8312
Streaming:
Download:
 
Scaling Deep Learning for Immersive User Interfaces
Joel Hestness (Baidu Research)
Deep learning advances follow a virtuous recipe: model architecture search, creating large training datasets, and scaling computation. Baidu Research's Silicon Valley AI Lab develops state-of-the-art conversational user interfaces following this recipe. We research new model architectures and features for speech recognition (Deep Speech 3), speech generation (Deep Voice 3), and natural language processing. To deploy these models in impactful products, we want a deep understanding of how the recipe's components coordinate to drive accuracy improvements. Through large-scale empirical studies, we find intriguing results about how deep learning is likely to scale: as training set size increases, DL generalization error and model size scale as particular power-law relationships; and for a fixed dataset size, training time remains roughly constant as model size grows, because larger models require fewer steps to converge to the same accuracy. These scaling relationships have significant implications for DL research, practice, and systems: they can assist model debugging, setting accuracy targets, and decisions about dataset growth and future computing system design.
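A hedged numeric illustration of the power-law relationship the talk describes, with made-up constants: if generalization error falls as error = a * m**(-b) with training set size m, fitting a line in log-log space recovers the exponent.

import numpy as np

m = np.array([1e4, 1e5, 1e6, 1e7])          # training set sizes
err = 2.0 * m ** -0.35                       # synthetic errors lying on a power law
# Slope of the log-log fit is the scaling exponent -b.
b, log_a = np.polyfit(np.log(m), np.log(err), 1)
print(f"fitted exponent: {b:.3f}")           # ~ -0.35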
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8899
Streaming:
Download:
 
Synthetic Facial Data for Training Deep Neural Networks
Shalini De Mello (NVIDIA)
Training AI agents that can successfully generalize requires large amounts of diverse labeled training data. Collecting and labeling data is a significant cost in the development of AI applications and, in some cases, may not even be feasible. We'll describe computer graphics facial models that we are developing to generate large labeled synthetic facial datasets for training deep neural networks. Facial analysis is central to many vision applications that involve human-computer interaction, including robotics, autonomous cars, rehabilitation, and extended usability. Generating and animating human faces with high realism is a well-studied problem in computer graphics; however, very few computer vision AI techniques take advantage of rendered facial data to augment or replace manually collected training data. We'll share key insights into how we successfully use synthetic facial data to train facial analysis classifiers, and we'll demonstrate many sub-tasks on which synthetic data significantly improves accuracy and reduces the need for manual data collection.
 
Keywords:
AI and DL Research, Intelligent Video Analytics and Smart Cities, GTC Silicon Valley 2018 - ID S8794
Streaming:
 
De Novo Drug Design using Artificial Intelligence
Olexandr Isayev (University of North Carolina)
We propose a novel computational strategy, based on deep and reinforcement learning techniques, for the de novo design of molecules with desired properties. This strategy integrates two deep neural networks, one generative and one predictive, to generate novel chemical structures with the desired properties. In the first phase of the method, the generative and predictive models are trained separately with supervised learning algorithms. In the second phase, both models are trained jointly with a reinforcement learning approach that biases newly generated chemical structures toward those with the desired physical and biological properties. In this proof-of-concept study, we employed this strategy to design chemical libraries biased toward compounds with maximal, minimal, or specific ranges of physical properties, such as melting point and hydrophobicity, as well as to develop novel putative inhibitors of JAK2. This approach can find general use in generating targeted chemical libraries optimized for a single desired property or for multiple properties.
 
Keywords:
AI and DL Research, Computational Biology and Chemistry, GTC Silicon Valley 2018 - ID S8254
Streaming:
Download:
 
Towards Learning to Imagine Videos with Controlled Content
Sergey Tulyakov (Snap Inc.)
We'll discuss one of the first attempts to teach computers to imagine, or generate, videos with controlled content using deep generative modeling techniques. We assume that the visual information in a natural video can be decomposed into two major components: content and motion. While content encodes the objects present in the video, motion encodes the object dynamics. Based on this prior, we propose the motion and content decomposed generative adversarial network (MoCoGAN) framework for video generation. The framework generates a video clip by sequentially mapping random noise vectors to video frames. We divide each random noise vector into a content part and a motion part; by controlling these parts, we control both the content of the video and the action being performed. We present quantitative and qualitative analyses on several video datasets, including artificial shape motion, facial expression, and tai-chi videos.
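The noise decomposition can be sketched in a few lines (simplified from the paper; the layer sizes and frame decoder here are invented): one content code is held fixed for the whole clip, while a recurrent network turns per-frame noise into a motion trajectory.

import torch
import torch.nn as nn

z_c_dim, z_m_dim, frames = 32, 16, 8
rnn = nn.GRU(z_m_dim, z_m_dim, batch_first=True)
frame_gen = nn.Linear(z_c_dim + z_m_dim, 3 * 16 * 16)   # toy frame decoder

z_content = torch.randn(1, z_c_dim).repeat(frames, 1)   # same content every frame
eps = torch.randn(1, frames, z_m_dim)                   # per-frame noise
z_motion, _ = rnn(eps)                                  # motion trajectory over time
video = frame_gen(torch.cat([z_content, z_motion.squeeze(0)], dim=1))
print(video.view(frames, 3, 16, 16).shape)

Holding z_content fixed while re-sampling the motion noise changes the action but not the subject; the reverse changes the subject performing the same action.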
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), Computer Vision, GTC Silicon Valley 2018 - ID S8477
Streaming:
 
Quick and Easy DL Workflow Proof of Concept
Alec Gunny (NVIDIA), Kenneth Hester (NVIDIA), Jeffrey Weiss (NVIDIA)
Spin up a deep learning (DL) proof of concept on a budget. We'll walk you through a DL workflow in the cloud leveraging DIGITS, then download a trained model and run inference on a Jetson TX2. This session considers multiple options, such as Nimbix, AMI, and NGC on Tesla P100, Tesla V100, and NVIDIA DGX-1 servers. The tutorial will combine lecture, live demos, and detailed instructions.
 
Keywords:
AI and DL Research, Accelerated Analytics, GTC Silicon Valley 2018 - ID S8286
Download:
 
Fully Context-Aware Video Prediction
Wonmin Byeon (NVIDIA)
We'll discuss the development of a novel model for video prediction and analysis: the parallel multi-dimensional long short-term memory (PMD-LSTM). PMD-LSTM is a general model for learning from higher-dimensional data such as images, videos, and biomedical scans. It extends the popular LSTM recurrent neural network to higher-dimensional data, with a rearrangement of the recurrent connections that dramatically increases parallelism. This gives the network the ability to compactly model the effect of long-range context in each layer, unlike convolutional networks, which need several layers to cover a larger input context. We'll discuss the blind-spot problem in recent work on video prediction and show how PMD-LSTM-based models are fully context-aware for each predicted pixel. These models significantly outperform comparatively complex state-of-the-art approaches in a variety of challenging video prediction scenarios, such as car driving, human motion, and diverse human actions.
 
Keywords:
AI and DL Research, NVIDIA Inception Program, Computer Vision, GTC Silicon Valley 2018 - ID S8713
Streaming:
Download:
 
SpaceNet: Accelerating Automated Feature Extraction for Satellite Imagery - Two Years, Four Competitions in the Making
Todd M. Bacastow (Radiant Solutions), David Lindenbaum (CosmiQ Works, and IQT Lab)
We'll present the results of the 2017-2018 SpaceNet Challenges, preview future SpaceNet Challenges, and show how developers can access open, labeled satellite-image training data through SpaceNet on AWS. To date, three SpaceNet Challenges have been designed to apply computer vision techniques to satellite imagery, examining building footprint extraction, road network extraction, and off-nadir object detection. SpaceNet on AWS is an online repository of openly available satellite imagery and co-registered map data that developers and data scientists can access to train algorithms for research. This first-of-its-kind open innovation project for the geospatial industry launched in August 2016 as a collaboration between AWS, CosmiQ Works, DigitalGlobe, and NVIDIA. The SpaceNet Roads Challenge, launched in November, builds on labeled training datasets of building footprints across Khartoum, Las Vegas, Paris, and Shanghai by providing over 8,000 km of mapped road networks. It uses a novel metric, motivated by graph theory, that focuses competitors on routing rather than just static road-pixel identification.
 
Keywords:
AI and DL Research, GIS, GTC Silicon Valley 2018 - ID S8553
Streaming:
 
Training Neural Networks with Mixed Precision: Theory and Practice
Paulius Micikevicius (NVIDIA)
We'll cover the theory and practice of training DNNs with Tensor Cores, introduced for AI processing with the Volta GPU architecture. Tensor Cores provide up to 120 teraflops of throughput, mixing operations on IEEE half- and single-precision floats. In the theory portion of the talk, we'll review the half-precision format, the values that arise in DNN computations, and the techniques that maximize use of the fp16 format for these values, including loss scaling, master weights, and choosing the proper precision for a given operation. In the practice portion, we'll survey various models that have been trained in mixed precision, matching the accuracy of fp32 training sessions while using the same hyperparameters. These models span various architectures (feed-forward, recurrent, generative) and diverse tasks (image, speech, and language processing). We'll also provide network design and training guidelines for maximizing speed when using Tensor Cores.
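The loss-scaling and master-weights techniques mentioned above can be sketched with plain PyTorch tensors (a minimal sketch, not NVIDIA's reference code; the scale factor and learning rate are arbitrary): small gradients that would underflow in fp16 are preserved by scaling the loss up before backprop and unscaling the gradients in fp32 afterward.

import torch

scale = 1024.0                                   # loss-scale factor
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32  # fp16 math needs a GPU

w_master = torch.randn(10, device=device, requires_grad=True)  # fp32 master weights
x = torch.randn(32, 10, device=device)

w_low = w_master.detach().to(dtype).requires_grad_()  # low-precision compute copy
loss = ((x.to(dtype) @ w_low) ** 2).mean()
(loss * scale).backward()                        # scaled backward pass

with torch.no_grad():
    # Unscale in fp32, then update the fp32 master weights.
    w_master -= 1e-3 * (w_low.grad.float() / scale)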
 
Keywords:
AI and DL Research, Algorithms and Numerical Techniques, GTC Silicon Valley 2018 - ID S8923
Streaming:
Download:
 
Deep Learning Applications in Text and Graphics at NVIDIA
Bryan Catanzaro (NVIDIA)
At NVIDIA, we're busy applying deep learning to diverse problems, and this talk will give an overview of a few of these applications. We'll discuss our resume-matching system, which helps match candidates to job openings at NVIDIA, as well as an open-source sentiment analysis project trained on unsupervised text that is improving our marketing capabilities. We'll also discuss a blind image-quality metric that we're using to lower the cost of ray tracing photorealistic graphics, and a generative model that we've built to create realistic graphics from simplistic sketches.
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8672
Streaming:
 
Unleashing the Imagination: Combining Systems+Software Innovation with GPUs to Create New Capabilities
Hillery Hunter (IBM)
AI is one of the most rapidly evolving areas of computer science today, and data scientists are constantly pushing the boundaries of the possible, wanting to explore new data types, new algorithms, and diverse and heterogeneous models. In this talk we'll explore two key productivity factors for data science: first, speed, and the ability to explore many models and sets of data quickly; and second, the ability to explore broad types of models, incorporating both machine learning and deep learning. We'll present productivity gains of 40x and 50x achieved through system+software co-design and novel algorithms that leverage Power Systems and GPUs for both deep learning and key areas of classical machine learning. System+software co-design and co-optimization can yield dramatic efficiency improvements, enable the creation of large models and the exploration of large datasets, and free data scientists to focus on the fundamental science of deep and machine learning: gaining accuracy, functionality, and generalizability in their models.
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S81025
Streaming:
 
Overcoming Missing Modalities in Remote Sensing
Benjamin Bischke (German Research Center for Artificial Intelligence (DFKI)), Damian Borth (German Research Center for Artificial Intelligence (DFKI))
Recent advances in earth observation are opening up an exciting new area for the exploration of satellite image data. We'll teach you how to analyze this new data source with deep neural networks. Focusing on emergency response, you'll learn how to apply deep neural networks for semantic segmentation of satellite imagery. We'll specifically focus on multimodal segmentation and the challenge of overcoming missing modality information at inference time. Registrants are assumed to be familiar with the fundamentals of deep neural networks.
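One common way to build robustness to a missing modality, shown here as a hedged illustration (an assumption for exposition, not necessarily the speakers' method), is modality dropout: randomly zeroing one input modality during training so the network learns not to depend on it exclusively.

import torch
import torch.nn as nn

rgb = torch.randn(4, 3, 64, 64)      # optical channels
dem = torch.randn(4, 1, 64, 64)      # a second modality, e.g. elevation

if torch.rand(1).item() < 0.3:       # modality dropout during training
    dem = torch.zeros_like(dem)      # simulate the modality missing at test time

net = nn.Conv2d(4, 2, 3, padding=1)  # toy fusion head over concatenated inputs
seg_logits = net(torch.cat([rgb, dem], dim=1))
print(seg_logits.shape)              # (4, 2, 64, 64): per-pixel class scores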
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8596
Streaming:
Download:
 
Adaptive Ray Tracing Rendering Powered by Deep Learning
Andrew Tao (NVIDIA), Carsten Waechter (NVIDIA)
This session will present a proof of concept in which a deep neural network was trained on pairs of Iray ray-traced images (one at an arbitrary ray-tracing iteration count and one fully converged) and their structural similarity index (SSIM). Originally conceived as a method for measuring the similarity between two images, the SSIM index can also be viewed as a quality measure against a reference image or, in our case, as a measure of ray-tracing rendering progress. From any render iteration of an arbitrary scene, the DNN can then infer a rendering-progress estimate, and it also produces heat maps of the scene that can be used for adaptive rendering, focusing the ray-tracing engine's power on the appropriate zones.
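The training-target side of this setup can be sketched as follows (a hedged illustration using scikit-image; the images here are synthetic stand-ins, not Iray renders): SSIM between a partially converged render and the fully converged reference serves as the "progress" label the network learns to predict.

import numpy as np
from skimage.metrics import structural_similarity

reference = np.random.rand(128, 128)                  # fully converged render (stand-in)
noisy = reference + 0.1 * np.random.randn(128, 128)   # early-iteration render (stand-in)

progress = structural_similarity(reference, noisy, data_range=1.0)
print(f"SSIM as a rendering-progress estimate: {progress:.3f}")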
 
Keywords:
AI and DL Research, Graphics and AI, Rendering and Ray Tracing, GTC Silicon Valley 2018 - ID S8788
Streaming:
Download:
 
SSD++: Boosting Performance of Single-Shot MultiBox Detection Using Convolution Autoencoders
Vijay Gabale (Huew)
We'll showcase how you can apply a wealth of unlabeled image data to significantly improve the accuracy and speed of single-shot object detection (SSD) techniques. Our approach, SSD++, advances the state of the art in single-shot multibox object detectors (such as SSD and YOLO) by employing a novel combination of convolution-deconvolution networks to learn robust feature maps from unlabeled data, and a fresh approach that fuses convolution and deconvolution features to combine generic as well as semantically rich feature maps. As a result, SSD++ drastically reduces the requirement for labeled datasets, works on low-end GPUs, identifies small as well as large objects with high fidelity, and speeds up inference by reducing the number of default boxes required. SSD++ achieves state-of-the-art results on the PASCAL VOC and MS COCO datasets. Through an ablation study, we'll explain how the different components of our architecture contribute to the improved accuracy on these datasets. We'll also show a case study of using SSD++ to identify shoppable objects in the fashion, home decor, and food industries from images in the wild.
 
Keywords:
AI and DL Research, NVIDIA Inception Program, Computer Vision, Video and Image Processing, GTC Silicon Valley 2018 - ID S8159
Streaming:
 
Deep Learning for Dialogue Systems
Yun-Nung (Vivian) Chen (National Taiwan University)
Learn how to apply deep learning technologies to build robust and scalable dialogue systems, with a deeper understanding of the classic pipelines and detailed knowledge of how prior models benchmark. We'll start with an overview of dialogue research, then let the audience dive deep into state-of-the-art work on neural language understanding, dialogue management, and language generation, moving toward end-to-end neural dialogue systems.
 
Keywords:
AI and DL Research, Speech and Language Processing, GTC Silicon Valley 2018 - ID S8542
Streaming:
Download:
 
IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification
Pavlo Molchanov (NVIDIA)
Deep residual networks (ResNets) enabled a recent breakthrough in deep learning. The core idea of ResNets is shortcut connections between layers, which allow the network to be much deeper while remaining easy to optimize, avoiding vanishing gradients. These shortcut connections have interesting properties that make ResNets behave differently from other typical network architectures. In this talk we'll use these properties to design a network based on a ResNet but with parameter sharing and adaptive computation time, which we call IamNN. The resulting network is much smaller than the original and can adapt its computational cost to the complexity of the input image. We'll provide an overview of ways to design compact networks, survey ResNet properties, and discuss how they can be used to design a compact dense network with only 5M parameters for ImageNet classification.
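The two ideas the talk combines can be sketched together (a hedged toy with invented sizes and threshold, not the authors' code): one residual block whose weights are reused across iterations, plus a halting gate so easy inputs exit after fewer steps.

import torch
import torch.nn as nn

block = nn.Sequential(nn.Linear(64, 64), nn.ReLU())  # one shared residual block
halt_gate = nn.Linear(64, 1)                          # predicts "stop here"

h = torch.randn(1, 64)
for step in range(10):                # the same weights are applied repeatedly
    h = h + block(h)                  # residual update
    p_halt = torch.sigmoid(halt_gate(h))
    if p_halt.item() > 0.9:           # adaptive computation time: exit early
        break
print("steps used:", step + 1)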
 
Keywords:
AI and DL Research, GTC Silicon Valley 2018 - ID S8456
Streaming:
Download:
 
Multimodal Memory Modelling for Video Captioning
Yan Huang (Institute of Automation, Chinese Academy of Sciences)
This talk presents a novel framework, the multimodal memory model, for video captioning. It builds a visual and textual shared memory to model long-term visual-textual dependency and further guides visual attention to the described visual targets to solve visual-textual alignment. Video captioning, which automatically translates video clips into natural language sentences, is a very important task in computer vision, and it has made great progress by virtue of recent deep learning technologies. However, learning an effective mapping from the visual sequence space to the language space remains challenging because of long-term multimodal dependency modeling and semantic misalignment. Inspired by the facts that memory modeling poses potential advantages for long-term sequential problems and that working memory is a key factor in visual attention, the proposed model attaches an external memory to store and retrieve both visual and textual content, interacting with the video and sentence through multiple read and write operations.
 
Keywords:
AI and DL Research, Computer Vision, Video and Image Processing, GTC Silicon Valley 2018 - ID S8311
Streaming:
Download:
 
Getting Started with TensorFlow on GPUs
Hans Hyttsten (Google)
Want to get started using TensorFlow together with GPUs? Then come to this session, where we'll cover the TensorFlow APIs you should use to define and train your models, along with best practices for distributing training workloads across multiple GPUs. We'll also look at the underlying reasons why GPUs are so effective for machine learning workloads.
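As a present-day sketch of multi-GPU training in TensorFlow (a hedged example using tf.distribute.MirroredStrategy; the APIs covered in the 2018 session may differ):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()      # replicates the model on each GPU
print("replicas:", strategy.num_replicas_in_sync)

with strategy.scope():                           # variables created here are mirrored
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

x = tf.random.normal((256, 10))
y = tf.random.normal((256, 1))
model.fit(x, y, epochs=1, batch_size=64)         # each batch is split across replicas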
 
Keywords:
AI and DL Research, Deep Learning and AI, Developer Talk, GTC Silicon Valley 2018 - ID S8946
Streaming:
Download:
 
Sim2Real Visual Robotic Servoing for Navigation and Manipulation via Deep Reinforcement Learning
Fereshteh Sadeghi (University of Washington)
Humans are remarkably proficient at controlling their limbs and tools from a wide range of viewpoints, in diverse environments, and in the presence of distractors. In robotics, this ability is referred to as visual servoing. Standard visual servoing approaches generalize poorly because they typically rely on manually designed features and a calibrated camera. We exhibit generalizable visual servoing in the context of robotic manipulation and navigation tasks, learned through visual feedback and deep reinforcement learning (RL) without any calibrated setup. By heavily randomizing our simulator, we train policies that generalize to novel environments and even to challenging real-world scenarios. Our domain randomization technique addresses the high sample complexity of deep RL, avoids the dangers of physical trial and error, and gives us the liberty to learn recurrent vision-based policies for highly diverse tasks where capturing sufficient real robot data is impractical. One example is learning view-invariant robotic policies, which leads to learning physical embodiment and self-calibration purely through visual feedback.
 
Keywords:
AI and DL Research, IoT, Robotics & Drones, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8955
Streaming:
 
Investigating Data Augmentation Strategies for Advancing Deep Learning Training
Winston Hsu (National Taiwan University)
We've seen the huge success of the deep learning paradigm and its superhuman capability on numerous benchmarks in image, video, audio, and text. However, adopting these methods in industrial applications poses huge challenges, mainly due to the lack of quality training data: neural networks have enormous numbers of parameters and require correspondingly large amounts of quality training data. We'll investigate data augmentation strategies for increasing quality training data for robust inference across different learning problems, mainly in image, video, 3D, and IoT data streams. We'll first quantify the importance of training data for deep neural networks, then review numerous strategies, such as crawling from the web, utilizing generative models, 3D computer graphics, augmented reality, engagement in social media, and gaming, and compare their effectiveness. Because such data is often taken from other domains, we also need to deal with the cross-domain learning problem. We'll provide detailed insights from our recent work published in top conferences (e.g., CVPR, ICCV, AAAI) and from cases in industrial applications.
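The most basic strategy in this space can be shown in a few lines (a hedged torchvision sketch; the speakers' pipelines draw on far more data sources than transforms): synthesizing additional training samples with label-preserving transforms.

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])
# Applying `augment` to each PIL image at load time yields a different
# variant every epoch, effectively enlarging the training set.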
 
Keywords:
AI and DL Research, Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8391
Streaming:
Download:
 
Tensor Layers for Compression of Deep Learning Networks
Cristopher Cecka (NVIDIA)
We'll review recent efforts to compress the fully connected layers of machine learning models via tensor networks, including the Tensor Train format, the Tensor Contraction Layer, the Tensor Regression Layer, and the Tensor Ring decomposition. These decompositions, in supplementing or replacing fully connected layers, dramatically reduce the number of parameters required by the network without resorting to sparsity and without loss in accuracy: we've shown 55-80 percent compression of an entire network with less than one percent loss in accuracy. Tensor layers can be used in end-to-end training, fine-tuning, and transfer learning by initializing the decomposition from a pre-trained fully connected layer. Furthermore, because the forward and backward passes of the network rely on dense tensor contractions, these methods retain high computational intensity and can be evaluated efficiently on GPUs.
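A hedged arithmetic sketch of why tensorized layers shrink parameter counts (the mode shapes and rank here are chosen purely for illustration): factor a 4096x4096 dense layer as a chain of small cores in a Tensor Train-style decomposition and count parameters. Note the per-layer saving in this toy is far larger than the whole-network 55-80 percent figure above, since a full network contains more than fully connected layers.

import numpy as np

in_modes, out_modes, rank = (8, 8, 8, 8), (8, 8, 8, 8), 16
dense_params = np.prod(in_modes) * np.prod(out_modes)   # 4096 * 4096 weights

tt_params = 0
r_prev = 1
for k, (i, o) in enumerate(zip(in_modes, out_modes)):
    # Core k has shape (r_prev, i, o, r_next); boundary ranks are 1.
    r_next = rank if k < len(in_modes) - 1 else 1
    tt_params += r_prev * i * o * r_next
    r_prev = r_next

print(dense_params, tt_params, f"saving: {1 - tt_params / dense_params:.2%}")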
 
Keywords:
AI and DL Research, Algorithms and Numerical Techniques, HPC and AI, GTC Silicon Valley 2018 - ID S8807
Streaming:
Download:
 
Accelerating Cancer Research with Deep Learning
Fernanda Foertter (NVIDIA)
The Department of Energy (DOE) entered into a partnership with the National Cancer Institute (NCI) of the National Institutes of Health (NIH) to accelerate cancer research. This "Cancer Moonshot" aims to tackle three main objectives: better understand the mechanisms of cancer, use large amounts of diverse medical data to build predictive models, and enable precision medicine by providing treatment guidance for individual patients. Leveraging DOE's expertise in high performance computing (HPC) and new methods for deep learning in artificial intelligence, this HPC+AI approach aims to create a single scalable deep neural network code called CANDLE (CANcer Distributed Learning Environment) that will be used to address all three challenges. This talk gives an overview of the project and highlights how GPU-accelerated systems in the DOE ecosystem, Summit and Sierra, have contributed to it.
 
Keywords:
AI and DL Research, HPC and AI, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S81033
Streaming:
 
Multi-Resolution 3D-Convolutional Neural Network for Object Recognition
Sambit Ghadai (Iowa State University), Adarsh Krishnamurthy (Iowa State University)
Voxelized representations of 3D objects are commonly used to train 3D convolutional neural networks for object detection and classification. However, high-resolution voxelization of CAD models is memory intensive, making it impossible to load multiple models on the GPU for training. We've developed a GPU-accelerated voxelization technique that generates multi-level voxel grids of 3D objects: instead of creating a single high-resolution voxel grid for the whole object, it generates selective, region-based high-resolution voxel grids to represent detailed features of the object. We've also developed a multi-resolution 3D convolutional neural network that uses this hybrid voxelization for accurate object recognition and classification.
 
Keywords:
AI and DL Research, Industrial Inspection, Computer Vision, GTC Silicon Valley 2018 - ID S8389
Streaming:
Download:
 
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
Ting-Chun Wang (NVIDIA)
We'll present a new method for synthesizing high-resolution, photorealistic images from semantic label maps using conditional generative adversarial networks. Conditional GANs have enabled a variety of applications, but their results are often low-resolution and still far from realistic. We'll show that we can generate visually appealing 2048x1024 results with a novel adversarial loss and new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing or adding objects and changing an object's category. Second, we propose a method to generate diverse results from the same input, allowing users to edit object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
 
Keywords:
AI and DL Research, Graphics and AI, GTC Silicon Valley 2018 - ID S8918
Streaming:
 
GPU Performance Testing and PowerAI on IBM Cloud (Presented by IBM Cloud)
Alex Hudak (IBM), Brian Wan (IBM)
In this session, you'll learn about the latest IBM PowerAI solution and IBM Cloud GPU offerings, and see a price-performance comparison, with supporting data, on the number of CPUs required to optimize GPU performance. We've also aggregated extensive test data to determine general best practices, such as the advantages of half-precision deep learning on the Tesla V100 and the implications of neural network model variable distribution and gradient aggregation techniques for your performance results. Join us to see why NVIDIA GPUs on IBM Cloud offer superior results.
 
Keywords:
AI and DL Research, Accelerated Analytics, GTC Silicon Valley 2018 - ID S81013
Streaming:
Download:
AI for Gaming
Presentation
Media
The Real-Time Revolution
Adam Myhill (Unity)
GPU-accelerated creative development platforms are no longer just for games; they're revolutionizing areas from film to automotive. See how Unity is being used to enable unheard-of levels of productivity and create even deeper collaboration between teams.
 
Keywords:
AI for Gaming, Graphics and AI, Real-Time Graphics, GTC Silicon Valley 2018 - ID S81010
Streaming:
Download:
 
Playing FPS Games with Deep Reinforcement Learning
Devendra Singh Chaplot (Carnegie Mellon University)
Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. We present the first architecture to tackle 3D environments in first-person shooter games, which involve partially observable states. Typically, deep reinforcement learning methods utilize only visual input for training. We present a method to augment these models to exploit game-feature information, such as the presence of enemies or items, during the training phase. Our model learns these features while simultaneously minimizing a Q-learning objective, which dramatically improves the training speed and performance of our agent. Our architecture is also modularized so that different models can be trained independently for different phases of the game. We show that the proposed architecture substantially outperforms the game's built-in AI agents, as well as average humans, in deathmatch scenarios.
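The co-training idea can be sketched as a shared trunk with two heads (a hedged toy with invented sizes and a stand-in TD target, not the authors' code): an auxiliary head predicts a game feature such as "enemy visible", and its loss is added to the Q-learning loss so both gradients shape the shared features.

import torch
import torch.nn as nn

trunk = nn.Sequential(nn.Conv2d(3, 16, 8, stride=4), nn.ReLU(), nn.Flatten())
q_head = nn.LazyLinear(8)        # Q-values for 8 actions
feat_head = nn.LazyLinear(1)     # logit for "enemy visible in frame"

frames = torch.randn(32, 3, 84, 84)
h = trunk(frames)
# Stand-in TD target; a real agent would use reward + discounted next-state value.
q_loss = ((q_head(h).max(dim=1).values - torch.randn(32)) ** 2).mean()
feat_loss = nn.functional.binary_cross_entropy_with_logits(
    feat_head(h).squeeze(1), torch.randint(0, 2, (32,)).float())
(q_loss + feat_loss).backward()  # both losses backpropagate into the shared trunk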
 
Keywords:
AI for Gaming, AI and DL Research, GTC Silicon Valley 2018 - ID S8467
Streaming:
Download:
 
Reinforcement Learning for Multiplayer Agents at SEED
Magnus Nordin (Electronic Arts / SEED)
Over the last couple of years, neural nets have enabled significant breakthroughs in computer vision, voice generation and recognition, translation, and self-driving cars. Neural nets will also be a powerful enabler for future game development. We'll give an overview of the potential of neural nets in game development, provide an in-depth look at how neural nets combined with reinforcement learning can power new types of game AI, and show some exciting new results from applying deep reinforcement learning to AAA games.
 
Keywords:
AI for Gaming, AI and DL Research, GTC Silicon Valley 2018 - ID S8715
Streaming:
Download:
 
Optimizing for Real-Time Inference
Donald Brittain (NVIDIA)
Real-time games have an extremely small budget for computations of each frame. Learn the right way to approach real-time performance with inference workloads, taking advantage of the newest technologies available.
 
Keywords:
AI for Gaming, GTC Silicon Valley 2018 - ID S8742
Streaming:
 
Deep Learning for Locomotion Animation
Gavriel State (NVIDIA)
We'll examine tools and technologies that NVIDIA's GameWorks team is building to leverage the power of deep learning for content creation, demonstrating recent research into ways that neural networks can be used to generate realistic-looking human animation. We'll talk about how to apply GPUs for high-performance runtime inferencing of these networks for use in games or real-time VFX scenarios.
 
Keywords:
AI for Gaming, GTC Silicon Valley 2018 - ID S8743
Streaming:
Download:
 
Production-Level Performance Capture Using Deep Convolutional Neural Networks
Antti Herva (Remedy Games)
We'll present a machine learning solution that enables cost-efficient creation of large amounts of high-quality facial animation for digital doubles in games. Remedy Entertainment, NVIDIA, and the University of Southern California recently published "Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks" at the Symposium on Computer Animation. We'll cover topics including recording a facial animation dataset for an actor, setting up a deep learning project, preprocessing the data, training a deep convolutional neural network, and evaluating the results. We'll also summarize the findings and discuss potential future work.
 
Keywords:
AI for Gaming, Graphics and AI, GTC Silicon Valley 2018 - ID S8734
Streaming:
 
Machine Learning with StarCraft II
Timo Ewalds (DeepMind), Chris Lee (Blizzard)
We'll present an overview of the StarCraft II machine learning environment, including some basic API examples using C++ and Python.
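For readers who want to try the environment before the session, a hedged pointer (an editor's note of the documented quick start, not the speakers' material): DeepMind's open-source PySC2 package exposes the StarCraft II Learning Environment from Python, and its README runs a bundled random agent from the command line:

pip install pysc2
python -m pysc2.bin.agent --map Simple64

In code, environments are created through pysc2.env.sc2_env.SC2Env and stepped like a standard RL loop (reset, act, observe); exact constructor arguments have changed across releases, so consult the package documentation.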
 
Keywords:
AI for Gaming, Graphics and AI, GTC Silicon Valley 2018 - ID S8739
Streaming:
Download:
 
Democratizing Deep Learning with Unity ML-Agents
Arthur Juliani (Unity Technologies)
Unity ML-Agents is an open-source AI toolkit that enables machine learning developers and researchers to train agents in realistic, complex scenarios with decreased technical barriers. ML-Agents offers a flexible way to use the Unity Editor and Engine to develop and test new AI algorithms quickly and efficiently across games, robotics, and beyond. We'll walk you through the new learning methods bundled with the latest version of Unity ML-Agents, including (1) imitation learning, which trains agents to mimic human behavior, and (2) multi-agent reinforcement learning, which trains multiple agents together to fulfill cooperative, competitive, and general tasks. We'll showcase these new learning methods in interesting training scenarios drawn from real games.
 
Keywords:
AI for Gaming, Graphics and AI, GTC Silicon Valley 2018 - ID S8740
Streaming:
Download:
 
Large-Scale Platform for Multi-Player Online Battle Arena (MOBA) Game AI
Qiang Fu (Tencent AI Lab), Bin Wu (Tencent AI Lab)
We have been developing multi-player online battle arena (MOBA) game AI to push the boundaries of AI research in pursuit of general AI. One of the key challenges in 5v5 MOBA AI development is how to process massive game replays and feed the data to model training in an efficient and reliable manner. To address this, we've built a large-scale game AI platform in which millions of CPUs and thousands of GPUs are efficiently scheduled. Powered by this platform and its scheduling schemes, our MOBA AI is capable of learning from billions of high-quality user replay samples per day using both deep learning and self-play.
 
Keywords:
AI for Gaming, Data Center and Cloud Infrastructure, AI and DL Research, GTC Silicon Valley 2018 - ID S8922
Streaming:
Download:
 
A.I. Disrupting the Future of Content Creation for Games
Eric Risser (Artomatix)
The artistic manpower needed to create a video game has been increasing exponentially over the years. Thanks to the computational power of NVIDIA GPUs, new AI-accelerated workflows are poised to solve this problem, saving artists and studios time and money and driving greater creativity. Artomatix is the leading pioneer in this space; its AI-based approach to content creation helps automate many of the mundane, tedious, and repetitive tasks artists and designers face every day. This talk introduces the academic theory and history behind creative AI, then delves into specific use cases and applications such as texture synthesis, material enhancement, hybridization, and style transfer. Finally, it presents the next generation of AI-powered tools for the creative industries and gives case studies of how they've been solving some of the game industry's largest problems over the past year. Join this session to gain insight into the future of game creation.
 
Keywords:
AI for Gaming, NVIDIA Inception Program, Graphics and AI, GTC Silicon Valley 2018 - ID S8735
Streaming:
AI in Healthcare
Presentation
Media
From Challenges to Impact of Machine Learning in Clinical Practice
Keith Dreyer (Partners HealthCare), Alejandro Frangi (CISTIB / The University of Sheffield), Abdul Hamid Halabi (NVIDIA), Wiro Niessen (Medical Image Computing and Computer Assisted Interventions (MICCAI)), Mike Tilkin (American College of Radiology (ACR))
The increasing availability of large medical imaging data resources with associated clinical data, combined with advances in machine learning, holds great promise for disease diagnosis, prognosis, therapy planning, and therapy monitoring. As a result, the number of researchers and companies active in this field has grown exponentially, with a corresponding increase in the number of papers and algorithms. Several issues must be addressed to increase the clinical impact of the machine learning revolution in radiology. First, machine learning algorithms must integrate seamlessly into the clinical workflow. Second, algorithms must be sufficiently robust and accurate, especially in view of the data heterogeneity of clinical practice. Third, the additional clinical value of an algorithm needs to be evaluated. Fourth, obtaining regulatory approval for machine learning-based algorithms requires considerable resources. In this workshop, the ACR and the MICCAI Society bring together expertise from radiology, medical image computing, and machine learning to start a joint effort to address these issues.
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8897
Streaming:
Download:
 
Automated Segmentation of Suspicious Breast Masses from Ultrasound Images
Viksit Kumar (Mayo Clinic College of Medicine and Science)
Learn how to apply deep learning to detect and segment suspicious breast masses in ultrasound images. Ultrasound images are challenging to work with due to the lack of standardization in image formation. Learn the appropriate data augmentation techniques that do not violate the physics of ultrasound imaging, and explore the possibilities of using raw ultrasound data to increase performance. Ultrasound images collected from two different commercial machines were used to train an algorithm that segments suspicious breast masses with a mean Dice coefficient of 0.82. The algorithm is shown to perform on par with a conventional seeded algorithm, but with a drastic reduction in computation time, enabling real-time segmentation and detection of breast masses.
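For reference, the evaluation metric quoted above can be computed in a few lines (a hedged sketch on synthetic masks; 1.0 means perfect overlap between prediction and ground truth):

import numpy as np

def dice(pred, truth, eps=1e-7):
    # Dice = 2 * |A intersect B| / (|A| + |B|) over binary masks
    inter = np.logical_and(pred, truth).sum()
    return (2.0 * inter + eps) / (pred.sum() + truth.sum() + eps)

pred = np.zeros((128, 128), bool); pred[30:80, 30:80] = True
truth = np.zeros((128, 128), bool); truth[40:90, 40:90] = True
print(f"Dice: {dice(pred, truth):.3f}")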
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8525
Streaming:
Download:
 
Building a GPU-Accelerated Short-Read Aligner for Bisulfite-Treated DNA Sequences
Richard Wilton (Johns Hopkins University)
It is not always easy to accelerate a complex serial algorithm with CUDA parallelization. A case in point is aligning bisulfite-treated DNA (bsDNA) sequences to a reference genome. A simple CUDA adaptation of a CPU-based implementation can improve the speed of this kind of sequence alignment, but order-of-magnitude improvements in throughput are possible by organizing the implementation so that the most compute-intensive parts of the algorithm execute on GPU threads.
 
Keywords:
AI in Healthcare, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8130
Streaming:
Download:
 
Ultrasound Medical Imaging in the GPU Era
Marcin Lewandowski (us4us Ltd.)
Fast, inexpensive, and safe, ultrasound imaging is the modality of choice for the first level of medical diagnostics. Emerging solutions, including portable and hand-held 2D/3D scanners, advanced imaging algorithms, and deep learning, promise further democratization of this technology. During the session, we'll present an overview of ultrasound imaging techniques in medical diagnostics, explore the future of ultrasound imaging enabled by GPU processing, and set out the path to a portable 3D scanner. We'll also demonstrate our hardware developments in ultrasound platforms with GPU-based processing: having started with one large research scanner, we've begun migrating toward more commercially viable solutions with a small hand-held unit built on the mobile NVIDIA Tegra X1 GPU.
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8421
Streaming:
Download:
 
Targeted Sequencing for All on S5 and S5 XL: GPUs Make It Happen
Mohit Gupta (Thermo Fisher Scientific)
We'll discuss how GPUs are playing a central role in advancing Ion Torrent's targeted sequencing workflow, and how Ion Torrent's S5 DNA sequencer is democratizing the sequencing market and accelerating research in precision medicine at a breathtaking pace with the help of GPUs. We'll highlight our work in liquid biopsy and non-invasive prenatal testing, and how the breadth of our semiconductor chip offerings supports sequencing at scales from small panels to exomes. We'll discuss our analysis pipeline, the latest in algorithm development and acceleration on GPUs, and our experiences across GPU architectures from Fermi to Pascal.
 
Keywords:
AI in Healthcare, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8419
Streaming:
Download:
 
Computational Pathology at Scale: Changing Clinical Practice One Petabyte at a Time
Thomas Fuchs (Memorial Sloan Kettering Cancer Center)
How can we train medical deep learning models at petabyte scale, and how can these models impact clinical practice? We'll discuss possible answers to these questions in the field of computational pathology. Pathology is in the midst of a revolution from a qualitative to a quantitative discipline, a transformation fundamentally driven by machine learning in general and by computer vision and deep learning in particular. With the help of PAIGE.AI, we are building clinical-grade AI at Memorial Sloan Kettering Cancer Center. The models are trained on petabytes of image and clinical data on top of the largest DGX-1 V100 cluster in pathology. The goal is not only to automate cumbersome and repetitive tasks, but to inform diagnosis and treatment decisions in the clinic. This talk will focus on our recent advances in deep learning for tumor detection and segmentation, on how we train these high-capacity models with annotations collected from pathologists, and on how the resulting systems are implemented in the clinic.
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8960
Streaming:
Download:
 
Machine Learning in Precision Medicine: Quantitative Medical Imaging, Artificial Intelligence, GPU Efficiency
Milan Sonka (University of Iowa)
Machine Learning in Precision Medicine: Patient-Specific Treatment Enabled by Quantitative Medical Imaging, Artificial Intelligence, and GPU Efficiency The attendees will learn about the need for and use of machine learning in today's patien ...Read More

Machine Learning in Precision Medicine: Patient-Specific Treatment Enabled by Quantitative Medical Imaging, Artificial Intelligence, and GPU Efficiency. Attendees will learn about the need for and use of machine learning in today's patient-centered healthcare. The talk will focus on general approaches requiring machine learning to obtain image-based quantitative features, reach patient diagnoses, predict disease outcomes, and identify proper precision-treatment strategies. While the presented methods are general in nature, examples from cardiovascular disease management will be used to demonstrate the need for and power of machine learning enabled by the performance advantages of GPU computation.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8892
Streaming:
Download:
 
Workflow and Regulatory Challenges to Algorithm Implementation
Mike Tilkin (American College of Radiology (ACR))
AI in medical imaging has the potential to provide radiology with an array of new tools that will significantly improve patient care. To realize this potential, AI algorithm developers must engage with physician experts and navigate domains such ...Read More

AI in medical imaging has the potential to provide radiology with an array of new tools that will significantly improve patient care. To realize this potential, AI algorithm developers must engage with physician experts and navigate domains such as radiology workflow and regulatory compliance. This session will discuss a pathway for clinical implementation, and cover ACR's efforts in areas such as use case development, validation, workflow integration, and monitoring.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8994
Download:
 
R&D on Medical Imaging
Mei Han (Ping An Technology, US Research Lab)
In this talk I will describe the research and development work on medical imaging, done at PingAn Technology and Google Cloud, covering five different tasks. I'll present the technical details of the deep learning approaches we have develope ...Read More

In this talk, I will describe research and development work on medical imaging done at Ping An Technology and Google Cloud, covering five different tasks. I'll present the technical details of the deep learning approaches we have developed, and share with the audience the research direction and scope of the medical work at Ping An Technology and the Ping An USA Lab.

  Back
 
Keywords:
AI in Healthcare, Computer Vision, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8930
Download:
 
Not Just a Black Box: Interpretable Deep Learning for Genomics and Beyond
Avanti Shrikumar (Stanford University)
Deep learning models give state-of-the-art results on diverse problems, but their lack of interpretability is a major problem. Consider a model trained to predict which DNA mutations cause disease: if the model performs well, it has likely ident ...Read More

Deep learning models give state-of-the-art results on diverse problems, but their lack of interpretability is a major problem. Consider a model trained to predict which DNA mutations cause disease: if the model performs well, it has likely identified patterns that biologists would like to understand. However, this is difficult if the model is a black box. We present algorithms that provide detailed explanations for individual predictions made by a deep learning model and discover recurring patterns across the entire dataset. Our algorithms address significant limitations of existing interpretability methods. We show examples from genomics where the use of deep learning in conjunction with our interpretability algorithms leads to novel biological insights.
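
For a concrete flavor of what per-prediction explanation can look like, here is a minimal gradient-times-input saliency sketch in PyTorch. This is an illustrative baseline, not the speakers' specific algorithms; the model and input are hypothetical stand-ins:

    import torch
    import torch.nn as nn

    # Hypothetical stand-in for a trained genomics model: scores a one-hot DNA window.
    model = nn.Sequential(nn.Conv1d(4, 16, 5, padding=2), nn.ReLU(),
                          nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(16, 1))

    x = torch.randn(1, 4, 100, requires_grad=True)   # stand-in input sequence
    model(x).sum().backward()

    # Gradient-times-input: per-position contribution to the prediction.
    attribution = (x.grad * x).sum(dim=1)
    print(attribution.shape)   # torch.Size([1, 100])

Methods like those presented in the session address known limitations of simple gradient-based attributions such as this one.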

  Back
 
Keywords:
AI in Healthcare, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8907
Streaming:
Download:
 
Identifying New Therapeutics for Parkinson's Disease Using Virtual Neurons on an Azure Hosted GPU Cluster
Andy Lee (Neuroinitiative)
Learn how to apply recent advances in GPU and open data to unravel the mysteries of biology and etiology of disease. Our team has built data driven simulated neurons using CUDA and open data, and are using this platform to identify new therapeut ...Read More

Learn how to apply recent advances in GPUs and open data to unravel the mysteries of the biology and etiology of disease. Our team has built data-driven simulated neurons using CUDA and open data, and is using this platform to identify new therapeutics for Parkinson's disease with funding from the Michael J. Fox Foundation. In this session, I'll discuss the open data that enables our approach, and how we are using NVIDIA Tesla cards on Microsoft Azure to dynamically scale to more than 100,000 GPU cores while managing technology costs.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8386
Streaming:
Download:
 
A Deep Learning-Based Intelligent Reference Library for Diagnostic Decision Support in Lung Cancer Screening
Daniel Golden (Arterys), Sean Sall (Arterys)
Radiological diagnosis and interpretation should not take place in a vacuum -- but today, it does. One of the greatest challenges the radiologist faces when interpreting studies is understanding the individual patient in the context of the milli ...Read More

Radiological diagnosis and interpretation should not take place in a vacuum -- but today, it does. One of the greatest challenges the radiologist faces when interpreting studies is understanding the individual patient in the context of the millions of patients who have come previously. Without access to historical data, radiologists must make clinical decisions based only on their memory of recent cases and literature. Arterys is working to empower the radiologist with an intelligent lung nodule reference library that automatically retrieves historical cases that are relevant to the current case. The intelligent lung nodule reference library is built on top of our state-of-the-art deep learning-based lung nodule detection, segmentation and characterization system.
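
One plausible way to sketch the retrieval core of such a reference library is nearest-neighbor search over learned nodule embeddings. The scikit-learn snippet below is an illustrative assumption, not Arterys' implementation; the embeddings are random stand-ins:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    # Hypothetical feature vectors: one embedding per historical nodule, e.g.
    # taken from the penultimate layer of a characterization network.
    library = np.random.rand(10000, 128)
    index = NearestNeighbors(n_neighbors=5, metric="cosine").fit(library)

    query = np.random.rand(1, 128)       # the nodule being read right now
    _, case_ids = index.kneighbors(query)
    print(case_ids[0])                   # five most similar historical cases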

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8507
Streaming:
 
A Component-Based AI Engine Platform for Medical Workflow
Xu Chen (Winning Health)
As deep learning techniques have been applied to the field of healthcare, more and more AI-based medical systems continue to come forth, which are accompanied by new heterogeneity, complexity and security risks. In the real-world we've seen ...Read More

As deep learning techniques have been applied to the field of healthcare, more and more AI-based medical systems continue to come forth, accompanied by new heterogeneity, complexity, and security risks. In the real world, we've seen this situation constrain demand and hinder AI application development in China's hospitals. First, we'll share our experience in building a unified, GPU-accelerated AI engine system that feeds component-based functionality into the existing workflow of clinical routine and medical imaging. Then, we'll demonstrate a pipeline that integrates different types of AI applications (detecting lung cancer, predicting childhood respiratory disease, and estimating bone age) as microservices into medical workstations, CDSS, PACS, and HIS systems to support the medical decision-making of local clinicians; a sketch of the pattern follows below. On this basis, we'll describe the goal of establishing an open, unified, standardized, and legal cooperation framework to help AI participants enter the Chinese market and build a collaborative ecosystem.
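
As a flavor of the microservice pattern described above, here is a minimal, hypothetical sketch in Python using Flask. The endpoint name, payload, and the stand-in "inference" are illustrative assumptions, not Winning Health's actual API:

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Hypothetical endpoint: one AI capability (bone-age estimation) exposed
    # as a microservice that a PACS/HIS gateway could call over HTTP.
    @app.route("/v1/bone-age", methods=["POST"])
    def bone_age():
        image_bytes = request.get_data()         # image payload from the gateway
        estimate = (len(image_bytes) % 18) + 1   # stand-in for real model inference
        return jsonify({"bone_age_years": estimate})

    if __name__ == "__main__":
        app.run(port=8080)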

  Back
 
Keywords:
AI in Healthcare, Product & Building Design, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8895
Streaming:
 
Computational Precision Medicine - How Healthcare May Look in 10 Years Thanks to GPUs
Alejandro Frangi (CISTIB / The University of Sheffield)
This talk will overview the fields of Personalised Computational Medicine and In Silico Clinical Trials, which are revolutionizing Medicine and Medical Product Development. This talk will introduce these concepts, provide examples of how they ca ...Read More

This talk will overview the fields of personalised computational medicine and in silico clinical trials, which are revolutionizing medicine and medical product development. It will introduce these concepts, provide examples of how they can transform healthcare, and emphasize why artificial intelligence and machine learning are relevant to them. We will also explain the limitations of these approaches and why it is paramount to engage in both phenomenological (data-driven) and mechanistic (principle-driven) modelling. Both areas are in desperate need of better infrastructures, software and hardware, giving access to computational and storage resources. The talk will be thought-provoking and eye-opening as to the opportunities in this space for researchers and industries alike.

  Back
 
Keywords:
AI in Healthcare, Deep Learning and AI, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8887
Streaming:
 
Deep Imaging: Quantitative Biomarkers for Clinical Decision Making
Razvan Ionasec (Siemens Healthineers)
The transformation towards value-based healthcare needs inventive ways to lower cost and increase patient health outcomes. Artificial intelligence is vital for realizing value-based care. Turning medical images into biomarkers helps to increase ...Read More

The transformation towards value-based healthcare needs inventive ways to lower cost and increase patient health outcomes. Artificial intelligence is vital for realizing value-based care. Turning medical images into biomarkers helps to increase effectiveness of care.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8412
Streaming:
 
Deep Learning Improves Neuroimaging: Faster, Safer, Cheaper and Smarter
Enhao Gong (Stanford University, Subtle Medical), Greg Zaharchuk (Stanford University)
We will introduce deep learning applications in clinical neuroimaging (using MRI, CT, PET, etc.) and recent breakthrough results from Stanford and Subtle Medical. Perspectives and feedbacks of applying AI technologies in neuroimaging are shared, ...Read More

We will introduce deep learning applications in clinical neuroimaging (MRI, CT, PET, etc.) and recent breakthrough results from Stanford and Subtle Medical, sharing perspectives and feedback on applying AI technologies in neuroimaging from expert radiologists and deep learning experts. Topics include:
* How deep learning/AI is changing clinical neuroimaging practice: how deep learning will be applied in the radiology workflow now and in the future, and practical concerns and perspectives from radiologists.
* How deep learning assists smarter neuroimaging decision-making: multi-scale 3D networks enable lesion outcome prediction for stroke, and more accurate lesion segmentation in neuroimaging.
* How deep learning enables safer and cheaper neuroimaging screening: deep learning and GANs enable a >95% reduction in radiation for functional medical imaging and a 90% reduction in chemical (gadolinium) contrast agent usage in contrast-enhanced MRI.
* How deep learning accelerates neuroimaging: further acceleration and improved MRI reconstruction using deep learning, and deep generative adversarial networks for compressed sensing.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8647
Streaming:
 
Back to the Essence of Medicine, Answering Clinical Questions -- Medical Imaging in the AI Era
Xiaodong Tao (iFlytek)
iFLYTEK Health's mission is to use the most advanced artificial intelligence technologies to revolutionize healthcare industry to help doctors provide quality care to more patients with higher efficiency. Developed upon iFLYTEK's world c ...Read More

iFLYTEK Health's mission is to use the most advanced artificial intelligence technologies to revolutionize the healthcare industry, helping doctors provide quality care to more patients with higher efficiency. Built on iFLYTEK's world-class hardware and software technologies in voice recognition and voice synthesis, iFLYTEK's products can help reduce doctors' burden in writing medical records and free their time to focus on caring for patients. These technologies can also reduce errors and improve the completeness and accuracy of medical records, thereby supporting advanced intelligence applications based on complete patient data. Automated image analysis tools can help doctors find abnormalities in images with confidence, especially inexperienced doctors in lower-tier hospitals. Our clinical decision support (CDS) system draws on authoritative medical literature, a large body of expert knowledge, and real cases to improve primary doctors' ability to diagnose accurately using complete and accurate patient information.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8302
Streaming:
 
Deep Learning and Use of GPUs in Mammography
Hsi-Ming Chang (CureMetrix)
Discuss the difficulties in digital mammography, and the computational challenges we encountered while adapting deep learning algorithms, including GAN, to digital mammography. Learn how we address those computational issues, and get the informa ...Read More

We'll discuss the difficulties in digital mammography and the computational challenges we encountered while adapting deep learning algorithms, including GANs, to digital mammography. Learn how we addressed those computational issues, and see our benchmarking results using both consumer- and enterprise-grade GPUs.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8482
Streaming:
 
From Promising Algorithms to Clinical Practice: Next Generation of Challenges
Wiro Niessen (Medical Image Computing and Computer Assisted Interventions (MICCAI))
There is large promise in machine learning methods for the automated analysis of medical imaging data for supporting disease detection, diagnosis and prognosis. These examples include the extraction of quantitative imaging biomarkers that are re ...Read More

There is great promise in machine learning methods for the automated analysis of medical imaging data to support disease detection, diagnosis, and prognosis. Examples include the extraction of quantitative imaging biomarkers related to the presence and stage of disease, radiomics approaches for tumor classification and therapy selection, and deep learning methods for directly linking imaging data to clinically relevant outcomes. However, translating such approaches requires methods for objective validation in clinically realistic settings or clinical practice. In this talk, I will discuss the role of next-generation challenges for this domain.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8992
Streaming:
Download:
 
Frontiers of AI in Medical Imaging: Overcoming Current Challenges and Moving Beyond Classification
Imon Banerjee (Stanford University), Daniel Rubin (Stanford University)
Learn about the key types of clinical use cases for AI methods in medical imaging beyond simple image classification that will ultimately improve medical practice, as well as the critical challenges and progress in applying AI to these applicati ...Read More

Learn about the key types of clinical use cases for AI methods in medical imaging beyond simple image classification that will ultimately improve medical practice, as well as the critical challenges and progress in applying AI to these applications. We'll first describe the types of medical imaging and the key clinical applications of deep learning for improving image interpretation. Next, we'll describe recent developments in word-embedding methods that leverage the narrative radiology reports associated with images to automatically generate rich labels for training deep learning models, and a recent AI project that pushes beyond image classification to tackle the challenging problem of clinical prediction. We'll also describe emerging methods to leverage multi-institutional data for creating AI models that do not require data sharing, and recent innovative approaches to providing explanations of AI model predictions to improve clinician acceptance.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8295
Download:
 
Medical Imaging with TensorFlow
Josh Gordon (Google)
Dive into recent work in medical imaging, where TensorFlow is used to spot cancerous cells in gigapixel images, and helps physicians to diagnose disease. During this talk, we'll introduce concepts in Deep Learning, and show concrete co ...Read More

Dive into recent work in medical imaging, where TensorFlow is used to spot cancerous cells in gigapixel images and helps physicians diagnose disease. During this talk, we'll introduce concepts in deep learning and show concrete code examples you can use to train your own models. In addition to the technology, we'll cover the problem-solving process of thoughtfully applying it to solve a meaningful problem. We'll close with our favorite educational resources you can use to learn more about TensorFlow.
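
In the spirit of the concrete code examples mentioned, here is a minimal Keras classifier sketch; the patch size, class count, and (commented) training data are placeholder assumptions, not the speaker's code:

    import tensorflow as tf

    # Placeholder assumptions: 64x64 RGB patches, two classes (e.g. tumor vs. normal).
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # With x_train of shape (N, 64, 64, 3) and integer labels y_train:
    # model.fit(x_train, y_train, epochs=5, batch_size=32)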

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8919
Streaming:
 
Accelerating Bioinformatics: End-to-End Computation of NASA GeneLab Data with GPU Data Frame
Jacqueline Cenci-McGrody (NVIDIA), Venkat Krishnamurthy (MapD Technologies)
Protecting crew health is a critical concern for NASA in preparation of long duration, deep-space missions like Mars. Spaceflight is known to affect immune cells. Splenic B-cells decrease during spaceflight and in ground-based physiological mode ...Read More

Protecting crew health is a critical concern for NASA in preparation for long-duration, deep-space missions such as those to Mars. Spaceflight is known to affect immune cells: splenic B-cells decrease during spaceflight and in ground-based physiological models. The key technical innovation presented in our work is end-to-end computation on the GPU with the GPU Data Frame (GDF), running on the DGX Station, to accelerate the integration of immunoglobulin gene segments, junctional regions, and modifications that contribute to cellular specificity and diversity. Study results are applicable to understanding processes that induce immunosuppression, such as cancer therapy, AIDS, and stressful environments here on Earth.

  Back
 
Keywords:
AI in Healthcare, Performance Optimization, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8528
Streaming:
 
Customizable Ultrasound Imaging in Real-Time Using a GPU-Accelerated Beamformer
Dongwoon Hyun (Stanford University)
Learn how researchers at Stanford University are leveraging the power of GPUs to improve medical ultrasound imaging. Ultrasound imaging is a powerful diagnostic tool that can provide clinicians with feedback in real time. Until recently, ultraso ...Read More

Learn how researchers at Stanford University are leveraging the power of GPUs to improve medical ultrasound imaging. Ultrasound imaging is a powerful diagnostic tool that can provide clinicians with feedback in real time. Until recently, ultrasound beamforming and image reconstruction has been performed using dedicated hardware in order to achieve the high frame rates necessary for real-time diagnostic imaging. Though many sophisticated techniques have been proposed to further enhance the diagnostic utility of ultrasound images, computational and hardware constraints have made translation to the clinic difficult. We have developed a GPU-accelerated software beamforming toolbox that enables researchers to implement custom real-time beamforming on any computer with a CUDA-capable GPU, including commercial ultrasound scanners. In this session, we will: 1) briefly introduce the basics of ultrasound beamforming, 2) present our software beamforming toolbox, and 3) show videos demonstrating its capabilities from a clinical study of echocardiography, as well as an implementation of a novel speckle removing beamformer that utilizes deep fully convolutional neural networks.
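
For intuition about the beamforming basics covered in the session, a heavily simplified delay-and-sum sketch in NumPy follows. It assumes a toy plane-wave geometry with synthetic channel data; it is illustrative, not the toolbox's API:

    import numpy as np

    fs, c = 40e6, 1540.0                       # sampling rate (Hz), speed of sound (m/s)
    n_elem, n_samp, pitch = 64, 2048, 0.3e-3   # toy array geometry
    rf = np.random.randn(n_elem, n_samp)       # stand-in channel data (element x time)
    x_elem = (np.arange(n_elem) - n_elem / 2) * pitch

    def das_pixel(px, pz):
        """Delay-and-sum one pixel at lateral px, depth pz (plane-wave transmit)."""
        dist = pz + np.sqrt(pz**2 + (x_elem - px) ** 2)   # transmit + receive path
        idx = np.clip((dist / c * fs).astype(int), 0, n_samp - 1)
        return rf[np.arange(n_elem), idx].sum()           # sum time-aligned samples

    print(das_pixel(0.0, 20e-3))   # one pixel; a full image loops px, pz (GPU-parallel)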

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8279
Streaming:
Download:
 
Highly Accurate Brain Stroke Diagnosis System and Generative Stroke Lesion Model
Junghwan Cho (CAIDE Systems, Inc)
Learn CAIDE Systems' unique diagnosis system with highly accurate prediction and delineation of brain stroke lesion. We'll present how we increase sensitivity in medical diagnosis system and how we develop a state-of-the-art ge ...Read More

Learn about CAIDE Systems' unique diagnosis system, with highly accurate prediction and delineation of brain stroke lesions. We'll present how we increase sensitivity in a medical diagnosis system and how we developed a state-of-the-art generative deep learning model for acquiring segmented stroke-lesion CT images, and demonstrate our market-ready product: a diagnostic tool as well as a medical deep learning platform. We trained our diagnostic system using CT image data from thousands of patients with brain stroke and tested it to assess the commercial feasibility of use in hospitals and mobile ambulances.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8428
Streaming:
Download:
 
Learning Normalized Inputs for Iterative Estimation in Medical Image Segmentation
Michal Drozdzal (Montreal Institute for Learning Algorithms)
In medical imaging, acquisition procedures and imaging signals vary across different modalities and, thus, researchers often treat them independently, introducing different models for each imaging modality. To mitigate the number of modality-spe ...Read More

In medical imaging, acquisition procedures and imaging signals vary across modalities, and thus researchers often treat them independently, introducing different models for each imaging modality. To mitigate the number of modality-specific designs, we introduced a simple yet powerful pipeline for medical image segmentation that combines fully convolutional networks (FCNs) with fully convolutional residual networks (FC-ResNets). FCNs are used to obtain normalized images, which are then iteratively refined by means of an FC-ResNet to generate a segmentation prediction. We'll show results that highlight the potential of the proposed pipeline, matching state-of-the-art performance on a variety of medical imaging modalities, including electron microscopy, computed tomography, and magnetic resonance imaging.
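
A drastically simplified PyTorch sketch of the two-part idea (an FCN that normalizes the input, followed by residual refinement of the running prediction) might look like the following; both networks are tiny stand-ins for the paper's architectures:

    import torch
    import torch.nn as nn

    # Tiny stand-in: an FCN "normalizes" the image across modalities.
    normalizer = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                               nn.Conv2d(8, 1, 3, padding=1))

    class RefineBlock(nn.Module):
        def __init__(self):
            super().__init__()
            self.body = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(8, 1, 3, padding=1))
        def forward(self, x):
            return x + self.body(x)          # residual update of the prediction

    refiner = RefineBlock()
    img = torch.randn(1, 1, 128, 128)        # any modality enters the same pipeline
    pred = normalizer(img)                   # modality-normalized representation
    for _ in range(3):                       # a few refinement iterations
        pred = refiner(pred)
    seg = torch.sigmoid(pred)                # per-pixel foreground probability
    print(seg.shape)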

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8251
Download:
 
Deep Learning for Shallow Sequencing
Yonatan Israeli (NVIDIA)
The NVIDIA Genomics Group has developed a deep learning platform to transform noisy, low-quality DNA sequencing data into clean, high-quality data. Hundreds of DNA sequencing protocols are used to profile phenomena such as protein-DNA binding an ...Read More

The NVIDIA Genomics Group has developed a deep learning platform to transform noisy, low-quality DNA sequencing data into clean, high-quality data. Hundreds of DNA sequencing protocols are used to profile phenomena such as protein-DNA binding and DNA accessibility. For example, the ATAC-seq protocol identifies open genomic sites by sequencing open DNA fragments; genome-wide fragment counts provide a profile of DNA accessibility. Recent advances enable profiling from smaller patient samples than previously possible. To reduce sequencing cost, we developed a convolutional neural network that denoises data from a small number of DNA fragments, making the data suitable for various downstream tasks. Our platform aims to accelerate adoption of DNA sequencers by minimizing data requirements.
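
As a toy illustration of the denoising setup (illustrative shapes and architecture, not NVIDIA's actual model), a 1D convolutional network can be trained to map low-count coverage tracks toward deep-coverage targets:

    import torch
    import torch.nn as nn

    # Illustrative assumption: 1,000 genomic bins, batch of 8 paired tracks.
    denoiser = nn.Sequential(
        nn.Conv1d(1, 32, 25, padding=12), nn.ReLU(),
        nn.Conv1d(32, 32, 25, padding=12), nn.ReLU(),
        nn.Conv1d(32, 1, 25, padding=12),
    )
    noisy = torch.poisson(torch.full((8, 1, 1000), 2.0))     # shallow coverage
    target = torch.poisson(torch.full((8, 1, 1000), 20.0))   # deep-coverage reference

    loss = nn.functional.mse_loss(denoiser(noisy), target)
    loss.backward()   # gradients for one step; wrap in an optimizer loop to train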

  Back
 
Keywords:
AI in Healthcare, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8602
Streaming:
Download:
 
Accelerating Nanopore Sequencing Using AI and Volta
Chuck Seberino (Roche Sequencing Solutions)
Nanopore sequencing is a breakthrough technology that marries cutting edge semiconductor processes together with biochemistry, achieving fast, scalable, single molecule DNA sequencing. The challenge is real-time processing of gigabytes of data p ...Read More

Nanopore sequencing is a breakthrough technology that marries cutting-edge semiconductor processes with biochemistry, achieving fast, scalable, single-molecule DNA sequencing. The challenge is real-time processing of gigabytes of data per second in a compact benchtop instrument. GPUDirect, together with the cuDNN library, enables Roche to maximize the effectiveness of Tesla V100 GPUs in its next-generation sequencing instrument. Attendees will learn how these pieces come together to build a streaming AI inference engine that solves a signal processing workflow. Analysis and performance comparisons of the new Tensor Core units, available on Volta hardware, will be included.

  Back
 
Keywords:
AI in Healthcare, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8947
Streaming:
Download:
 
CUDA Based Stitching of Teravoxel Microscopy Images
Massimo Bernaschi (National Research Council of Italy)
Learn how to use (multi) GPU and CUDA to speed up the process of stitching very large images (up to TeraBytes in size). Image stitching is the process of combining multiple photographic images with overlapping fields of view to produce a segment ...Read More

Learn how to use (multi) GPU and CUDA to speed up the process of stitching very large images (up to TeraBytes in size). Image stitching is the process of combining multiple photographic images with overlapping fields of view to produce a segmented panorama or high-resolution image. Image stitching is widely used in many important fields, like high resolution photo mosaics in digital maps and satellite photos or medical images. Motivated by the need to combine images produced in the study of the brain, we developed and released for free the TeraStitcher tool that we recently enhanced with a CUDA plugin that allows an astonishing speedup of the most computing intensive part of the procedure. The code can be easily adapted to compute different kinds of convolution. We describe how we leverage shuffle operations to guarantee an optimal load balancing among the threads and CUDA streams to hide the overhead of moving back and forth images from the CPU to the GPU when their size exceeds the amount of available memory. The speedup we obtain is such that jobs that took several hours are now completed in a few minutes.
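
The core alignment step in stitching can be illustrated with phase correlation, which estimates the translation between two overlapping tiles. This NumPy sketch shows the kind of FFT/convolution work the CUDA plugin accelerates at scale; it is not TeraStitcher's code:

    import numpy as np

    def estimate_shift(a, b):
        """Translation of tile a relative to tile b via phase correlation."""
        F = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
        corr = np.fft.ifft2(F / (np.abs(F) + 1e-9))
        dy, dx = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
        if dy > a.shape[0] // 2: dy -= a.shape[0]   # wrap to signed shifts
        if dx > a.shape[1] // 2: dx -= a.shape[1]
        return dy, dx

    tile = np.random.rand(256, 256)
    shifted = np.roll(tile, (7, -12), axis=(0, 1))   # synthetic overlap displacement
    print(estimate_shift(shifted, tile))             # (7, -12)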

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8182
Streaming:
Download:
 
The Early Detection of Pancreatic Cancer Using Deep Learning: Preliminary Observations
Elliot Fishman (Johns Hopkins Hospital)
This talk will present the challenges and opportunities in developing a deep learning program for use in medical imaging. It will present a hands on approach to the challenges that need to be overcome and the need for a multidisciplinary approac ...Read More

This talk will present the challenges and opportunities in developing a deep learning program for use in medical imaging. It will present a hands-on approach to the challenges that need to be overcome and the need for a multidisciplinary approach to help define the problems and potential solutions. The role of highly curated data for training the algorithms, and the challenges in creating such datasets, is addressed; the annotation of data becomes a key point in training and testing the algorithms. The roles of experts in computer vision and radiology will be addressed, as well as how this project can serve as a roadmap for others planning collaborative efforts. Finally, I will discuss the early results of the Felix project, whose goal is nothing short of the early detection of pancreatic cancer, to improve detection and ultimately improve patient outcomes.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S81004
Streaming:
 
Performance Improvements for CUDA Accelerated Real-Time Diagnostic Ultrasound Medical Imaging Motion Tracking
Ismayil Guracar (Siemens Medical Solutions, USA Inc. Ultrasound Group)
Motion tracking with motion compensation is an important component of modern advanced diagnostic ultrasonic medical imaging with microbubble contrast agents. Search based on the sum of absolute differences, a well-known technique for motion estimatio ...Read More

Motion tracking with motion compensation is an important component of modern advanced diagnostic ultrasound medical imaging with microbubble contrast agents. Search based on the sum of absolute differences (SAD), a well-known technique for motion estimation, is very amenable to efficient implementations that exploit the fine-grained parallelism inherent in GPUs. We'll demonstrate a real-world application of motion estimation and compensation in the generation of real-time maximum intensity projections over time to create vascular roadmaps in medical images of organs, such as the liver, with ultrasound contrast agents. We'll provide CUDA kernel code examples that make this application possible, as well as performance measurements demonstrating the value of instruction-level parallelism and careful control of memory access patterns for kernel performance improvement. We hope to provide insight to CUDA developers interested in motion estimation and compensation, as well as general insight into kernel performance optimization relevant to any CUDA developer.
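
For readers who want the algorithmic idea before the CUDA specifics, here is a minimal NumPy sketch of SAD block matching; on the GPU, the candidate displacements and blocks would be evaluated in parallel. This is illustrative, not the session's kernel code:

    import numpy as np

    def sad_search(prev, curr, y, x, block=16, radius=4):
        """Find the displacement of block (y, x) between two frames via SAD."""
        ref = prev[y:y + block, x:x + block].astype(np.int32)
        best = (np.inf, 0, 0)
        for dy in range(-radius, radius + 1):       # on a GPU, candidates and
            for dx in range(-radius, radius + 1):   # blocks run in parallel
                cand = curr[y + dy:y + dy + block,
                            x + dx:x + dx + block].astype(np.int32)
                sad = np.abs(ref - cand).sum()      # sum of absolute differences
                best = min(best, (sad, dy, dx))
        return best[1], best[2]

    prev = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    curr = np.roll(prev, (2, -1), axis=(0, 1))      # synthetic tissue motion
    print(sad_search(prev, curr, 24, 24))           # (2, -1)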

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8233
Streaming:
Download:
 
Continuously Learning AI Pathologist: A Smart Microscope that can Automatically Screen Different Biological Specimens
Tathagato Rai Dastidar (SigTuple Technologies Pvt Ltd)
Clinical laboratories play a crucial role in healthcare ecosystem - the laboratories are pivotal and act as a screening sub-system by providing early inference in disease and abnormality diagnosis. An estimated 70% of clinical decisions regardin ...Read More

Clinical laboratories play a crucial role in the healthcare ecosystem: they are pivotal, acting as a screening subsystem by providing early inference in disease and abnormality diagnosis. An estimated 70% of clinical decisions regarding prevention, diagnosis, and treatment involve lab tests. Surprisingly, 60% of the inferencing done at a clinical laboratory can be performed by one wonder-tool: the microscope. Microscopy has helped pathologists assess and analyse patients for centuries. The key hurdles in microscopic examination are the amount of time pathologists must spend in manual analysis and the need for pathologists to be co-located with the specimen. In this talk, we introduce SigTuple's AI-powered smart microscope, which can automatically learn, analyse, and summarize inferences for several hundred abnormalities across different biological specimens (blood, urine, and semen). It also utilizes the power of GPU computing on the cloud to provide higher-order analysis of samples, and acts as a tele-pathology enabler by giving pathologists the power to view or review any analysis or report from any part of the world.

  Back
 
Keywords:
AI in Healthcare, Pathology, GTC Silicon Valley 2018 - ID S8591
Streaming:
Download:
 
GE's Evolution from HPC to AI in Healthcare
Keith Bigelow (GE Healthcare Waukesha), Erik Steen (GE Healthcare)
For more than a decade, GE has partnered with Nvidia in Healthcare to power our most advanced modality equipment, from CT to Ultrasound. Part 1 of this session will offer an introduction to the deep learning efforts at GEHC, the platform we' ...Read More

For more than a decade, GE has partnered with NVIDIA in healthcare to power our most advanced modality equipment, from CT to ultrasound. Part 1 of this session will offer an introduction to the deep learning efforts at GEHC and the platform we're building on top of NGC to accelerate new algorithm development, followed by a deep dive into a case study on the evolution of our cardiovascular ultrasound scanner and its underlying extensible software stack. It will contain three main parts: (a) cardiovascular ultrasound imaging from a user perspective, the problems we need to solve for our customers, and the impact of cardiovascular disease from a global perspective; (b) an introduction to the Vivid E95 and the cSound platform for GPU-based real-time image reconstruction and visualization, including how GPU performance can be translated into customer value and outcomes and how this has evolved the platform over the last two and a half years; and (c) the role of deep learning in cardiovascular ultrasound imaging, how we are integrating deep learning inference into our imaging system, and preliminary results from automatic cardiac view detection.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8849
Streaming:
Download:
 
Accelerating Medical Device Development in Medical Imaging
Alejandro Frangi (CISTIB / The University of Sheffield)
TBA ...Read More

TBA

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8993
Streaming:
 
Deep Learning Brings Disruptive Changes to Ophthalmology
Aaron Lee (University of Washington)
Hear about how GPU technology is disrupting the way your eye doctor works and how ophthalmic research is performed today. The rise of Electronic Medical Records in medicine has created mountains of Big Data particularly in ophthalmology where ma ...Read More

Hear about how GPU technology is disrupting the way your eye doctor works and how ophthalmic research is performed today. The rise of electronic medical records in medicine has created mountains of big data, particularly in ophthalmology, where many discrete quantitative clinical elements, like visual acuity, can be tied to rich imaging datasets. In this session, we will explore the transformative role GPU acceleration has played in accelerating clinical research, and show real-life examples of deep learning applications in ophthalmology that create new steps forward in automated diagnoses, image segmentation, and computer-aided diagnosis.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8866
Streaming:
 
Cascaded 3D Fully Convolutional Networks for Medical Image Segmentation
Holger Roth (Nagoya University)
We'll show how recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. FCNs can be trained to automatically segment 3D medical images, such as computed tomo ...Read More

We'll show how recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. FCNs can be trained to automatically segment 3D medical images, such as computed tomography (CT) scans based on manually annotated anatomies like organs and vessels. The presented methods achieve competitive segmentation results while avoiding the need for handcrafting features or training class-specific models, in a clinical setting. We'll explain a two-stage, coarse-to-fine approach that will first use a 3D FCN based on the 3D U-Net architecture to roughly define a candidate region. This candidate region will then serve as input to a second 3D FCN to do a fine prediction. This cascaded approach reduces the number of voxels the second FCN has to classify to around 10 percent of the original 3D medical image, and therefore allows it to focus on more detailed segmentation of the organs and vessels. Our experiments will illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results on many datasets. Code and trained models will be made available.
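
The coarse-to-fine control flow can be sketched in a few lines of PyTorch; the two "networks" below are trivial stand-ins so that the candidate-region cropping logic is runnable:

    import torch
    import torch.nn.functional as F

    def coarse_net(vol):   # stand-in for the first 3D FCN (candidate region)
        return (vol > vol.mean()).float()

    def fine_net(vol):     # stand-in for the second, fine-prediction 3D FCN
        return torch.sigmoid(vol)

    ct = torch.randn(1, 1, 64, 128, 128)   # toy CT volume
    small = F.interpolate(ct, scale_factor=0.25, mode="trilinear",
                          align_corners=False)
    coarse = F.interpolate(coarse_net(small), size=ct.shape[2:], mode="nearest")

    nz = coarse[0, 0].nonzero()            # candidate voxels from the coarse pass
    lo = nz.min(dim=0).values
    hi = nz.max(dim=0).values + 1
    crop = ct[:, :, lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    print(crop.shape, fine_net(crop).shape)   # second pass sees only the bounding box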

  Back
 
Keywords:
AI in Healthcare, Deep Learning and AI Frameworks, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8532
Streaming:
Download:
 
GPU-Enabled Ultrasound Imaging for Real-Time, Fully-Flexible Data Processing
Christoph Hennersperger (Technical University of Munich | Trinity College Dublin)
Explore how parallelized programming and DL can radically impact medical ultrasound imaging. In this session, we will describe how the processing of ultrasound signals can be implemented not only providing real-time capabilities, but also a flex ...Read More

Explore how parallelized programming and DL can radically impact medical ultrasound imaging. In this session, we will describe how the processing of ultrasound signals can be implemented to provide not only real-time capabilities, but also a flexible environment for research and innovative new products. To this end, we will i) demonstrate 2D and 3D real-time imaging using open hardware platforms, and ii) provide an overview of how both radical parallelization and DL can be integrated into processing pipelines, enabling new applications and improved image quality at unprecedented speed.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8764
Streaming:
Download:
 
Harnessing AI: Creating a Healthcare AI Ecosystem
Keith Dreyer (Partners HealthCare)
In this session, attendees will learn how to develop an AI Learning Platform for healthcare, develop initial(imaging) AI applications in specific care areas, and embed AI into devices creating "intelligent imaging systems". ...Read More

In this session, attendees will learn how to develop an AI learning platform for healthcare, develop initial (imaging) AI applications in specific care areas, and embed AI into devices, creating "intelligent imaging systems".

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8991
Streaming:
 
AI Models to Clinical Practice: Open AI Marketplace for Diagnostic Imaging
Woojin Kim (Nuance Communications), Arman Sharafshahi (Nuance Communications)
Learn about the importance of clinical domain expertise in AI algorithm/model development and incorporation into clinical workflow, specifically in medical imaging, from a radiologist. With growing media attention, there is much fear, hype, and ...Read More

Learn about the importance of clinical domain expertise in AI algorithm/model development and incorporation into clinical workflow, specifically in medical imaging, from a radiologist. With growing media attention, there is much fear, hype, and hope when it comes to using DL in radiology. We will present through examples why it is essential to incorporate clinical domain expertise when developing DL models. We will demonstrate various ways AI can augment the radiologists both in image interpretation as well as beyond within the overall workflow. In the second portion of this talk, we will present the gap between developing a great AI model in isolation and having it become part of daily medical practice. From integration and hospital connectivity to algorithm serving at scale to meet growing demand, we will show how an AI Marketplace can create the ecosystem that allows AI to flourish.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8871
Streaming:
 
Computer-Augmented Healthcare: Opportunities and Challenges
Gregory Hager (The Malone Center for Engineering in Healthcare, Johns Hopkins University)
The Role of Data in Achieving Precision and Value in Healthcare The goal of healthcare is to provide the most effective treatment to every patient in the most efficient way. Data plays a key role in every aspect of this process from decision sup ...Read More

The Role of Data in Achieving Precision and Value in Healthcare. The goal of healthcare is to provide the most effective treatment to every patient in the most efficient way. Data plays a key role in every aspect of this process: from decision support systems that provide a clinician with the right information at the right time, to scheduling algorithms that predict patient flow and schedule accordingly, to analytics that coach and support patients in achieving or maintaining a healthy lifestyle. Achieving the vision of a data-informed healthcare system will require fundamental advances in many areas, including causal inference; inference on complex, high-dimensional, and heterogeneous data; missing data; process modeling; bias reduction; statistical validation; and model adaptation, to name a few. In this talk, I will illustrate some of these challenges through concrete examples from the Malone Center.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8891
Streaming:
Download:
 
AI DR Screening for Chronic Diseases
Emma Xu (Airdoc)
Diabetic retinopathy, also known as diabetic eye disease, is a major complication of diabetes in which damage occurs to the retina; it is a leading cause of blindness. AirDoc's product director, Emma Xu, and Professor Yo ...Read More

Diabetic retinopathy, also known as diabetic eye disease, is a major complication of diabetes in which damage occurs to the retina due to diabetes mellitus; it is a leading cause of blindness. AirDoc's product director, Emma Xu, and Professor You Li of Shanghai Changzheng Hospital will share how AirDoc, a leading intelligent-medicine startup in China, leverages NVIDIA GPUs and deep learning to improve DR diagnosis with automatic left/right eye recognition, automatic detection of lesion location and number, automatic DR staging, fast recognition speed, and patient information management for real-time screening statistics and usage management.

  Back
 
Keywords:
AI in Healthcare, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8940
Streaming:
Download:
Accelerated Analytics
Presentation
Media
Writing Graph Primitives with Gunrock
Muhammad Osama (University of California Davis)
Learn how to use Gunrock, a state-of-the-art CUDA-based graph-processing library specifically designed for the GPU, to develop fast, efficient, and complex graph primitives. Gunrock achieves a balance between performance and expressiveness by couplin ...Read More
Learn how to use Gunrock, a state-of-the-art CUDA-based graph-processing library specifically designed for the GPU, to develop fast, efficient, and complex graph primitives. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. Gunrock is a stable, powerful, and forward-looking substrate for GPU-based, graph-centric research and development. Like many graph frameworks, it leverages a bulk-synchronous programming model and targets iterative convergent graph computations. We believe that Gunrock offers both the best performance on GPU graph analytics as well as the widest range of primitives.  Back
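
Gunrock itself is a C++/CUDA library, but its bulk-synchronous frontier model (advance, then filter) can be illustrated in a few lines of plain Python. This BFS sketch is expository, not Gunrock's API:

    # Expository BFS in the bulk-synchronous frontier style Gunrock uses:
    # "advance" expands the frontier's edges, "filter" compacts out revisits.
    graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}

    depth, frontier, level = {0: 0}, [0], 0
    while frontier:
        level += 1
        candidates = {v for u in frontier for v in graph[u]}   # advance step
        frontier = [v for v in candidates if v not in depth]   # filter step
        for v in frontier:
            depth[v] = level
    print(depth)   # {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3}

On the GPU, both steps become parallel operations over the whole frontier, which is where Gunrock's optimization strategies apply.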
 
Keywords:
Accelerated Analytics, Tools and Libraries, HPC and AI, GTC Silicon Valley 2018 - ID S8586
Streaming:
 
The Need for Speed: How the Auto Industry Accelerates Machine Learning with Visual Analytics
Asghar Ghorbani (Volkswagen AG), Zach Izham (Volkswagen), Aaron Williams (MapD)
While GPU-accelerated analytics have already radically accelerated the speed of training machine learning models, data scientists and analysts still grapple with deriving insights from these complex models to better inform decision-making. The key: V ...Read More
While GPU-accelerated analytics have already radically accelerated the speed of training machine learning models, data scientists and analysts still grapple with deriving insights from these complex models to better inform decision-making. The key: Visualizing and interrogating black box models with a GPU-enabled architecture. Volkswagen and MapD will discuss how interactive, visual analytics are helping the automotive brand interactively explore the output of their ML models to interrogate them in real time, for greater accuracy and reduced biases. They'll also examine how applying the GPU Data Frame to their efforts has enabled them to accelerate data science by minimizing data transfers and made it possible for their complex, multi-platform machine learning workflows to run entirely on GPUs.  Back
 
Keywords:
Accelerated Analytics, NVIDIA Inception Program, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8468
Streaming:
Download:
 
Fast and Scalable Subgraph Isomorphism Using Dynamic Graph Techniques
James Fox (Georgia Institute of Technology)
Finding well-connected subgraphs is a common graph analysis goal. However, well-known formulations such as the k-Clique are computationally intractable and often too restrictive. The k-Truss of a graph is defined in terms of minimal triangle counts a ...Read More
Finding well-connected subgraphs is a common graph analysis goal. However, well-known formulations such as the k-Clique are computationally intractable and often too restrictive. The k-Truss of a graph is defined in terms of minimal triangle counts and is computationally tractable to find. We'll present our novel algorithm and scalable implementation for finding the k-Truss of a graph, which uses dynamic triangle counting techniques and leverages a dynamic graph data structure and framework for the GPU. Our approach won an Innovation Award for HPEC'17 GraphChallenge, and performs anywhere from 100x to 10,000x faster than baseline benchmarks.  Back
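
For reference, the k-Truss definition can be captured in a small sequential peeling algorithm; the session's contribution is a dynamic, GPU-scalable version of this idea. A minimal Python sketch:

    # Sequential reference for k-Truss peeling: repeatedly drop edges whose
    # triangle support is below k - 2 until the edge set stabilizes.
    def k_truss(edges, k):
        adj = {}
        for u, v in edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        changed = True
        while changed:
            changed = False
            for u, v in [(u, v) for u in adj for v in adj[u] if u < v]:
                if len(adj[u] & adj[v]) < k - 2:   # triangles through (u, v)
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
        return sorted((u, v) for u in adj for v in adj[u] if u < v)

    edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
    print(k_truss(edges, 3))   # edge (3, 4) is pruned; the triangle core remains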
 
Keywords:
Accelerated Analytics, Algorithms and Numerical Techniques, GTC Silicon Valley 2018 - ID S8198
Streaming:
Download:
 
Blazing Fast SQL Analytics on Your Data Lake
Rodrigo Aramburu (BlazingDB), William Malpica (BlazingDB)
Extract analytical value out of your enterprise data lake with a state-of-the-art GPU SQL analytics engine. As businesses continue to consolidate massive datasets into data lake technologies (HDFS, AWS S3, Azure Blob, etc.), they find themselves unab ...Read More
Extract analytical value out of your enterprise data lake with a state-of-the-art GPU SQL analytics engine. As businesses continue to consolidate massive datasets into data lake technologies (HDFS, AWS S3, Azure Blob, etc.), they find themselves unable to fully leverage the value these lakes hold. Data engineering departments need to produce unique, costly ETL processes for every dataset and every tool which hopes to interact with said dataset. At BlazingDB we've built an analytics engine that runs SQL directly on open source file formats inside data lakes, currently BlazingDB's Simpatico and Apache Parquet. These file formats can be easily accessed from a variety of different tools, limit duplication of large volumes of data, and support improved data governance. Learn strong practices for ensuring your data lake doesn't turn into a swamp and how to extract the full value of your data lake investment.  Back
 
Keywords:
Accelerated Analytics, Telecom Industry Solutions, NVIDIA Inception Program, Finance, GTC Silicon Valley 2018 - ID S8484
Streaming:
Download:
 
Building an Enterprise Machine Learning Center of Excellence
Zachary Hanif (Capital One)
Algorithmic advancements and new research capabilities frequently overshadow the infrastructure that enables that research and serves it to customers in production applications. Having a solid infrastructure for real world machine learning often ends ...Read More
Algorithmic advancements and new research capabilities frequently overshadow the infrastructure that enables that research and serves it to customers in production applications. Having a solid infrastructure for real world machine learning often ends up being the biggest determinant of success and is an exciting area of research and engineering in its own right. These environments are what allow brilliant algorithms to deliver value at scale. We'll detail how Capital One has designed its GPU computing environment to accelerate machine learning efforts and outline the services used, the framework to leverage those services, and the engineering practices used to develop and deploy well-governed, accurate models to high-volume production environments. Beyond production deployments, we'll discuss how this infrastructure performs large-scale testing of models and frameworks to explore the interactions of deep learning tools like MXNet and TensorFlow. We'll also discuss the practices that enabled Capital One to hire a high-performing team in this incredibly desirable field.  Back
 
Keywords:
Accelerated Analytics, Tools and Libraries, Data Center and Cloud Infrastructure, Finance, GTC Silicon Valley 2018 - ID S8843
Streaming:
 
Accelerating Graph Algorithms for Government and Industry
Mahantesh Halappanavar (Pacific Northwest National Laboratory), Antonino Tumeo (Pacific Northwest National Laboratory)
We'll discuss our efforts regarding the acceleration of large-scale graph algorithms in the context of projects funded by various government agencies. Graph methods are key kernels for large-scale data analytics, as well as for several exascale appl ...Read More
We'll discuss our efforts regarding the acceleration of large-scale graph algorithms in the context of projects funded by various government agencies. Graph methods are key kernels for large-scale data analytics, as well as for several exascale application domains, including smart grids, computational biology, computational chemistry, and climate science. We'll present our latest results on distributed implementations employing GPUs and accelerators of graph kernels, such as community detection and B-matching, showing how we can tackle large-scale problems with heterogeneous supercomputers. On the basis of the experience and results in optimizing these algorithms for high performance computing platforms, we'll then discuss new requirements, upcoming opportunities, and potential solutions for next-generation, high-performance, integrated graph toolkits.  Back
 
Keywords:
Accelerated Analytics, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8476
Streaming:
 
GOAI One Year Later
Joshua Patterson (NVIDIA)
This talk will discuss the evolution of the GPU Open Analytics Initiative (GoAi) from its inception to today. GoAi, at its core, is a collection of libraries, frameworks, and APIs that lower the barrier of GPU adoption for data scientists. The goal o ...Read More
This talk will discuss the evolution of the GPU Open Analytics Initiative (GoAi) from its inception to today. GoAi, at its core, is a collection of libraries, frameworks, and APIs that lower the barrier of GPU adoption for data scientists. The goal of GoAi is to enable end-to-end data science workflows across many multi-GPU servers, to analyze and understand data more efficiently than ever before. To date, GoAi includes methods for performing SQL, machine learning, data processing or feature engineering, graph analytics, and graph visualization all on the GPU. This talk will discuss the who, what, when, where, and whys of GoAi, and its integration into the traditional big data world through leading open source projects like Apache Arrow and Apache Parquet. Finally, this talk will highlight major achievements of GoAi, our plans for the future, and how developers can become a part of this rapidly evolving ecosystem.  Back
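
A small example of the interchange layer this builds on: Apache Arrow's columnar tables, which round-trip to Parquet and give libraries a common description of column buffers (pyarrow shown; the column names are arbitrary):

    import pyarrow as pa
    import pyarrow.parquet as pq

    # A columnar table in Arrow's in-memory format: the shared layout that
    # lets different tools exchange data without per-tool ETL.
    table = pa.table({
        "user_id": pa.array([1, 2, 3], type=pa.int64()),
        "score": pa.array([0.9, 0.4, 0.7], type=pa.float64()),
    })
    pq.write_table(table, "scores.parquet")        # columnar on disk (Parquet)...
    print(pq.read_table("scores.parquet").schema)  # ...and columnar in memory (Arrow)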
 
Keywords:
Accelerated Analytics, Telecom Industry Solutions, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8502
Streaming:
Download:
 
Efficient Data Loading for Streaming Data Processing on GPUs
Vassili Gorshkov (FastData.IO)
We'll provide an overview of a GPU-based streaming processing engine and the specific challenges and opportunities that this presents. Unlike talks on database engines, we'll focus on data loading problems. We'll first cover hardware constraints a ...Read More
We'll provide an overview of a GPU-based streaming processing engine and the specific challenges and opportunities that this presents. Unlike talks on database engines, we'll focus on data loading problems. We'll first cover hardware constraints and then move on to the computational parts required for every data batch. This part includes converters to an internal columnar format and string dictionary construction. We'll also cover in detail the maintenance of the string dictionary for streaming data.  Back
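
The string dictionary maintenance described above can be sketched as incremental dictionary encoding: each unique string gets a stable integer id across batches, so batches become compact integer arrays. A minimal Python illustration, not the engine's implementation:

    # Dictionary-encode a string column while streaming batches; ids assigned
    # to strings in earlier batches stay valid for all later batches.
    class StringDictionary:
        def __init__(self):
            self.ids = {}        # string -> id
            self.strings = []    # id -> string

        def encode(self, batch):
            out = []
            for s in batch:
                if s not in self.ids:                 # new string: assign next id
                    self.ids[s] = len(self.strings)
                    self.strings.append(s)
                out.append(self.ids[s])
            return out

    d = StringDictionary()
    print(d.encode(["GET", "POST", "GET"]))   # [0, 1, 0]
    print(d.encode(["PUT", "GET"]))           # [2, 0] -- ids remain stable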
 
Keywords:
Accelerated Analytics, Performance Optimization, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8877
Streaming:
Download:
 
Capitalize on Next Generation In-Memory HPC with HPE Superdome Flex (Presented by Hewlett Packard Enterprise)
Bill Dunmire (HPE)
Learn how the breakthrough HPE Superdome Flex platform equips scientists, engineers, and business lines with in-memory computing at unparalleled scale to solve complex, data-intensive problems holistically, accelerate analytics, and coupled with Nvid ...Read More
Learn how the breakthrough HPE Superdome Flex platform equips scientists, engineers, and business lines with in-memory computing at unparalleled scale to solve complex, data-intensive problems holistically and accelerate analytics, and, coupled with NVIDIA GPU technology, leverage large-scale data visualization to speed time to discovery and innovation.  Back
 
Keywords:
Accelerated Analytics, Computational Fluid Dynamics, Computer Aided Engineering, HPC and AI, GTC Silicon Valley 2018 - ID S8973
Streaming:
Download:
 
Reinventing Real-Time Multidimensional Analytics Powered by GPU
Roman Raevsky (Polymatica)
We'll provide answers and business cases for three main questions: First, what were the main problems of big data analytics and how was it solved with GPUs? Second, how can you quickly analyze data and get maximum profit from it? Third, what is the ...Read More
We'll provide answers and business cases for three main questions: First, what were the main problems of big data analytics, and how were they solved with GPUs? Second, how can you quickly analyze data and get maximum profit from it? Third, what is the future of business intelligence (BI)? We'll discuss a new way of doing analytics: a unique BI solution powered by GPUs, which provides real-time multidimensional analytics for all kinds of businesses. The online analytical processing and data mining server, with its hybrid CPU and GPU architecture, gives users freedom of analytics with no pre-aggregates, and provides the fastest analytical tool for enterprise-sized raw data volumes. We'll show the results of the latest tests of the analytical platform's operations on different hardware, which prove the efficiency of working on GPUs. As one example of a user case, we'll show how companies around the world use this solution to analyze billions of raw business data records and to optimize and automate their business. We'll also show the future of BI: how analytical platforms will look in the near future, and how the world of big data will change.  Back
 
Keywords:
Accelerated Analytics, Predictive Analytics for Retail, NVIDIA Inception Program, Finance, GTC Silicon Valley 2018 - ID S8533
Streaming:
Download:
 
Speed at Scale: Using GPUs to Accelerate Analytics for Extreme Use Cases (Presented by MapD)
Todd Mostak (MapD)
It is common knowledge that GPUs can dramatically accelerate HPC and machine learning/AI workloads, but can they do the same for general purpose analytics? In this talk, Todd Mostak, CEO of MapD, will provide real-world examples of how a new generati ...Read More
It is common knowledge that GPUs can dramatically accelerate HPC and machine learning/AI workloads, but can they do the same for general purpose analytics? In this talk, Todd Mostak, CEO of MapD, will provide real-world examples of how a new generation of GPU-powered analytics platforms can enable enterprises from a range of verticals to dramatically accelerate the process of insight generation at scale. In particular, he will focus on how the key technical differentiators of GPUs: their massive computational bandwidth, fast memory, and native rendering pipeline, make them uniquely suited to allow analysts and data scientists to query, visualize and power machine learning over large, often high-velocity, datasets. Using the open source MapD analytics platform as an example, Todd will detail the technical approaches his team took to leverage the full parallelism of GPUs and demo how the platform allows analysts to interactively explore datasets containing tens of billions of records.  Back
 
Keywords:
Accelerated Analytics, NVIDIA Inception Program, GIS, GTC Silicon Valley 2018 - ID S81008
Streaming:
Download:
 
Eliminating Manual Data Labeling with AI-powered Data Curation (Presented by Pure Storage)
Ben Taylor (Ziff.ai), Emily Watkins (Pure Storage)
Learn from real-world case studies where large corpora of unstructured data were indexed and organized by deep-learning pipelines. Organizations are capturing and saving exponentially more unstructured data. As a tactic to organize this data, many te ...Read More
Learn from real-world case studies where large corpora of unstructured data were indexed and organized by deep-learning pipelines. Organizations are capturing and saving exponentially more unstructured data. As a tactic to organize this data, many teams turn to manual data classification, but that human-in-the-loop process can be cost prohibitive and introduce metadata inaccuracies. By applying deep learning and cluster-based labeling, we can index petabyte-scale datasets and rapidly organize unstructured data for downstream model building and analysis. This session will teach you how to quickly switch to training on all the contents of your data lake, rather than just a subset. We will use case studies with real-world datasets to walk through best practices for a deep learning indexing pipeline.  Back
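For intuition, here is a minimal sketch of cluster-based labeling in the spirit described above (a generic illustration, not the presenters' pipeline; the embeddings, cluster count, and labeler callable are placeholder assumptions):
```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_label(embeddings, n_clusters=50, labeler=None):
    """Label a large corpus by labeling one exemplar per cluster.

    embeddings: (N, D) array produced by any pretrained encoder.
    labeler:    callable returning a label for one exemplar index,
                e.g. a single human review per cluster.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(embeddings)
    labels = np.empty(len(embeddings), dtype=object)
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        # Exemplar = the member closest to the cluster centroid.
        d = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        exemplar = members[np.argmin(d)]
        # One manual (or model-assisted) label propagates to the whole cluster.
        labels[members] = labeler(exemplar) if labeler else f"cluster_{c}"
    return labels
```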
 
Keywords:
Accelerated Analytics, Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8962
Streaming:
Download:
 
Breaking the Speed of Interconnect with Compression for Database Applications
Felipe Aramburu (Blazing DB), Nikolay Sakharnykh (NVIDIA)
Learn strategies for efficiently employing various cascaded compression algorithms on the GPU. Many database input fields are amenable to compression since they have repeating or gradually increasing patterns, such as dates and quantities. Fast implem ...Read More
Learn strategies for efficiently employing various cascaded compression algorithms on the GPU. Many database input fields are amenable to compression since they have repeating or gradually increasing patterns, such as dates and quantities. Fast implementations of decompression algorithms such as RLE-Delta will be presented. By utilizing compression, we can achieve 10 times greater effective read bandwidth than the interconnect allows for raw data transfers. However, I/O bottlenecks still play a big role in the overall performance, and data has to be moved efficiently in and out of the GPU to ensure an optimal decompression rate. After a deep dive into the implementation, we'll show a real-world example of how BlazingDB leverages these compression strategies to accelerate database operations.  Back
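For intuition, here is a NumPy sketch of a cascaded RLE + delta codec of the kind named above (illustrative only; BlazingDB's actual formats and kernels are not described in this abstract). Both decompression steps map to data-parallel primitives: `np.repeat` is a scatter/gather and `np.cumsum` is a prefix scan, exactly what GPUs do well.
```python
import numpy as np

def rle_delta_compress(col):
    """Cascade: delta-encode the column, then run-length-encode the deltas."""
    deltas = np.diff(col, prepend=col[:1])          # first "delta" is col[0]
    change = np.flatnonzero(np.diff(deltas)) + 1    # run boundaries
    starts = np.concatenate(([0], change))
    runs = np.diff(np.concatenate((starts, [len(deltas)])))
    return deltas[starts], runs                     # (run values, run lengths)

def rle_delta_decompress(values, runs):
    """Expand runs, then undo the delta with a prefix scan."""
    return np.cumsum(np.repeat(values, runs))

dates = np.array([100, 101, 102, 103, 110, 110, 110])
vals, runs = rle_delta_compress(dates)
assert (rle_delta_decompress(vals, runs) == dates).all()
```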
 
Keywords:
Accelerated Analytics, NVIDIA Inception Program, Algorithms and Numerical Techniques, GTC Silicon Valley 2018 - ID S8417
Streaming:
 
Evaluation of Hybrid Cache-Coherent Concurrent Hash Table on POWER9 System with NVLink 2
Rajesh Bordawekar (IBM T. J. Watson Research Center), Pidad Gasfar D'Souza (IBM Systems Development Lab)
At the 2014 GTC, we described a novel concurrent cache-aware hash table that used a multi-level bounded linear probing hashing algorithm. This year we'll discuss how the design has expanded using a hybrid (CPU-GPU based) hash table where the data is ...Read More
At the 2014 GTC, we described a novel concurrent cache-aware hash table that used a multi-level bounded linear probing hashing algorithm. This year we'll discuss how the design has expanded using a hybrid (CPU-GPU based) hash table where the data is stored in host CPU memory and accessed via the GPU using unified memory constructs. The hash table is designed such that multiple CPU threads can update it concurrently and multiple GPU threads can fetch data from the hash table in a cache-coherent manner using NVLink 2.0. The hash table is implemented on a POWER9 system with NVLink 2.0-connected Tesla V100 GPUs. We'll present detailed performance measurements of throughput and virtual memory activities from CPU updates and GPU fetches. We'll also compare the performance of our design against a hybrid hash table built using the Cuckoo hashing approach.  Back
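To make the probing scheme concrete, here is a toy, single-threaded sketch of multi-level bounded linear probing over integer keys (structure and parameters are illustrative assumptions; the actual implementation is concurrent and NVLink-aware):
```python
import numpy as np

EMPTY = -1

class BoundedLinearProbingTable:
    """Each key probes at most `bound` slots per level, then falls through
    to the next (smaller) level, so the worst-case lookup cost is fixed --
    a property that keeps GPU fetch threads well-behaved."""

    def __init__(self, capacity=1024, levels=3, bound=8):
        self.keys = [np.full(max(capacity >> i, bound), EMPTY, dtype=np.int64)
                     for i in range(levels)]
        self.vals = [np.zeros_like(k) for k in self.keys]
        self.bound = bound

    def insert(self, key, val):           # integer keys and values assumed
        for tbl, vals in zip(self.keys, self.vals):
            h = hash(key) % len(tbl)
            for i in range(self.bound):
                slot = (h + i) % len(tbl)
                if tbl[slot] in (EMPTY, key):
                    tbl[slot], vals[slot] = key, val
                    return True
        return False                      # all levels full within the bound

    def lookup(self, key):
        for tbl, vals in zip(self.keys, self.vals):
            h = hash(key) % len(tbl)
            for i in range(self.bound):
                slot = (h + i) % len(tbl)
                if tbl[slot] == key:
                    return vals[slot]
                if tbl[slot] == EMPTY:
                    break                 # not at this level; try the next
        return None
```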
 
Keywords:
Accelerated Analytics, GTC Silicon Valley 2018 - ID S8172
Streaming:
 
Graph-Centric AI for Cybersecurity
Howie Huang (The George Washington University)
Large enterprise networks and computer systems face the daily challenge of cyberattacks, which originate from software and hardware vulnerabilities and result in data theft, service interruption, and monetary loss. To address this challenge, we've d ...Read More
Large enterprise networks and computer systems face the daily challenge of cyberattacks, which originate from software and hardware vulnerabilities and result in data theft, service interruption, and monetary loss. To address this challenge, we've developed a set of graph-based machine learning techniques for accelerating threat detection on GPUs. We'll present our research on graph-centric AI that can be used to discover malicious actions in time to prevent irreversible damage to the systems. In the era of big data, these techniques help us to have a deep understanding of critical relationships in computer systems, social networks, and IoT, which is essential in many industry segments, including defense, software, finance, e-commerce, and healthcare.  Back
 
Keywords:
Accelerated Analytics, Deep Learning and AI Frameworks, Cyber Security, GTC Silicon Valley 2018 - ID S8158
Streaming:
 
Improving the Brick and Mortar Retail Customer Experience with GPUs
Trung Tran (Clarcepto Inc)
There is a clear opportunity for retailers to generate loyalty and increase sales by focusing on the overall customer experience. We'll describe how we are developing solutions to track customer activity and build profiles based on physical store ac ...Read More
There is a clear opportunity for retailers to generate loyalty and increase sales by focusing on the overall customer experience. We'll describe how we are developing solutions to track customer activity and build profiles based on physical store activity to personalize the in-store shopping experience. We'll also describe how GPUs and deep learning are used to create these capabilities, all while protecting personal information and privacy.  Back
 
Keywords:
Accelerated Analytics, Intelligent Video Analytics and Smart Cities, Data Center and Cloud Infrastructure, Consumer Engagement and Personalization, Computer Vision, GTC Silicon Valley 2018 - ID S8144
Streaming:
 
Delivering an Extreme Data Analytics API for a Customer 360 View Using the Power of GPUs
Dipti Borkar (Kinetica)
We'll discuss how Kinetica's technology leverages the power of GPUs to deliver next-generation analytics. Also presented is a real-world use case of how the Lippo Group, one of the largest business conglomerates in Indonesia, was able to integrate ...Read More
We'll discuss how Kinetica's technology leverages the power of GPUs to deliver next-generation analytics. Also presented is a real-world use case of how the Lippo Group, one of the largest business conglomerates in Indonesia, was able to integrate data from multiple lines of business across several industries into a single big data analytics platform featuring an API layer with sub-second latency. We'll discuss how their "deep and fast analytics" approach is opening up new opportunities for improved customer engagement within the business ecosystem.  Back
 
Keywords:
Accelerated Analytics, NVIDIA Inception Program, Cyber Security, GTC Silicon Valley 2018 - ID S8905
Streaming:
 
Feeding the Big Data Engine: How to Import Data in Parallel
Brian Kennedy (Simantex)
Explore new techniques in transforming traditional sequential data import and validation routines into high-speed parallel algorithms. We'll explore some of the innovative approaches required to import and validate massive data files, using a CSV fo ...Read More
Explore new techniques in transforming traditional sequential data import and validation routines into high-speed parallel algorithms. We'll explore some of the innovative approaches required to import and validate massive data files, using a CSV format, in parallel. We'll discuss the challenges of designing a set of parallel algorithms to simultaneously import millions of rows of data, while honoring all of the capabilities of the CSV format, including varying column lengths between columns and across rows, quoted columns, embedded token separators, and malformed data rows. We'll also show how we support column character count validation with the ability to mix single- and multi-byte characters within a field, along with our approaches and special optimizations to allow the GPU to efficiently handle string processing. Finally, we'll review our performance gains compared to current sequential approaches, showing that we have increased throughput by over 18,000 percent on a single GPU card, and how this can be further scaled to support multiple GPUs.  Back
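One core trick in parallel CSV parsing is locating true row boundaries when newlines may appear inside quoted fields. A simplified two-pass sketch (each pass is a parallel scan or filter on a GPU; quoting rules are reduced to plain double quotes, an assumption):
```python
import numpy as np

def row_boundaries(buf: bytes):
    """Find record-separating newlines in a CSV chunk.

    Pass 1: a prefix sum of quote counts tells, at every byte, whether we
    are inside a quoted field (odd parity) -- a textbook parallel scan.
    Pass 2: keep only newlines outside quotes (a parallel filter).
    Assumes well-formed input with '"' as the only quote character.
    """
    a = np.frombuffer(buf, dtype=np.uint8)
    inside = np.cumsum(a == ord('"')) % 2          # quote-parity scan
    newlines = np.flatnonzero(a == ord('\n'))
    return newlines[inside[newlines] == 0]

chunk = b'id,note\n1,"a,b\nstill field"\n2,ok\n'
print(row_boundaries(chunk))   # offsets of the 3 real row ends
```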
 
Keywords:
Accelerated Analytics, Algorithms and Numerical Techniques, GTC Silicon Valley 2018 - ID S8443
Streaming:
Download:
 
Flexible and Fast Machine Learning and Deep Learning with Alluxio
Yupeng Fu (Alluxio), Michael Wendt (NVIDIA)
With the exponentially growing deluge of data today, data lakes are pooling everywhere. So, how can you harness them for critical insights, and is there an easy way to tap into the multitude of different storage systems that they're stored in? Enter ...Read More
With the exponentially growing deluge of data today, data lakes are pooling everywhere. So, how can you harness them for critical insights, and is there an easy way to tap into the multitude of different storage systems that they're stored in? Enter Alluxio, an agnostic and fast storage abstraction which, when paired with deep learning and GPU-accelerated analytics, yields a quick and easy way to harness the data. Join NVIDIA's Applied Solutions Engineering (ASE) team as they walk through how to use Alluxio for fun and profit.  Back
 
Keywords:
Accelerated Analytics, GTC Silicon Valley 2018 - ID S8569
Streaming:
 
GPU-Accelerated Semantic Similarity Search at Scale
Kubilay Atasu (IBM Research)
Learn how to compute a high-quality approximation of the state-of-the-art text-similarity measure "Word Mover's Distance" on massive datasets using a novel algorithm developed at IBM Research - Zurich. Our algorithm has linear time complex ...Read More
Learn how to compute a high-quality approximation of the state-of-the-art text-similarity measure "Word Mover's Distance" on massive datasets using a novel algorithm developed at IBM Research - Zurich. Our algorithm has linear time complexity, requires a limited amount of working memory, and maps well onto standard dense and sparse linear algebra routines. Therefore, it is very suitable for GPU acceleration! In addition, the algorithm is data parallel and exhibits perfect weak and strong scaling behavior when distributed across several GPUs. In practice, our algorithm renders the high-quality semantic-search results offered by Word Mover's Distance applicable to massive datasets. We'll also demonstrate applications of our algorithm in clustering, classification, and querying of news entries that are collected in real time from various data sources.  Back
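The flavor of a linear-time approximation can be seen in the well-known "relaxed" WMD lower bound, in which each word simply moves its mass to its nearest counterpart (a generic sketch; the abstract does not spell out IBM's exact formulation):
```python
import numpy as np

def relaxed_wmd(X, Y, wx, wy):
    """Lower bound on Word Mover's Distance between two documents.

    X, Y   : (n, d) and (m, d) word-embedding matrices.
    wx, wy : normalized word weights (e.g. tf-idf), each summing to 1.
    A cost matrix plus row/column minima = dense linear algebra, GPU-friendly.
    """
    # Pairwise Euclidean distances via the Gram-matrix identity.
    d2 = (X**2).sum(1)[:, None] - 2 * X @ Y.T + (Y**2).sum(1)[None, :]
    D = np.sqrt(np.maximum(d2, 0.0))
    # Each side sends all its mass to the closest word on the other side;
    # taking the max of the two one-sided relaxations tightens the bound.
    return max(wx @ D.min(axis=1), wy @ D.min(axis=0))
```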
 
Keywords:
Accelerated Analytics, Speech and Language Processing, GTC Silicon Valley 2018 - ID S8418
Streaming:
Download:
 
How to Get the Most out of GPU Accelerated Database Operators
Tim Kaldewey (NVIDIA), Jiri Kraus (NVIDIA), Nikolay Sakharnykh (NVIDIA)
Memory bandwidths more than an order of magnitude higher than those of conventional processors have long made GPUs an attractive platform for data-intensive applications. While there are many success stories about GPU-accelerated databases built from sc ...Read More
Memory bandwidths more than an order of magnitude higher than those of conventional processors have long made GPUs an attractive platform for data-intensive applications. While there are many success stories about GPU-accelerated databases built from scratch, GPU-accelerated operations for large-scale, general-purpose databases are the exception rather than the norm. We characterize fundamental database operators like scan, filter, join, and group-by based on their memory access patterns. From these characteristics, we derive their potential for GPU acceleration, such as upper bounds for performance on current and future architectures. Starting from basic GPU implementations, we deep dive into aspects like optimizing data transfers, access, and layout.  Back
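The kind of performance upper bound mentioned is easy to compute for memory-bound operators: runtime can be no less than bytes moved divided by memory bandwidth (the numbers below are placeholders, not measured results):
```python
def bandwidth_bound_ms(rows, bytes_per_row, bw_gb_s):
    """Upper performance bound for a streaming operator such as scan/filter:
    it cannot finish faster than it can read its input."""
    return rows * bytes_per_row / (bw_gb_s * 1e9) * 1e3

# Example: filtering 1 billion 8-byte keys at roughly 900 GB/s of HBM2
print(f"{bandwidth_bound_ms(1_000_000_000, 8, 900):.2f} ms lower bound")
```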
 
Keywords:
Accelerated Analytics, Performance Optimization, GTC Silicon Valley 2018 - ID S8289
Streaming:
Download:
 
BigQuery and TensorFlow: Data Warehouse + Machine Learning Enables the "Smart" Query
Kaz Sato (Google)
BigQuery is Google's fully managed, petabyte-scale data warehouse. Its user-defined functions realize "smart" queries with the power of machine learning, such as similarity search or recommendation on images or documents with feature vec ...Read More
BigQuery is Google's fully managed, petabyte-scale data warehouse. Its user-defined functions realize "smart" queries with the power of machine learning, such as similarity search or recommendation on images or documents with feature vectors and neural network prediction. We'll see how TensorFlow and its GPU-accelerated training environment enable a powerful "data warehouse + machine learning" solution.  Back
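The "smart query" pattern reduces to scoring rows by feature-vector similarity inside the query path; a minimal NumPy stand-in for such a function (illustrative, not BigQuery UDF syntax):
```python
import numpy as np

def similarity_udf(query_vec, table_vecs, top_k=5):
    """Rank stored rows by cosine similarity to a query embedding --
    the core of a similarity-search 'smart query'."""
    q = query_vec / np.linalg.norm(query_vec)
    T = table_vecs / np.linalg.norm(table_vecs, axis=1, keepdims=True)
    scores = T @ q
    idx = np.argsort(-scores)[:top_k]
    return idx, scores[idx]
```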
 
Keywords:
Accelerated Analytics, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8115
Streaming:
 
HornetsNest - Scalable Static and Dynamic Graph Algorithms Made Easy
Oded Green (Georgia Institute of Technology)
We'll present HornetsNest, a framework for developing static and dynamic graph algorithms with relative ease. Through a small subset of graph primitives, which are the API for our framework, it is possible to implement parallel graph algorithms usin ...Read More
We'll present HornetsNest, a framework for developing static and dynamic graph algorithms with relative ease. Through a small subset of graph primitives, which are the API for our framework, it is possible to implement parallel graph algorithms using a fairly small number of code lines. These graph primitives are optimized in the backend, so programmers can focus on algorithm design rather than load balancing, system utilization, and optimizations. Using these primitives, it's possible to implement BFS in roughly 10 lines of code. Performance-wise, this BFS performs as well as its counterpart in the Gunrock library. More importantly, HornetsNest is the first framework to support a wide range of high-performing dynamic graph analytics, including new algorithms for dynamic triangle counting, dynamic PageRank, and dynamic Katz centrality. Finally, we'll cover the performance of numerous graph algorithms.  Back
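As a flavor of how frontier primitives keep BFS short, here is a CSR-based sketch whose core loop is indeed about ten lines (plain NumPy; HornetsNest's actual primitive API is not shown in the abstract):
```python
import numpy as np

def bfs(row_ptr, col_idx, src):
    """Level-synchronous BFS over a CSR graph. Each iteration is an
    'advance' over the frontier's edges plus a 'filter' of already
    visited vertices -- the two primitives a graph framework optimizes."""
    dist = np.full(len(row_ptr) - 1, -1, dtype=np.int64)
    dist[src] = 0
    frontier = np.array([src])
    level = 0
    while frontier.size:
        # advance: gather all neighbors of the current frontier
        nbrs = np.concatenate([col_idx[row_ptr[v]:row_ptr[v + 1]] for v in frontier])
        # filter: keep unvisited vertices, deduplicated
        frontier = np.unique(nbrs[dist[nbrs] < 0])
        level += 1
        dist[frontier] = level
    return dist
```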
 
Keywords:
Accelerated Analytics, Algorithms and Numerical Techniques, HPC and AI, GTC Silicon Valley 2018 - ID S8297
Streaming:
Download:
 
Building a GPU-Focused CI Solution
Michael Wendt (NVIDIA)
As the number of GPU-accelerated applications has multiplied, the need for better development tools and services has increased as well. Chief among such services is continuous integration (CI), which dramatically improves and speeds up the develop ...Read More
As the number of GPU-accelerated applications has multiplied, the need for better development tools and services has increased as well. Chief among such services is continuous integration (CI), which dramatically improves and speeds up the development life cycle through automated builds and integration testing. CI for GPU-accelerated applications comes with its own set of challenges, but the rewards can be enormous. We'll walk through how we implemented CI for the NVIDIA GPU Cloud by leaning on open source solutions such as Jenkins, discuss the lessons we learned in the process, and demonstrate how other such systems should be built in the future.  Back
 
Keywords:
Accelerated Analytics, Tools and Libraries, GTC Silicon Valley 2018 - ID S8563
Download:
 
Hornet: An Efficient Data Structure for Dynamic Sparse Graphs and Matrices
Oded Green (Georgia Institute of Technology)
We'll present Hornet, formerly known as cuSTINGER, a data structure designed for sparse dynamic graphs and matrices. Hornet scales to massive datasets while supporting very fast updates, over 200 million updates per second on a single Tesla P100 GPU ...Read More
We'll present Hornet, formerly known as cuSTINGER, a data structure designed for sparse dynamic graphs and matrices. Hornet scales to massive datasets while supporting very fast updates, over 200 million updates per second on a single Tesla P100 GPU. We'll show that replacing CSR, a popular data structure for sparse data, with Hornet does not change the execution time. We'll also show that the memory utilization of Hornet is within that of CSR and COO, and briefly show performance results of several analytics using Hornet. We'll cover the programming model for Hornet in a separate talk.  Back
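The motivation for such a data structure is visible in what a single edge insertion costs under CSR (standard CSR definitions; Hornet's internal block layout is not reproduced here):
```python
import numpy as np

def csr_insert_edge(row_ptr, col_idx, u, v):
    """Inserting one edge into CSR is O(E): the column array is
    reallocated and shifted, and every row pointer after u is rewritten.
    Dynamic-graph structures like Hornet avoid exactly this rebuild."""
    pos = row_ptr[u + 1]
    col_idx = np.insert(col_idx, pos, v)      # O(E) copy of the edge list
    row_ptr = row_ptr.copy()
    row_ptr[u + 1:] += 1                      # O(V) pointer fix-up
    return row_ptr, col_idx
```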
 
Keywords:
Accelerated Analytics, Tools and Libraries, HPC and AI, GTC Silicon Valley 2018 - ID S8177
Streaming:
Download:
 
Extending Splunk with GPUs
Keith Kraus (NVIDIA), Joshua Patterson (NVIDIA)
As cybersecurity data volumes grow, even the best-designed SIEMs struggle to perform complex analytics on a large range of data at interactive speeds. We'll discuss how NVIDIA GPU-accelerated its own Splunk instance with technologies that are pa ...Read More
As cybersecurity data volumes grow, even the best-designed SIEMs struggle to perform complex analytics on a large range of data at interactive speeds. We'll discuss how NVIDIA GPU-accelerated its own Splunk instance with technologies that are part of the GPU Open Analytics Initiative (GOAI) to drastically improve cyberhunting. Using tools such as Anaconda, BlazingDB, Graphistry, and MapD, NVIDIA interactively explored billions of events faster than ever to detect threats and perform root cause analysis. We'll walk through how cyberdefenders can use open source tools and libraries to accelerate their own Splunk instances, with code samples and how-tos. Finally, we'll discuss how to stay involved in the GPU-accelerated Splunk community.  Back
 
Keywords:
Accelerated Analytics, Telecom Industry Solutions, Cyber Security, GTC Silicon Valley 2018 - ID S8499
Streaming:
Download:
Additive Manufacturing
Presentation
Media
Realizing the Future of Making Things with Generative Design
Brian Frank (Autodesk)
Autodesk Generative Design harnesses the compute power of the NVIDIA GPU to deliver a full Design-to-Make workflow for today's product designers and engineers. Learn how the future of computing will enable better performing designs to be created wit ...Read More
Autodesk Generative Design harnesses the compute power of the NVIDIA GPU to deliver a full Design-to-Make workflow for today's product designers and engineers. Learn how the future of computing will enable better performing designs to be created with less time and effort than traditional engineering approaches. Autodesk Generative Design allows the user to fully explore possible design spaces, incorporating materials and manufacturing methods into the creation of design solutions.  Back
 
Keywords:
Additive Manufacturing, GTC Silicon Valley 2018 - ID S8600
Streaming:
 
Deep Learning at the Edge (Presented by HP Inc.)
Bruce Blaho (HP Inc.)
Come see how to do deep learning development on your desktop, with examples from 3D printing and geoscience. In this session, you will see how new powerful workstations are being used to create advanced deep learning solutions at the edge of the ...Read More
Come see how to do deep learning development on your desktop, with examples from 3D printing and geoscience. In this session, you will see how new powerful workstations are being used to create advanced deep learning solutions at the edge of the network, and why this is a strong complement to cloud-only approaches. We'll share detailed, concrete examples from expert speakers: CGG is a geoscience company leading the use of deep learning to interpret terabyte-sized seismic datasets to find important underground features such as oil and gas deposits. HP's Jet Fusion 3D Printers use complex thermal control systems to optimize part printing and material properties. We will explore the deep learning-based algorithms being developed for HP's next generation of 3D printers. Speakers: Bruce Blaho, Fellow & Workstations Chief Technologist, HP Inc.; Steve Dominguez, Team Lead Seismic Interpretation Software, CGG; Dr. Jun Zeng, Principal Investigator, HP Labs Print & 3D, HP Inc.; Dr. He Luan, Research Scientist, HP Labs Print & 3D, HP Inc.  Back
 
Keywords:
Additive Manufacturing, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S81044
Streaming:
Download:
Advanced AI Learning Techniques (incl. GANs and NTMs)
Presentation
Media
Discover Orders in Unordered Datasets: Generative Markov Networks
Yao-Hung Tsai (Carnegie Mellon University)
In this work, we argue that for any dataset, even one without explicit orders, there exist implicit orders and relationships among the data. Aiming to find these orders and relationships, we introduce novel generative Markov networks (GMNs) that consider a ...Read More
In this work, we argue that for any dataset, even one without explicit orders, there exist implicit orders and relationships among the data. Aiming to find these orders and relationships, we introduce novel generative Markov networks (GMNs) that consider a Markov chain data generation process. To make learning the transition operator tractable and flexible, we utilize neural networks as smooth function approximators. Moreover, we propose a batch-wise permutation training regime to ensure an ergodic training process for the Markov chain. We'll show that GMNs are able to discover orders and relationships in datasets, and can also perform well on a benchmark one-shot recognition task.  Back
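A minimal sketch of the central idea, a neural transition operator for a data-generating Markov chain (PyTorch, toy dimensions; the paper's architecture and the batch-wise permutation regime are only summarized above):
```python
import torch
import torch.nn as nn

class TransitionOperator(nn.Module):
    """Smooth approximator of the chain's transition: maps one sample
    toward the next sample in the discovered ordering."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.net(x)

def train_step(op, opt, x_t, x_next):
    # Fit the operator so chained applications walk through the dataset,
    # revealing an implicit order among unordered samples.
    loss = ((op(x_t) - x_next) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```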
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), GTC Silicon Valley 2018 - ID S8577
Streaming:
Download:
 
Deep Learning of Severe Weather Forecast Data
David Gagne (National Center for Atmospheric Research)
Attendees will learn how deep learning models identify severe weather hazards, how deep learning severe weather diagnosis compares with other machine learning methods, and what weather features deep learning considers most important for determining w ...Read More
Attendees will learn how deep learning models identify severe weather hazards, how deep learning severe weather diagnosis compares with other machine learning methods, and what weather features deep learning considers most important for determining whether a storm will produce severe weather or not. Severe weather hazards, such as tornadoes, hail, high winds, and flash floods, cause billions of dollars in property damage and injure or kill hundreds of people in the U.S. each year. Improved forecasts of the potential for severe weather enables decision makers to take actions to save lives and property. Machine learning and deep learning models extract spatial information from observations and numerical weather prediction model output to predict the probability of severe weather based on whether or not some form of severe weather was reported by the public. Convolutional neural networks and generative adversarial networks are compared against principal component analysis encodings to determine how much skill deep learning adds over traditional methods. The deep learning models are interrogated to identify important variables and spatial features for severe weather prediction.  Back
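For concreteness, a compact convolutional classifier of the general type used on storm-centered model-output patches might look like this (a generic PyTorch sketch; channel counts and grid size are placeholder assumptions, not NCAR's configuration):
```python
import torch.nn as nn

class StormCNN(nn.Module):
    """Maps a multi-variable storm-centered grid (e.g. reflectivity,
    winds, CAPE as channels) to a probability of a severe report."""
    def __init__(self, in_channels=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.head = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x):            # x: (batch, channels, 32, 32) patches
        return self.head(self.features(x))
```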
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), Climate, Weather, Ocean Modeling, HPC and AI, GTC Silicon Valley 2018 - ID S8455
Streaming:
 
Disrupting Logistics and Optimization with AI
Karim Beguir (InstaDeep)
In this talk, you will get a detailed yet accessible look at how AI is disrupting logistics. Firms have for years been using classical optimization algorithms to make decisions such as how to deliver goods to multiple clients in a city, place package ...Read More
In this talk, you will get a detailed yet accessible look at how AI is disrupting logistics. Firms have for years been using classical optimization algorithms to make decisions such as how to deliver goods to multiple clients in a city, place packages in a warehouse, or route orders. Such algorithms are often built on heuristics that experts have designed to get reasonable solutions quickly. Recent advances in deep learning and reinforcement learning are, however, making it possible to build AI systems that tackle these optimization problems from scratch. Through constant learning, a modern AI system can match and even beat existing optimization algorithms, or deliver faster solutions thanks to GPU parallel processing. Companies can now leverage these advances into significant efficiency gains for their operations.  Back
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), Performance Optimization, NVIDIA Inception Program, Inventory, GTC Silicon Valley 2018 - ID S8432
Streaming:
 
The Latest of Project Apollo and Centralized in-car Computing Platform for Autonomous Driving
Xing Yuan (Baidu)
Apollo Computing Unit (ACU), a mass production-oriented autonomous driving computing platform launched by Baidu, mainly features the Apollo Pilot system and the Intelligent Map service. As an important part of the Apollo platform, ACU is launched for ma ...Read More

Apollo Computing Unit (ACU), a mass production-oriented autonomous driving computing platform launched by Baidu, mainly features the Apollo Pilot system and the Intelligent Map service. As an important part of the Apollo platform, ACU is launched for mass production by Baidu's partners. Based on the different computing capabilities required by different scenarios, it is divided into three series of products: ACU-Basic, ACU-Advanced, and ACU-Professional.

  Back
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), Autonomous Vehicles, Algorithms and Numerical Techniques, Autonomous Driving, GTC Silicon Valley 2018 - ID S8902
Streaming:
Download:
 
Recurrent Generative Adversarial Neural Networks for Compressive Imaging
Morteza Mardani (Stanford University)
We'll present recurrent generative adversarial networks (GANs) for image recovery from compressed measurements, which has applications ranging from undersampled medical image reconstruction to image super-resolution. State-of-the-art methods are n ...Read More
We'll present recurrent generative adversarial networks (GANs) for image recovery from compressed measurements, which has applications ranging from undersampled medical image reconstruction to image super-resolution. State-of-the-art methods are not aware of image perceptual quality, and demand iterative algorithms that incur significant computational overhead for real-time tasks. To sidestep these hurdles, we introduce a novel compressive imaging framework using deep neural networks that approximates a low-dimensional manifold of images using GANs. To ensure the images are consistent with the measurements, a recurrent GAN architecture is deployed that consists of multiple alternating blocks of generator networks and affine projection, followed by a discriminator network to score the perceptual quality of the generated images. A deep residual network with skip connections is used for the generator, while the discriminator is a multilayer perceptron. Experiments performed with real-world contrast-enhanced MRI data corroborate the superior diagnostic quality and faster reconstruction of the retrieved images relative to state-of-the-art schemes.  Back
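The alternation of generator blocks with affine projection can be sketched directly: after each learned refinement, project back onto the set of images consistent with the measurements y = Ax (NumPy; the `generator` is a stub standing in for the trained residual network):
```python
import numpy as np

def affine_projection(x, A, y, AAt_inv):
    """Project x onto {x : A x = y} -- the data-consistency step."""
    return x + A.T @ (AAt_inv @ (y - A @ x))

def recurrent_recovery(y, A, generator, n_blocks=5):
    """Alternate learned refinement with measurement consistency,
    mirroring the recurrent architecture described above."""
    AAt_inv = np.linalg.inv(A @ A.T)
    x = A.T @ (AAt_inv @ y)            # least-norm initial estimate
    for _ in range(n_blocks):
        x = generator(x)               # placeholder for the trained generator
        x = affine_projection(x, A, y, AAt_inv)
    return x
```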
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), Computer Vision, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8197
Streaming:
Download:
 
Application of Generative Deep Neural Networks for Mass Customization of Patient-Specific Products
Sergei Azernikov (Glidewell Dental), Jyh-Jing Hwang (UC Berkeley)
We'll show how generative adversarial networks (GANs) running on GPUs are about to revolutionize mass customization of patient-specific products at Glidewell Dental. Every day, our labs produce thousands of patient-specific items, such as dental res ...Read More
We'll show how generative adversarial networks (GANs) running on GPUs are about to revolutionize mass customization of patient-specific products at Glidewell Dental. Every day, our labs produce thousands of patient-specific items, such as dental restorations, implants, and appliances. To deliver functional and aesthetic products, high levels of precision and consistency are essential. Traditionally, the dental restoration design and manufacturing process was very labor intensive and required highly skilled dental professionals. Today, with the advances in CAD/CAM, the amount of manual labor has been significantly reduced; however, there are still many aspects of the process that require human intervention, because some of these aspects are hard to formalize and therefore impossible to automate with traditional tools. The convergence of several technologies, such as deep learning, GPGPU, and cloud computing, has allowed us to effectively train generative models on historical data. These models are now capable of automatically generating high-quality patient-specific designs.  Back
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), Consumer Engagement and Personalization, GTC Silicon Valley 2018 - ID S8155
Streaming:
Download:
 
Learning to Learn, Deep Learning for Robotics, Deep Reinforcement Learning, AI for Manufacturing and Logistics
Pieter Abbeel (UC Berkeley / OpenAI / Gradescope)
We'll introduce the latest advances on topics such as learning-to-learn, meta-learning, deep learning for robotics, deep reinforcement learning, and AI for manufacturing and logistics. ...Read More

We'll introduce the latest advances on topics such as learning-to-learn, meta-learning, deep learning for robotics, deep reinforcement learning, and AI for manufacturing and logistics.

  Back
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), IoT, Robotics & Drones, Autonomous Machines, Robotics & Autonomous Machines, GTC Silicon Valley 2018 - ID S8118
Streaming:
Download:
 
Learning with Opponent-Learning Awareness
Jakob Foerster (University of Oxford)
We'll discuss deep reinforcement learning in multi-agent settings, focusing on learning with opponent-learning awareness, a novel multi-agent reinforcement learning method that allows one agent to consider the learning dynamics of another agent. You ...Read More
We'll discuss deep reinforcement learning in multi-agent settings, focusing on learning with opponent-learning awareness, a novel multi-agent reinforcement learning method that allows one agent to consider the learning dynamics of another agent. You'll learn that this not only stabilizes learning in multi-agent settings, but also leads to the emergence of cooperation. A key question relevant to autonomous cars is how to maintain cooperation between self-interested learning agents in a multi-agent setting.  Back
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), Algorithms and Numerical Techniques, GTC Silicon Valley 2018 - ID S8685
Streaming:
Download:
 
Debug and Approve your Deep Networks by Overcoming the Black Box Problem
Tsvi Achler (Optimizing Mind), Peter Feghali (Optimizing Mind)
Networks may learn to perform tasks by cheating in unknown and unexpected ways, which may be a liability for the developer. Feedforward networks are the basis of artificial neural networks such as deep, convolutional, and recurrent networks, and even ...Read More
Networks may learn to perform tasks by cheating in unknown and unexpected ways, which may be a liability for the developer. Feedforward networks are the basis of artificial neural networks such as deep, convolutional, and recurrent networks, and even simpler regression methods. However, the internal decision processes of feedforward networks are difficult to explain: they are known to be a "black box". This is especially problematic in applications where the consequences of an error can be severe, such as in medicine, banking, or self-driving cars. Optimizing Mind has developed a new type of feedback neural network, motivated by neuroscience, that allows easier understanding of the internal decision process. Developers, regulators, and users can better understand their AI and reduce unexpected surprises and liability by having feedforward networks converted to our Illuminated form to explain their internal decision processes. We'll demonstrate some of these benefits.  Back
 
Keywords:
Advanced AI Learning Techniques (incl. GANs and NTMs), NVIDIA Inception Program, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8554
Streaming:
Download:
Algorithms and Numerical Techniques
Presentation
Media
Capture Sparsity in DL Applications
Michael Frumkin (NVIDIA)
We'll present a new technique for improving the efficiency of inference and training in deep learning in the presence of sparse workloads. We'll start with a brief overview of applications of sparse linear algebra in engineering and data analysis. Then ...Read More
We'll present a new technique for improving the efficiency of inference and training in deep learning in the presence of sparse workloads. We'll start with a brief overview of applications of sparse linear algebra in engineering and data analysis. Then, we'll analyze the presence of sparsity in both the training and inference phases of deep learning. To exploit this sparsity, we present our method of improving the memory locality of sparse applications. We'll establish lower and upper bounds for sparse matrix operations and the crossover point with dense matrix operations. We'll demonstrate how to minimize memory traffic by tiling matrix operations and making efficient use of L2, L1, and SMEM. We'll conclude with a performance comparison of our method with existing techniques on real pruned weight matrices from GoogLeNet and OpenNMT's multiway translation network. This is joint work by Michael Frumkin, Jeff Pool, and Lung Sheng Chien.  Back
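The payoff of exploiting pruned weights shows up even at the library level: store a mostly-zero weight matrix in CSR and touch only the nonzeros (a SciPy sketch with arbitrary sizes and density; the talk's tiling and SMEM techniques go well beyond this):
```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024))
W[rng.random(W.shape) < 0.9] = 0.0        # prune 90% of the weights

W_csr = sparse.csr_matrix(W)              # store only the nonzeros
x = rng.standard_normal((1024, 64))       # a batch of activations

dense_out = W @ x                         # reads all ~1M weights
sparse_out = W_csr @ x                    # reads only ~100K weights
assert np.allclose(dense_out, sparse_out)
```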
 
Keywords:
Algorithms and Numerical Techniques, Performance Optimization, HPC and AI, GTC Silicon Valley 2018 - ID S8458
Streaming:
 
On Porting Scalable Parallel CFD Application HiFUN on NVIDIA GPU
Nikhil Shende (S & I Engineering Solutions Pvt. Ltd.)
The present study deals with porting the scalable parallel CFD application HiFUN to NVIDIA GPUs using an off-load strategy. The strategy focuses on improving single-node performance of the HiFUN solver with the help of G ...Read More
The present study deals with porting the scalable parallel CFD application HiFUN to NVIDIA GPUs using an off-load strategy. The strategy focuses on improving single-node performance of the HiFUN solver with the help of GPUs. This work clearly brings out the efficacy of the off-load strategy using OpenACC directives on GPUs, and may be considered one of the attractive models for porting legacy CFD codes to GPU-based HPC and supercomputing platforms.  Back
 
Keywords:
Algorithms and Numerical Techniques, Computational Fluid Dynamics, Computer Aided Engineering, GTC Silicon Valley 2018 - ID S8799
Streaming:
Download:
 
Automatic Generation of 1D Recursive Filter Code for GPUs
Martin Burtscher (Texas State University)
Learn how to automatically generate 1D recursive filter code for GPUs using PLR, a domain-specific compiler. It only requires the filter coefficients as input and emits high-performance CUDA code. Later result values depend on earlier result values i ...Read More
Learn how to automatically generate 1D recursive filter code for GPUs using PLR, a domain-specific compiler. It only requires the filter coefficients as input and emits high-performance CUDA code. Later result values depend on earlier result values in digital filters, making it a challenge to compute them in parallel. We'll present the new work- and space-efficient algorithm PLR uses to implement digital filters and other linear recurrences, and explain how it automatically parallelizes and optimizes the GPU code. Our evaluation shows that, for single-stage IIR filters, the generated code reaches the throughput of memory copy for large inputs, which cannot be surpassed. On other digital filters, the automatically parallelized code outperforms the fastest prior implementations.  Back
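The parallelization trick for a first-order recurrence y[i] = a*y[i-1] + b*x[i] is that, by linearity, each block can be computed independently with zero carry-in and the true carry injected afterward as a**(k+1) * c at offset k. A NumPy sketch of that decomposition (PLR's generated code is far more sophisticated):
```python
import numpy as np

def filter_blocked(a, b, x, block=4):
    """First-order IIR evaluated block by block: the inner per-block work
    is independent (parallel on a GPU); only the short carry chain between
    blocks is sequential."""
    n = len(x)
    y = np.empty(n)
    powers = a ** np.arange(1, block + 1)
    carry = 0.0
    for s in range(0, n, block):
        xb = x[s:s + block]
        yb = np.empty(len(xb))
        acc = 0.0
        for k, v in enumerate(xb):         # local recurrence, zero carry-in
            acc = a * acc + b * v
            yb[k] = acc
        yb += powers[:len(xb)] * carry     # inject carry from previous block
        carry = yb[-1]
        y[s:s + block] = yb
    return y

# cross-check against the sequential definition
a, b = 0.8, 0.5
x = np.random.default_rng(1).standard_normal(10)
ref = np.empty_like(x); acc = 0.0
for i, v in enumerate(x):
    acc = a * acc + b * v; ref[i] = acc
assert np.allclose(filter_blocked(a, b, x, block=3), ref)
```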
 
Keywords:
Algorithms and Numerical Techniques, Performance Optimization, GTC Silicon Valley 2018 - ID S8189
Streaming:
Download:
 
World's Fastest Machine Learning With GPUs
Jonathan McKinney (H2O.ai), Rory Mitchell (H2O.ai)
Deep learning algorithms have benefited greatly from the recent performance gains of GPUs. However, it has been unclear whether GPUs can speed up machine learning algorithms such as generalized linear modeling, random forests, gradient boosting ...Read More

Deep learning algorithms have benefited greatly from the recent performance gains of GPUs. However, it has been unclear whether GPUs can speed up machine learning algorithms such as generalized linear modeling, random forests, gradient boosting machines, and clustering. H2O.ai, the leading open source AI company, is bringing the best-of-breed data science and machine learning algorithms to GPUs. We introduce H2O4GPU, a fully featured machine learning library that is optimized for GPUs, with a robust Python API that is a drop-in replacement for scikit-learn. We'll demonstrate benchmarks for the most common algorithms relevant to enterprise AI and showcase performance gains compared to running on CPUs.

  Back
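Given the advertised scikit-learn-compatible API, usage is meant to look like an import swap; a sketch based on that stated design (class and argument names are assumed to mirror scikit-learn and should be checked against the library's docs):
```python
# import sklearn.cluster                  # CPU baseline
import h2o4gpu                            # GPU drop-in (assumed per the talk)
import numpy as np

X = np.random.default_rng(0).standard_normal((100_000, 16)).astype(np.float32)
model = h2o4gpu.KMeans(n_clusters=8).fit(X)   # same call shape as sklearn
print(model.cluster_centers_.shape)
```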
 
Keywords:
Algorithms and Numerical Techniques, NVIDIA Inception Program, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8523
Streaming:
Download:
 
Autotuning Dense Batched QR Factorizations on GPU
Wissam M. Sid-Lakhdar (Lawrence Berkeley National Laboratory)
The increasing complexity and heterogeneity of computer architectures make it challenging to design both efficient and portable code. Indeed, generic GPU kernels that attempt to fit all GPU architectures would not be efficient on any give ...Read More
The increasing complexity and heterogeneity of computer architectures make it challenging to design both efficient and portable code. Indeed, generic GPU kernels that attempt to fit all GPU architectures would not be efficient on any given architecture. Moreover, the careful and customized design of a GPU kernel for a specific GPU will hardly be efficient on the next generation of GPUs. Furthermore, writing tailored kernels for every GPU is a daunting task that would require too much time and effort. We'll present our work on applying the autotuning idea to this issue for batched QR factorization kernels on GPUs by automatically generating code specific to a given GPU.  Back
 
Keywords:
Algorithms and Numerical Techniques, Tools and Libraries, Performance Optimization, GTC Silicon Valley 2018 - ID S8850
Streaming:
Download:
 
CUTLASS: Software Primitives for Dense Linear Algebra at All Levels and Scales within CUDA
Andrew Kerr (NVIDIA)
Audience members will learn how to implement efficient deep learning computations using CUDA C++ in the context of CUTLASS. CUTLASS is an open-source collection of C++ template abstractions for implementing high-performance matrix multiplication (GE ...Read More
Audience members will learn how to implement efficient deep learning computations using CUDA C++ in the context of CUTLASS. CUTLASS is an open-source collection of C++ template abstractions for implementing high-performance matrix multiplication (GEMM) at all levels of the CUDA thread hierarchy. We will describe many of the algorithmic strategies used by cuBLAS and cuDNN, and how they can be implemented using C++ templates to cover an extensive space of problem sizes, data layouts, and data types. In particular, we will emphasize how to support alternative and mixed-precision math operations such as Pascal's integer DP4A operation and Volta's Tensor Cores. Finally, we will illustrate how CUTLASS primitives can be combined with custom functionality to implement related algorithms such as convolution. Although this talk highlights CUTLASS, the architecture concepts and algorithm details are relevant to any CUDA programmer focused on deep learning.  Back
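The hierarchical tiling at the heart of such GEMM implementations can be shown in miniature: decompose C = A x B into output tiles that accumulate over k-tiles, so each tile's working set fits in fast memory (a NumPy illustration of the decomposition, not CUTLASS code):
```python
import numpy as np

def tiled_gemm(A, B, tile=64):
    """Blocked matrix multiply: each (i, j) output tile accumulates over
    k-tiles, mirroring the threadblock/warp tiling that GEMM templates
    instantiate at each level of the CUDA thread hierarchy."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, tile):
        for j in range(0, N, tile):
            acc = np.zeros((min(tile, M - i), min(tile, N - j)), dtype=A.dtype)
            for k in range(0, K, tile):
                acc += A[i:i + tile, k:k + tile] @ B[k:k + tile, j:j + tile]
            C[i:i + tile, j:j + tile] = acc
    return C
```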
 
Keywords:
Algorithms and Numerical Techniques, Tools and Libraries, GTC Silicon Valley 2018 - ID S8854
Streaming:
 
GPU Acceleration of Direct Sparse Matrix Solver for ANSYS Electronics
Zhen Wang (ANSYS)
A GPU-accelerated direct sparse matrix solver has been in use at ANSYS since 2016. It achieves high performance on CPUs and GPUs for a wide range of electromagnetic problems, in comparison with state-of-the-art commercial and open-source software. We ...Read More
A GPU-accelerated direct sparse matrix solver has been in use at ANSYS since 2016. It achieves high performance on CPUs and GPUs for a wide range of electromagnetic problems, in comparison with state-of-the-art commercial and open-source software. We'll review the current GPU acceleration technique, and describe our recent improvements to the GPU-enabled matrix solver technique, observing up to 1.5x speedup over the existing GPU algorithm. This innovation enables GPU acceleration of matrix computations that would not have benefited from GPUs before.  Back
 
Keywords:
Algorithms and Numerical Techniques, HPC and AI, GTC Silicon Valley 2018 - ID S8161
Streaming:
 
Parallel Hashing on Multi-GPU Nodes
Christian Hundt (Johannes Gutenberg University Mainz), Bertil Schmidt (Johannes Gutenberg University Mainz)
We'll discuss WarpDrive, a high-speed, scalable, multi-GPU implementation for hashing billions of key-value pairs. Hash maps are among the most versatile data structures because of their compact data layout and expected constant time complexity for ...Read More
We'll discuss WarpDrive, a high-speed, scalable, multi-GPU implementation for hashing billions of key-value pairs. Hash maps are among the most versatile data structures because of their compact data layout and expected constant time complexity for insertion and querying. CUDA-enabled GPUs can speed up hashing by virtue of their fast video memory, featuring almost one terabyte per second of bandwidth in comparison to state-of-the-art CPUs. However, the size of hash maps supported by single-GPU hashing implementations is restricted by the limited amount of available video RAM. We propose a novel subwarp/coalesced-group-based probing scheme featuring coalesced memory access over consecutive memory regions in order to mitigate the high latency of irregular access patterns. Our implementation achieves around 1.3 billion insertions per second in single-GPU mode for a load factor of 0.95, clearly outperforming other implementations. We'll also present transparent scaling to multiple GPUs within the same node, with over 4.5 billion operations per second for high load factors on four Tesla P100 GPUs connected by NVLink technology. WarpDrive is freely available at https://github.com/sleeepyjack/warpdrive.  Back
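A scalar emulation conveys the idea of group-based probing: every probe step inspects one aligned, contiguous region, which is what keeps subwarp memory traffic coalesced (parameters are illustrative assumptions; integer keys, -1 as the empty marker, and a load factor below 1 are assumed):
```python
import numpy as np

def group_probe(table, key, group=16):
    """Probe in aligned groups of `group` consecutive slots: each step is
    one contiguous load, the scalar analogue of a subwarp inspecting a
    coalesced region in parallel."""
    n_groups = len(table) // group
    g = hash(key) % n_groups
    while True:
        region = table[g * group:(g + 1) * group]   # one coalesced load
        hits = np.flatnonzero(region == key)
        if hits.size:
            return g * group + hits[0]              # slot of the key
        if (region == -1).any():                    # empty slot: key absent
            return -1
        g = (g + 1) % n_groups                      # next consecutive group
```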
 
Keywords:
Algorithms and Numerical Techniques, Performance Optimization, GTC Silicon Valley 2018 - ID S8237
Streaming:
 
Datasets and Algorithms for Road Identification Via Satellite Imagery
Adam Van Etten (In-Q-Tel)
Road identification and route prediction in near real time remains a challenging problem for many geographic regions, particularly in the case of natural disasters or crisis situations. Existing methods such as manual road labeling or aggregatio ...Read More

Road identification and route prediction in near real time remains a challenging problem for many geographic regions, particularly in the case of natural disasters or crisis situations. Existing methods such as manual road labeling or aggregation of mobile GPS track data are currently insufficient in dynamic scenarios. The frequent revisits of satellite imaging constellations may accelerate efforts to rapidly update road networks and optimal path prediction, provided routing information can be extracted from imaging pixels. We'll demonstrate deep learning segmentation methods for identifying road center lines and intersections from satellite imagery, and for inferring networks from these road segments. We'll also explore data quality requirements by comparing open source labels with high-precision labels created as part of the SpaceNet Roads challenge.

  Back
 
Keywords:
Algorithms and Numerical Techniques, HD Mapping, Federal, GTC Silicon Valley 2018 - ID S8384
Streaming:
 
A Novel Mapped Grid Approach for GPU Acceleration of High-Order Structured Grid CFD Solvers
Steven Frankel (Technion)
We'll present the use of state-of-the-art computational fluid dynamics algorithms and their performance on NVIDIA GPUs, including the new DGX-1 Station using multiple Tesla V100 GPU accelerators. A novel mapped grid approach to implementing high-order s ...Read More
We'll present the use of state-of-the-art computational fluid dynamics algorithms and their performance on NVIDIA GPUs, including the new DGX-1 Station using multiple Tesla V100 GPU accelerators. A novel mapped grid approach to implementing high-order stencil-based finite-difference and finite-volume methods is the highlight, but we'll also feature the use of flux reconstruction on GPUs using OpenACC.  Back
 
Keywords:
Algorithms and Numerical Techniques, Computational Fluid Dynamics, GTC Silicon Valley 2018 - ID S8800
Streaming:
Download:
 
Acceleration of a Computational Fluid Dynamics Code with GPU Using OpenACC
Nicholson Koukpaizan (Georgia Institute of Technology)
The goal of this session is to report the knowledge acquired at the Oak Ridge GPU Hackathon that took place October 9-13, 2017, through the acceleration of a CFD (computational fluid dynamics) solver. We'll focus on the approach used to make t ...Read More
The goal of this session is to report the knowledge acquired at the Oak Ridge GPU Hackathon that took place October 9-13, 2017, through the acceleration of a CFD (computational fluid dynamics) solver. We'll focus on the approach used to make the application suitable for the GPU, the acceleration obtained, and the overall experience at the Hackathon. OpenACC was used to implement GPU directives in this work. We'll detail the different OpenACC directives used, their advantages and disadvantages, as well as the particularities of CFD applications.  Back
 
Keywords:
Algorithms and Numerical Techniques, Computational Fluid Dynamics, GTC Silicon Valley 2018 - ID S8291
Streaming:
Download:
 
Realtime Signal Processing on NVIDIA TX2 using CUDA
Armin Weiss (Zurich University of Applied Sciences)
In our presentation, we will focus on low-latency real-time signal processing on the NVIDIA Jetson TX2. Originally designed for image processing, the Jetson TX2 incorporates a vast amount of embedded GPU processing power. However, it has not been widely ...Read More
In our presentation, we will focus on low-latency real-time signal processing on the NVIDIA Jetson TX2. Originally designed for image processing, the Jetson TX2 incorporates a vast amount of embedded GPU processing power. However, it has not been widely used for signal processing so far. There are two main challenges that have to be addressed: a constantly high input and output data rate for arbitrary digital signals, and a very short, deterministic latency requirement (processing time and data transfer time). Using the example of multichannel digital audio processing, we will look at details of CUDA kernel programming, which is a precondition for uninterrupted signal processing. Moreover, we will explain efficient data I/O transfer to Jetson TX2 GPU memory, synchronization between CPU and GPU, as well as update mechanisms for control data.  Back
 
Keywords:
Algorithms and Numerical Techniques, Product & Building Design, Performance Optimization, GTC Silicon Valley 2018 - ID S8350
Streaming:
Download:
 
Graph Partitioning Using Bayesian Inference on GPU
Carl Yang (UC Davis)
We implement an efficient CUDA algorithm that solves the graph clustering problem using the stochastic block model for the first time on GPUs. The algorithm views the graph through a generative model, the degree-corrected stochastic block model, and per ...Read More
We implement an efficient CUDA algorithm that solves the graph clustering problem using the stochastic block model for the first time on GPUs. The algorithm views the graph through a generative model, the degree-corrected stochastic block model, and performs statistical inference to discover the partition most likely to have generated the graph. A greedy agglomerative heuristic is used with Markov chain Monte Carlo (MCMC) to perform Bayesian inference. A comparison is made with the baseline GraphChallenge implementation on synthetic datasets. Our implementation achieves speedups of 11.5x and 4.1x over single-threaded and multi-threaded OpenMP implementations on the CPU. We'll provide empirical evidence that even though our method of parallelizing MCMC leads to worse convergence in terms of iteration count, we are able to harness the parallelism of the GPU to discover clusters at the same accuracy in less time.  Back
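At its core, the MCMC component proposes single-vertex block moves and accepts them with a Metropolis rule; a toy sketch with a user-supplied partition score (the real algorithm uses the degree-corrected SBM posterior and batched parallel proposals, omitted here):
```python
import numpy as np

def mcmc_sweep(adj, blocks, n_blocks, score, rng, beta=1.0):
    """One Metropolis sweep: propose moving each vertex to a random block,
    accept with probability min(1, exp(beta * score gain)). `score` is any
    partition-quality function of (adj, blocks)."""
    for v in range(len(blocks)):
        old = blocks[v]
        new = rng.integers(n_blocks)
        if new == old:
            continue
        s_old = score(adj, blocks)
        blocks[v] = new
        s_new = score(adj, blocks)
        # exp(beta * min(gain, 0)) equals min(1, exp(beta * gain))
        if rng.random() >= np.exp(beta * min(s_new - s_old, 0.0)):
            blocks[v] = old                # reject the move
    return blocks
```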
 
Keywords:
Algorithms and Numerical Techniques, GTC Silicon Valley 2018 - ID S8424
Streaming:
 
Large-Scale Multi-Parameter Waveform Inversion with GPUs on the Cloud: A Pipelined Implementation
Huy Le (Stanford University)
We'll describe how we accelerate the estimation of multiple subsurface properties with GPU-equipped cloud computers and save cost at the same time. Traditionally, institutions spend millions of dollars to build and maintain computing infrastructures ...Read More
We'll describe how we accelerate the estimation of multiple subsurface properties with GPU-equipped cloud computers and save cost at the same time. Traditionally, institutions spend millions of dollars to build and maintain computing infrastructures that are rarely occupied at full capacity. Cloud computing offers a solution to this via on-demand provisioning that can flexibly meet an institution's needs, but it comes with two potential problems: preemption and no guarantee of low-latency inter-node communication. To sidestep these issues, we implement a pipelined processing model that fully utilizes CPU memory and GPU global memory to hide latency without having to decompose the computational domain across multiple nodes.  Back
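The pipelined model can be sketched with a bounded queue: one stage loads the next chunk into memory while another computes on the current one, hiding transfer latency (a host-side Python illustration; `load` and `compute` are placeholder callables):
```python
import queue
import threading

def pipeline(chunks, load, compute, depth=2):
    """Double-buffered pipeline: the producer stages chunk i+1 while the
    consumer computes on chunk i, so transfer latency hides behind compute."""
    q = queue.Queue(maxsize=depth)

    def producer():
        for c in chunks:
            q.put(load(c))        # e.g. read from storage into host memory
        q.put(None)               # end-of-stream sentinel

    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (item := q.get()) is not None:
        results.append(compute(item))   # e.g. GPU kernel on the staged data
    return results
```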
 
Keywords:
Algorithms and Numerical Techniques, Seismic and Geosciences, Data Center and Cloud Infrastructure, HPC and AI, GTC Silicon Valley 2018 - ID S8405
Streaming:
Download:
 
Accelerating Generative Design by Leveraging GPUs on the Cloud
Christopher Hebert (NVIDIA), Jerran Schmidt (Autodesk)
We'll walk through the use of a GPU-accelerated voxel-based stress solver in the level set topology optimization engine used for Autodesk Generative Design. We'll discuss how the solver benefits from executing on the GPU over our CPU implementation a ...Read More
We'll walk through the use of a GPU-accelerated voxel-based stress solver in the level set topology optimization engine used for Autodesk Generative Design. We'll discuss how the solver benefits from executing on the GPU over our CPU implementation and why this is important from both a cost and an efficiency standpoint. Autodesk has partnered closely with Amazon to deliver cloud-based simulation on their platform, and we'll talk about how we are driving GPU usage on the cloud and how we have used the nvidia-docker plugin for PCIe passthrough to run on Amazon's GPU compute systems.  Back
 
Keywords:
Algorithms and Numerical Techniques, Product & Building Design, Graphics and AI, Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8512
Streaming:
Download:
 
Physics-Based AI for Semiconductor Inspection Using a GPU Based Optical Neural Network (ONN)
Jing Zhang (KLA-Tencor)
We'll start with a brief background of modern semiconductor yield challenges and KLA-Tencor's solutions in the space of inspection and metrology, with an emphasis on its physics-based machine learning approaches. With the shrinking of the critical ...Read More
We'll start with a brief background of modern semiconductor yield challenges and KLA-Tencor's solutions in the space of inspection and metrology, with an emphasis on its physics-based machine learning approaches. With the shrinking of the critical dimension of integrated circuits with every generation, inspection and metrology for semiconductor process control are facing increasing challenges from physics limitations. As a solution, KLA-Tencor developed physics-based AI technologies that combine traditional physical simulation with deep learning to enable a balanced solution between resolution enhancement and computational cost. We'll cover the concepts of incorporating optical physics inside a neural network implemented on GPUs, which we call an optical neural network (ONN).  Back
 
Keywords:
Algorithms and Numerical Techniques, Industrial Inspection, GTC Silicon Valley 2018 - ID S8959
Download:
 
Accelerating Linear Algebra on Small Matrices - from Batched BLAS to Large Scale Solvers
Stanimire Tomov (UTK), Ichitaro Yamazaki (UTK)
Learn how to accelerate many small-sized linear algebra problems - from kernels to large-scale solvers. We describe techniques targeting parallelization, vectorization, and communication, which have become extremely challenging on many-core architect ...Read More
Learn how to accelerate many small-sized linear algebra problems - from kernels to large-scale solvers. We describe techniques targeting parallelization, vectorization, and communication, which have become extremely challenging on many-core architectures and GPUs. We propose standard interfaces, called batched APIs, for inclusion in highly optimized libraries like MAGMA, which provides the most extensive set of batched BLAS and LAPACK functionality to date. We'll describe the developments as well as their use to accelerate applications from big data analytics to high-order FEM tensor computations, and low-rank approximations for solvers and preconditioners. We'll also concentrate on the GPU acceleration of a large-scale distributed-memory solver that uses a hierarchically compressed coefficient matrix.  Back
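The batched-API idea is to expose many small problems as one call so a library can parallelize across the batch; NumPy's stacked-array solve shows the interface shape (MAGMA's batched BLAS/LAPACK routines are the GPU analogue):
```python
import numpy as np

rng = np.random.default_rng(0)
# 10,000 independent 32x32 systems: each is too small to keep a GPU
# busy on its own, but one batched call exposes all of them at once.
A = rng.standard_normal((10_000, 32, 32))
b = rng.standard_normal((10_000, 32))

x = np.linalg.solve(A, b)          # one "batched" solve over the stack
assert np.allclose(np.einsum('bij,bj->bi', A, x), b)
```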
 
Keywords:
Algorithms and Numerical Techniques, Performance Optimization, GTC Silicon Valley 2018 - ID S8475
Streaming:
Download:
 
Helping the Discovery of New Galaxies on the World's Largest Telescopes Using a Large GPU Cluster
Damien Gratadour (Université Paris Diderot & Observatoire de Paris), Hatem Ltaief (KAUST)
Have you heard about the world's biggest eye ever built? Are you interested in scientific simulations running on NVIDIA DGX-1? Come and learn how combining these powerful computing devices gives the computational astronomy commu ...Read More
Have you heard about the world's biggest eye ever built? Are you interested in scientific simulations running on NVIDIA DGX-1? Come and learn how combining these powerful computing devices gives the computational astronomy community a dramatic leap forward in designing major, multimillion-dollar optical instruments for the European Extremely Large Telescope. Starting from the mathematical model up to the high-performance implementation on DGX-1, we'll explain how the resulting matrix computations, associated with an efficient task-based programming model, help design the next generation of telescope instruments and, eventually, demonstrate a pathfinder for the discovery of new galaxies.  Back
 
Keywords:
Algorithms and Numerical Techniques, Astronomy and Astrophysics, GTC Silicon Valley 2018 - ID S8231
Streaming:
Download:
 
Performance Evaluation of GPU-Accelerated Linear Solvers on TCAD Examples
Ana Iontcheva (Silvaco)
We'll present the results of our evaluation of GPU-accelerated sparse linear solvers from PARALUTION and MAGMA, and compare them with our CPU-only sparse linear solvers on technology computer-aided design (TCAD) examples. TCAD is a category of softwa ...Read More
We'll present the results of our evaluation of GPU-accelerated sparse linear solvers from PARALUTION and MAGMA, and compare them with our CPU-only sparse linear solvers on technology computer-aided design (TCAD) examples. TCAD is a category of software tools for designing semiconductor devices. Semiconductor devices can be found in almost every area of modern life. The purpose of TCAD tools is to replace cumbersome physical experiments with computer simulations. A significant part of the whole simulation time is spent solving the linear systems, so the performance of the linear solvers is extremely important.  Back
 
Keywords:
Algorithms and Numerical Techniques, Computational Physics, GTC Silicon Valley 2018 - ID S8179
Streaming:
Download:
 
Image Data Augmentation on GPU: One Method That Does It All
Tim Zaman (NVIDIA)
Data augmentation is an effective method to boost your deep learning training performance. There are many ways of doing this augmentation, the practices are not well established, and not all deep learning frameworks support augmentation natively. We present a method of data augmentation based on transformation matrices that perturb both space and color, in a way that is easy to use and understand, framework-agnostic, and fast (it runs on the GPU). This method works especially well for augmentations that need to be applied to both images and labels, as is typical in object detection and segmentation tasks. Image augmentation is a job that GPUs excel at, and it significantly reduces the load on, and the need for, a fast CPU.
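To make the transformation-matrix idea concrete, here is a minimal sketch, assumed for illustration rather than taken from the talk, of a CUDA kernel that applies one 2x3 affine matrix to an image by inverse mapping with nearest-neighbor sampling. Rotations, flips, shifts, and scales compose into a single such matrix on the host, and the identical matrix can then be applied to the label image.

    #include <cuda_runtime.h>

    // Applies the inverse 2x3 affine matrix [m00 m01 m02; m10 m11 m12]:
    // each output pixel (x, y) fetches the source pixel it maps back to.
    __global__ void warp_affine_nn(const unsigned char* src, unsigned char* dst,
                                   int w, int h, int channels,
                                   float m00, float m01, float m02,
                                   float m10, float m11, float m12)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= w || y >= h) return;

        // Nearest-neighbor: round the back-projected source coordinates.
        int sx = (int)(m00 * x + m01 * y + m02 + 0.5f);
        int sy = (int)(m10 * x + m11 * y + m12 + 0.5f);

        for (int c = 0; c < channels; ++c) {
            unsigned char v = 0;   // pixels mapped from outside become black
            if (sx >= 0 && sx < w && sy >= 0 && sy < h)
                v = src[(sy * w + sx) * channels + c];
            dst[(y * w + x) * channels + c] = v;
        }
    }

Color augmentation follows the same pattern, with a matrix acting on the RGB vector of each pixel instead of on its coordinates.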
 
Keywords:
Algorithms and Numerical Techniques, Deep Learning and AI Frameworks, Video and Image Processing, GTC Silicon Valley 2018 - ID S8380
Streaming:
 
New Frontiers for Dense Linear Solvers: Towards Extreme Performance and Energy Efficiency
Ahmad Abdelfattah (Innovative Computing Laboratory, University of Tennessee), Azzam Haidar (Innovative Computing Laboratory, University of Tennessee)
Learn how to develop fast and energy-efficient linear solvers using GPUs. Hybrid CPU-GPU techniques achieve high performance at the cost of extra power consumption. New advancements in GPU architectures enable full-GPU solutions that are high-performance, energy-efficient, and CPU-independent. In addition, new technologies such as half-precision arithmetic (FP16) help in designing new solvers that are significantly faster and even more energy efficient. While FP16 arithmetic has been a powerful tool for deep learning applications, our designs show that it is also very useful for boosting the performance and energy efficiency of linear solvers. The new developments complement the hybrid algorithms in the MAGMA library and provide users with a wide variety of designs that fit different requirements for performance, energy efficiency, and numerical accuracy.
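The FP16 building block behind such solvers is a half-precision matrix multiply with single-precision accumulation; a typical mixed-precision scheme factorizes in reduced precision and then iteratively refines the residual in higher precision. A minimal sketch of that building block through cuBLAS (an assumed illustration, not MAGMA's internals):

    #include <cublas_v2.h>
    #include <cuda_fp16.h>

    // C (FP32) = A (FP16) * B (FP16), accumulated in FP32. On Volta this
    // data-type combination is eligible for Tensor Core execution.
    void half_gemm(cublasHandle_t handle, int n,
                   const __half* dA, const __half* dB, float* dC)
    {
        const float alpha = 1.0f, beta = 0.0f;
        cublasGemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                     &alpha,
                     dA, CUDA_R_16F, n,
                     dB, CUDA_R_16F, n,
                     &beta,
                     dC, CUDA_R_32F, n,
                     CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP);
    }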
 
Keywords:
Algorithms and Numerical Techniques, Performance Optimization, GTC Silicon Valley 2018 - ID S8478
Streaming:
Download:
Animation and VFX
Presentation
Media
Genesis: MPC's Virtual Production Platform Where the Stories Begin
Damien Fagnou (Moving Picture Company), Francesco Giordana (MPC)
We'll showcase the MPC Virtual Production Platform called Genesis. While we won't be able to show any datasets currently in production, we'll showcase the technology and have some MPC original content to share.
 
Keywords:
Animation and VFX, Virtual Reality and Augmented Reality, GTC Silicon Valley 2018 - ID S8365
Streaming:
Download:
 
Go with the Flow: Pixar's Interactive Look-Development Tool
Florian Hecht (Pixar), Peter Roe (Pixar)
Pixar Animation Studios' new look-development tool, Flow, enables artists to create amazing shaders in a fully interactive environment. The software allows them to create and visualize the very complex procedural shading networks required by feature film production. Flow is built on top of Rtp, an NVIDIA OptiX-based real-time GPU path tracer developed at Pixar, as well as USD, Pixar's open-source Universal Scene Description. We'll show how these technologies are combined to create artist-focused workflows that are exploration-driven and more interactive than ever before. We'll also talk about how shading networks are implemented inside our path tracer, which makes use of key OptiX features.
 
Keywords:
Animation and VFX, Rendering and Ray Tracing, GTC Silicon Valley 2018 - ID S8370
Streaming:
 
A Data-Driven Future in Visual Effects Pipelines
Rishabh Battulwar (Digital Domain)
Visual effects pipelines at Digital Domain are undergoing a transition from traditional parametric models to data-driven generative models. We are extensively developing example-based systems that are geared to accommodate artist input while staying within these generative models. Underlying all of these technologies is heavy GPU usage that is helping artists iterate quickly and reach their creative goals much faster than ever before. GPU-ready toolkits and libraries have also provided the ability to quickly iterate and try different methods in the development of data-driven approaches, and have been key to finding a production-ready solution. We'll go over how this transition occurred, along with examples where our creature development pipeline has benefited from these changes. We'll also discuss where we see machine learning taking us in the future.
 
Keywords:
Animation and VFX, GTC Silicon Valley 2018 - ID S8925
Streaming:
Download:
 
Creating Immersive AI-Powered Virtual Reality Simulation Training For Medical Professionals
Shauna Heller (AiSolve)
Experiential learning is among the best ways to practice for pediatric emergencies. However, hospitals are spending millions on expensive and inefficient mannequin-based training that does not consistently offer an authentic experience for med students and doctors, or offer convenient repeatability. Come hear about a groundbreaking pilot program that brought together a hospital and two unique VR and AI developer teams to deliver virtual reality training simulations for some of the highest-stakes emergencies hospitals see: pediatric trauma. Learn how doctors aided in the design process to create authentic trauma room scenarios; how expert content and simulation developers crafted a VR experience that would have impact in a world where there is no room for error; and why Oculus supported this project with funding and hardware.
 
Keywords:
Animation and VFX, Virtual Reality and Augmented Reality, GTC Silicon Valley 2018 - ID S8504
Streaming:
Astronomy and Astrophysics
Presentation
Media
AstroAccelerate - GPU-Accelerated Signal Processing for Next Generation Radio Telescopes
Wes Armour (University of Oxford)
AstroAccelerate is a GPU-enabled software package that focuses on enabling real-time processing of time-domain radio-astronomy data. It uses the CUDA programming language for NVIDIA GPUs. The massive computational power of modern-day GPUs allows the code to perform algorithms such as de-dispersion, single-pulse searching, and Fourier-domain acceleration searching in real time on very large datasets, comparable to those that will be produced by next-generation radio telescopes such as the Square Kilometre Array.
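De-dispersion itself is a simple but bandwidth-hungry reduction: for each trial dispersion measure, every frequency channel is read at a time offset that undoes the dispersion sweep, and the channels are summed. A minimal sketch of such a kernel, a deliberate simplification rather than AstroAccelerate's optimized code:

    // One thread per output time sample; delay[] holds the per-channel
    // dispersion shift (in samples) for a single trial dispersion measure.
    __global__ void dedisperse(const float* data,  // [nchan][nsamp] input
                               float* out,         // [nsamp] de-dispersed series
                               const int* delay,
                               int nchan, int nsamp)
    {
        int t = blockIdx.x * blockDim.x + threadIdx.x;
        if (t >= nsamp) return;

        float sum = 0.0f;
        for (int c = 0; c < nchan; ++c) {
            int ts = t + delay[c];        // shifted read undoes the sweep
            if (ts < nsamp) sum += data[c * nsamp + ts];
        }
        out[t] = sum;
    }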
 
Keywords:
Astronomy and Astrophysics, GTC Silicon Valley 2018 - ID S8266
Streaming:
Download:
 
Can FPGAs Compete with GPUs?
John W. Romein (ASTRON (Netherlands Institute for Radio Astronomy))
Previously, FPGAs were known to be highly energy efficient, but notoriously difficult to program and unsuitable for complex HPC applications. This is changing due to new technology developments: a high-level programming language (OpenCL), hard floating-point units, and tight integration with CPU cores. We'll compare FPGAs and GPUs with respect to architecture, programming model, programming effort, performance, and energy efficiency, using some radio-astronomical signal-processing and imaging algorithms as examples. Can FPGAs compete with GPUs?
 
Keywords:
Astronomy and Astrophysics, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8310
Streaming:
 
Preparing an AMR Library for Summit
Max Katz (NVIDIA)
In this session you'll hear about one team's experience preparing an adaptive mesh refinement library -- and a fluid dynamics code based on it -- for Summit, the IBM POWER9 and NVIDIA Volta system at Oak Ridge National Laboratory, where multiple GPUs are connected via NVLink to each other and the CPUs. It was simple to compile and run on the OpenPOWER architecture, and to offload to the GPUs with CUDA Fortran, with little architecture-specific code. Initial results with POWER8 and P100 have shown excellent CPU and GPU performance and good multi-node scaling for an astrophysics mini-app that was difficult to run effectively on prior GPU architectures. We will also discuss our experiences porting other modules in our multi-physics codes, and preliminary results on the POWER9 and V100 platform.
 
Keywords:
Astronomy and Astrophysics, Tools and Libraries, Computational Fluid Dynamics, GTC Silicon Valley 2018 - ID S8397
Streaming:
Download:
 
Powering Real-Time Radio Astronomy Signal Processing with Latest GPU Architectures
Harshavardhan Reddy (NCRA)
We'll present a summary of ongoing work that targets the use of newer GPU architecture (Pascal and Volta) features in real-time signal processing applications in radio astronomy telescopes, and outline the future growth path for this exciting new application of GPUs. For the Pascal and Volta architectures, we'll discuss the advantages of using higher memory bandwidth, half and single precision, and integer arithmetic in existing GPU-based correlator pipeline code. This is an ongoing effort between the National Centre for Radio Astrophysics and NVIDIA. We'll look at the various processing stages in the pipeline to explore optimization possibilities, and highlight interesting results that were achieved. We'll address in detail the effect of using half precision on accuracy and performance, and the required library changes.
 
Keywords:
Astronomy and Astrophysics, GTC Silicon Valley 2018 - ID S8339
Streaming:
Download:
 
Image-Domain Gridding on Accelerators
Bram Veenboer (ASTRON)
We will present our latest results on Image-Domain Gridding, an algorithm for radio astronomical imaging. This algorithm outperforms state-of-the-art traditional imaging algorithms both in image quality (by applying more corrections) and in performance. In this talk, we will first introduce the algorithm and then demonstrate that it works very well on highly parallel accelerators. We will show the in-depth performance analysis and optimization techniques that we applied to get there.
 
Keywords:
Astronomy and Astrophysics, Performance Optimization, GTC Silicon Valley 2018 - ID S8128
Streaming:
Download:
 
The SETI Institute: Using GPUs for Systems Science, Technology, and Exploration
Nathalie A. Cabrol (SETI Institute), Graham Mackintosh (NASA-STC, SETI Institute)
The SETI Institute (SI) addresses the question of the origin and nature of life in the universe. Our NASA Astrobiology Institute team develops new exploration strategies and detection methods to support the search for biosignatures on Mars and other planets. SI is also driving a new paradigm for the exploration of biosignatures and signs of technology at all scales, using a holistic approach. This new direction requires the rapid analysis of vast amounts of data. In this presentation, we'll describe the history, successes, and challenges of current approaches, and describe SI's current and future efforts in FDL and other areas to incorporate AI and deep learning to drive this new big data paradigm for finding life in the universe.
 
Keywords:
Astronomy and Astrophysics, In-Situ and Scientific Visualization, GTC Silicon Valley 2018 - ID S81023
Streaming:
Download:
Autonomous Vehicles
Presentation
Media
Autonomous Algorithms
Kukhyun Cho (Stradvision), Murat Durus (NVIDIA), Faustino Gomez (NNAISENSE), Jens Klimke (fka Aachen)
Artificial intelligence algorithms are the only solution to the complex environments and dynamic driving conditions of the autonomous driving problem. There is no way for engineers to hard-code and test every possible variable or situation a car may face in a daily drive. We'll present and discuss AI algorithms and implementations using the NVIDIA DRIVE platform with our ecosystem partners.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8862
Streaming:
Download:
 
Next Generation Sensors for Self-Driving
Ronit Fuchs (Arbe Robotics), Jun Pei (Cepton Technologies), Andres PrietoMoreno (FLIR Systems), Glenn Schuster (NVIDIA), Aditya Srinivasan (Innoviz)
We'll highlight the next-generation sensors that autonomous driving innovators are working on. Key sensor vendors will discuss the need for these upcoming sensing technologies for self-driving and the use cases they are enabling.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8860
Streaming:
Download:
 
The Road From GPU-Powered Prototypes to Production-Ready ECUs
Christoph Herzog (Elektrobit Automotive GmbH), Alexander Much (Elektrobit Automotive GmbH)
GPUs provide power-efficient hardware acceleration for graphics processing and deep learning algorithms, making them the ideal compute processors for highly automated driving functionality. Despite the predominance of GPUs in the development of prototypes, the actual market penetration of GPUs in series-production electronic control units (ECUs) remains comparatively low. In this talk we will focus on a key contributor to this problem: deficient support for integration into the design processes of the automotive supply chain and automotive software standards.
 
Keywords:
Autonomous Vehicles, GPU Virtualization, GTC Silicon Valley 2018 - ID S8851
Streaming:
Download:
 
Autoware on NVIDIA DRIVE: The Open-Source Self-Driving Platform
Shinpei Kato (Tier IV, Inc.)
We'll present a complete open-source software stack for self-driving vehicles, called Autoware, and its open integration with the NVIDIA DRIVE platform. Autoware implements working modules for localization and 3D mapping with LiDAR and GNSS, object detection and traffic light recognition with deep learning, path planning with lattice and search methods, and vehicle dynamics control. Compute-intensive tasks in these modules are accelerated using CUDA, and timing-aware tasks are protected by RTOS capabilities. We'll discuss the impact of CUDA acceleration on self-driving vehicles and its performance evaluation. Learn how Autoware enables any by-wire vehicle to become a high-quality self-driving vehicle that can operate in real-world environments.
 
Keywords:
Autonomous Vehicles, Autonomous Driving, GTC Silicon Valley 2018 - ID S8636
Streaming:
Download:
 
What it Takes to Drive Autonomously on Chinese Roads
Yiming Liu (Pony.ai)
Pony.ai will share the key technological milestones it has achieved in the past several months of road testing in China, including the company's soft launch of China's first-ever autonomous robotaxi service. CEO James Peng will share the unique challenges posed by the Chinese road environment and how Pony.ai leveraged deep learning and computational models to conquer those challenges. Pony.ai's mission is to build the safest and most reliable L4 autonomous driving technology. The startup was founded at the end of 2016 and is co-located in the heart of Silicon Valley and China.
 
Keywords:
Autonomous Vehicles, NVIDIA Inception Program, Autonomous Driving, GTC Silicon Valley 2018 - ID S8995
Streaming:
 
Deploying Autonomous Vehicles with NVIDIA DRIVE
Srikanth Sundaram (NVIDIA)
DRIVE PX is an open platform for the autonomous driving ecosystem. It has been adopted by over 300 partners in the automotive ecosystem to develop solutions for vehicles that are intelligent and autonomous. This talk will outline the technical challenges facing the development of autonomous intelligent vehicles and detail how the next generation of DRIVE AI car computers, DRIVE Xavier and DRIVE Pegasus, addresses these challenges.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8666
Streaming:
 
Solving Real-Time Perception: The Most Difficult Problem in Autonomous Driving
Forrest Iandola (DeepScale)
We'll show how deep neural networks can ingest raw data from multiple types of sensors to generate improved perception results in real time, using processors fit for automotive mass production. Today's mass-produced driver-assistance systems are typically implemented with a late-fusion paradigm. This approach has a number of limitations in terms of accuracy, portability, and robustness to sensor failure. We'll propose an earlier stage of fusion, called Deep Sensor Fusion, in which sensors transmit raw data over higher-bandwidth in-vehicle networking, which is already used in mass production today.
 
Keywords:
Autonomous Vehicles, NVIDIA Inception Program, Computer Vision, GTC Silicon Valley 2018 - ID S8163
Streaming:
Download:
 
TuSimple Autonomous Trucks: Prototypes to Products
Xiaodi Hou (TuSimple)
An overview of TuSimple's unique full vision-based autonomous driving solution, with a case study of its camera-based perception solution. We'll take a deep dive into the hidden elements of the autonomous truck system in terms of algorithms, big data, and hardware, plus a look into the future of autonomous driving developments in sensors (camera vs. LiDAR), redundant systems, and computational resources.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S81045
Streaming:
 
NVIDIA SDK Manager - Simplify Your Development Environment Setup
Avraham Shapira (NVIDIA)
SDK Manager is NVIDIA's all-in-one tool that enables developers to set up their development environments easily, simply, and quickly. The tool centralizes many different software development packages in one location; resolves dependencies between different SDKs, libraries, and software packages; supports NVIDIA hardware development platforms; and flashes multiple different operating systems. With only a few clicks, the user can download, install, and have a complete development environment ready for work. SDK Manager's first focus is the NVIDIA DRIVE platforms for the development of autonomous vehicles.
 
Keywords:
Autonomous Vehicles, Tools and Libraries, GTC Silicon Valley 2018 - ID S8337
Streaming:
 
Deploying Autonomous Vehicles with NVIDIA DRIVE
Srikanth Sundaram (NVIDIA)
DRIVE PX is an open platform for the autonomous driving ecosystem. It has been adopted by over 300 partners in the automotive ecosystem to develop solutions for vehicles that are intelligent and autonomous. This talk will outline the technical challenges facing the development of autonomous intelligent vehicles and detail how the next generation of DRIVE AI car computers, DRIVE Xavier and DRIVE Pegasus, addresses these challenges.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8666A
Streaming:
 
Development of a Self-Learning AI-Based L4 Vehicle - The Dream Car
Oliver Briemle (ZF), Daniel Watzenig (Virtual Vehicle)
The development of self-driving cars requires strong relationships between partners, different from the way we know them today. This might be the only way to successfully bring self-driving vehicles to the road. ZF, Virtual Vehicle, and NVIDIA have joined forces to develop an AI-based L4 vehicle for urban scenarios in only six months: the so-called dream car. Learning while sleeping is the groundbreaking idea behind the dream car, which was realized in the second half of 2017. Without driving around, the car constantly learns and adapts itself based on data acquired from other cars driving elsewhere in the world. The key is AI and ZF's ProAI, which was developed with NVIDIA in the past year. ProAI interprets the data in real time, learns from it, validates the data, checks its plausibility, and adjusts the vehicle's behavior. We'll summarize the implementation steps, the hardware and software architecture, relevant driving and testing scenarios, our AI approach, and the challenges met in realizing the dream car.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8921
Streaming:
 
Sensing Technologies for an Autonomous Tomorrow (Presented by Analog Devices)
Chris Jacobs (Analog Devices)
The future of autonomous transport is upon us. In order to provide safe, reliable transport for all, it is essential to have the most accurate real-time 3D map around the vehicle. The 360-degree safety shield created using radar, LIDAR, cameras, and IMUs makes up the perception sensor suite that is the foundation for making this a reality. Data from high-performance imaging radar, LIDAR, and cameras are fused together to give the vehicle its sense of sight, whereas the IMU gives the vehicle its sense of feeling while also ensuring it maintains its heading. The large amount of data generated by Analog Devices' Drive360 sensors will require high-performance AI computers in the vehicle, such as NVIDIA's DRIVE Pegasus, to generate the real-time 3D map. Together, Analog Devices and NVIDIA can enable safe, reliable autonomous transportation for all.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8964
Streaming:
Download:
 
Creating AI-Based Digital Companion for Mercedes-Benz Vehicles
Rigel Smiroldo (Mercedes-Benz Research & Development North America Inc.)
The in-vehicle user experience needs intelligence not only to delight users with a truly personalized experience and to simplify repetitive actions, but also to minimize cognitive load and decrease distractions. When driving becomes fully autonomous, the vehicle needs to understand its users' intent without getting explicit directions from them. To achieve such an experience, customers' behavior and interactions are analyzed in real time to understand their intent and to predict what they will do next.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8970
Streaming:
Download:
 
Connected Automated Driving: Overview, Design, and Technical Challenges
Gaurav Bansal (Toyota InfoTechnology Center, USA)
We'll discuss the important emerging field of connected automated driving, including technical and policy topics in this area. We'll provide background on vehicular safety communications and current deployments in various parts of the world. Vehicular communication will enable sensor data sharing between vehicles, which could be the key to achieving higher levels of automation. Novel artificial intelligence techniques exploiting sensor data (camera, radar, GPS, etc.) from neighboring cars can be used to design perception and mapping functionality for automated vehicles. We'll discuss results from field testing and show the advantages of connected automated driving.
 
Keywords:
Autonomous Vehicles, Autonomous Driving, GTC Silicon Valley 2018 - ID S8538
Streaming:
Download:
 
Deep Learning for Automated Systems: From the Warehouse to the Road
Melissa Smith (Clemson University)
Learn about our application of deep learning techniques for perception systems in autonomous driving, reinforcement learning for autonomous systems, label detection in warehouse inventory management, and undergraduate engagement in this research. In collaboration with Clemson University's International Center for Automotive Research, we've developed a perception module that processes camera inputs to provide environmental information for use by a planning module to actively control the autonomous vehicle. We're extending this work to include an unsupervised planning module for navigation with reinforcement learning. We've also applied these techniques to automate the job of warehouse inventory management, using a deep neural network running on a mobile, embedded platform to automatically detect and scan labels and report inventory, including its location in the warehouse. Finally, we'll discuss how we involve undergraduate students in this research.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S8140
Streaming:
Download:
 
Advancing State-of-the-Art of Autonomous Vehicles and Robotics Research using AWS GPU Instances (Presented by Amazon Web Services)
Adrien Gaidon (Toyota Research Institute), Chetan Kapoor (Amazon Web Services)
Toyota Research Institute's (TRI) mission is to improve the quality of human life through advances in artificial intelligence, automated driving, and robotics. Learn more about TRI's research and how it is using AWS EC2 P3 instances, the industry's most powerful GPU instances, in combination with other AWS services to enable autonomous vehicles and robots at scale.
 
Keywords:
Autonomous Vehicles, GTC Silicon Valley 2018 - ID S81014
Streaming:
Download:
 
Unlocking Access to HD Maps for Autonomous Driving
Willem Strijbosch (TomTom)
Autonomous vehicles require highly accurate, up-to-date maps for a safe, comfortable, and optimized experience. TomTom's multi-source, multi-sensor approach leads to HD Maps that have greater coverage, are more richly attributed, and have higher quality than single-source, single-sensor maps. Autonomous vehicles also need to be able to access the latest, most up-to-date HD Maps with minimal latency. Learn how TomTom is taking on this challenge.
 
Keywords:
Autonomous Vehicles, HD Mapping, Autonomous Driving, GTC Silicon Valley 2018 - ID S8700
Streaming:
Download:
Climate, Weather, Ocean Modeling
Presentation
Media
Sunny Skies Ahead! Versioning GPU-Accelerated WRF to 3.7.1
Stanley Posey (NVIDIA)
We'll detail the inherent challenges in porting a GPU-accelerated community code to a newer major version, integrating the community's non-GPU changes with the OpenACC directives from the earlier version. This is a non-trivial exercise: this particular version upgrade contained 143,000 modified lines of code which required reintegration with our accelerator directives. This work is important in providing support for newer features whilst still providing GPU support for users. We'll also look at efforts to improve the maintainability of GPU-accelerated community codes.
 
Keywords:
Climate, Weather, Ocean Modeling, Programming Languages, GTC Silicon Valley 2018 - ID S8241
Streaming:
Download:
 
Performance Optimization for Scientific Applications
Alan Gray (NVIDIA)
We'll take you on a journey through enabling applications for GPUs; interoperability of different languages (including Fortran, OpenACC, C, and CUDA); CUDA library interfacing; data management, movement, and layout tuning; kernel optimization; tool usage; multi-GPU data transfer; and performance modeling. We'll show how careful optimizations can have a dramatic effect and push application performance towards the maximum possible on the hardware. We'll describe tuning of multi-GPU communications, including efficient exploitation of high-bandwidth NVLink hardware. The applications used in this study are from the domain of numerical weather prediction, and also feature in the ESCAPE European collaborative project, but we'll present widely relevant techniques in a generic and easily transferable way.
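As one concrete instance of the multi-GPU transfer tuning mentioned above, direct peer-to-peer copies avoid staging through host memory and run over NVLink where the hardware provides it. A minimal sketch, assuming two peer-capable devices (illustrative, not code from the ESCAPE applications):

    #include <cuda_runtime.h>

    int main() {
        const size_t n = 1 << 26;   // 64 Mi floats
        float *buf0, *buf1;

        cudaSetDevice(0);
        cudaMalloc(&buf0, n * sizeof(float));
        cudaDeviceEnablePeerAccess(1, 0);   // let device 0 access device 1

        cudaSetDevice(1);
        cudaMalloc(&buf1, n * sizeof(float));
        cudaDeviceEnablePeerAccess(0, 0);   // and vice versa

        // Direct GPU-to-GPU copy: no host bounce buffer involved.
        cudaMemcpyPeer(buf1, 1, buf0, 0, n * sizeof(float));

        cudaFree(buf1);
        cudaSetDevice(0);
        cudaFree(buf0);
        return 0;
    }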
 
Keywords:
Climate, Weather, Ocean Modeling, Performance Optimization, GTC Silicon Valley 2018 - ID S8190
Streaming:
 
An Approach to Developing MPAS on GPUs
Raghu Raj Prasanna Kumar (National Center for Atmospheric Research)
MPAS-A is a general circulation (global) model of the Earth's atmosphere that is designed to work down to so-called non-hydrostatic scales, where convective (vertical) cloud processes are resolved. To date, MPAS-A has been used primarily for meteorological research applications, although climate applications in the Community Earth System Model are being contemplated. At a high level, MPAS-A consists of a dynamics part, a fluid flow solver that integrates the non-hydrostatic time-dependent nonlinear partial differential equations of the atmosphere, and a physics part, which computes the forcings of these equations due to radiative transport, cloud physics, and surface and near-surface processes. The dynamics is in turn divided into dry dynamics and moist dynamics parts. Algorithmically, the dynamics uses a finite volume method on an unstructured centroidal Voronoi mesh (grid, or tessellation) with a C-grid staggering of the state variables as the basis for the horizontal discretization.
 
Keywords:
Climate, Weather, Ocean Modeling, Performance Optimization, GTC Silicon Valley 2018 - ID S8812
Streaming:
Download:
 
Faster than Real-Time Computing in Tsunami Early Warning Systems
Jorge Macias (EDANYA Group (University of Malaga))
When used as predictive tools in natural disasters such as tsunamis, numerical models require extremely fast computations. Just a few years ago, real-time computing in Tsunami Early Warning Systems (TEWS) was unthinkable. Nevertheless, the EDANYA Group has revolutionized tsunami science paradigms. With the goal of saving lives in the framework of TEWS, our group has developed Tsunami-HySEA, a GPU-based numerical model aimed at producing numerical simulations of tsunami events faster than ever. Based on highly efficient, robust mathematical algorithms, together with the computational power of NVIDIA GPUs, Tsunami-HySEA is able to simulate a tsunami event in only a few minutes. Nowadays, one of the main challenges in tsunami science is producing accurate assessments of tsunami wave impacts just a few minutes after the generating earthquake is triggered; such timely prediction would save many lives. When the response is needed in only a few minutes, the requirements are challenging and difficult to combine in a single numerical tool: robustness, low dissipation, large domains, and an extremely fast response.
 
Keywords:
Climate, Weather, Ocean Modeling, GTC Silicon Valley 2018 - ID S81003
Streaming:
Download:
 
An Agile Approach to Building a GPU-enabled and Performance-portable Global Cloud-resolving Atmospheric Model
Richard Loft (National Center for Atmospheric Research)
We'll give a high-level overview of the results of our efforts to build a GPU-enabled, performance-portable global cloud-resolving atmospheric model, and of how we built a cross-organizational partnership to achieve them. Ours is a directive-based approach using OpenMP and OpenACC to achieve portability. We have focused on achieving good performance on the three main architectural branches available to us: traditional multi-core processors (e.g., Intel Xeons), many-core processors such as the Intel Xeon Phi, and, of course, NVIDIA GPUs. Our focus has been on creating tools for accelerating the optimization process, techniques for effective cross-platform optimization, and methodologies for characterizing and understanding performance. The results are encouraging, suggesting a path forward based on standard directives for responding to the pressures of future architectures.
 
Keywords:
Climate, Weather, Ocean Modeling, Performance Optimization, GTC Silicon Valley 2018 - ID S8811
Streaming:
Download:
Computational Biology and Chemistry
Presentation
Media
A New Level of Protein-Protein Complexes Prediction in Modern Drug Design
Timofei Ermak (BIOCAD)
Attendees will learn how GPU computing significantly increases the accuracy of predictions in solving one of the hardest structural bioinformatics problems: protein-protein complex prediction (docking). We will present how it is used in modern drug discovery by the Russian biotechnology company BIOCAD. Protein-protein docking appears at a number of steps in modern drug discovery, so accurate predictions are essential for designing higher-quality drugs. It is a very computationally intensive task due to the large solution space and the big sizes of protein molecular systems. GPU computing makes it possible to scan a huge solution space using the solid metric of Gibbs free energy, thereby significantly improving the quality of predictions while decreasing overall calculation time.
 
Keywords:
Computational Biology and Chemistry, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8226
Streaming:
Download:
 
Improving NAMD Performance on Volta GPUs
David Hardy (University of Illinois at Urbana-Champaign), Ke Li (NVIDIA), John Stone (University of Illinois at Urbana Champaign)
In 2007, NAMD was the first full-featured production molecular dynamics software to use CUDA for accelerating its costliest computations. We'll describe our latest efforts, techniques, and results in our quest to optimize NAMD to make best use of the tremendous computational capabilities of state-of-the-art Volta GPUs, particularly in new dense node configurations such as the NVIDIA DGX and ORNL Summit systems that feature NVLink-connected GPUs. In existence now for over 20 years, NAMD is a sophisticated parallel molecular dynamics program. NAMD development has emphasized parallel scalability to support large-size and long-timescale biomolecular simulations running on petascale supercomputers. As GPU technology has evolved, NAMD has benefited from moving greater amounts of work to the GPU. NVIDIA's release of Volta has now shifted the balance almost entirely to the GPU, with the small remaining CPU calculations often posing bottlenecks to NAMD's performance. Attendees will learn optimization strategies and pitfalls for achieving higher performance as Amdahl's Law poses an ever-increasing challenge for mature GPU-accelerated codes like NAMD.
 
Keywords:
Computational Biology and Chemistry, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8727
Streaming:
Download:
 
Prediction of Heterodimeric Protein Complexes from Protein-Protein Interaction Networks Using Deep Learning
Peiying Ruan (NVIDIA)
We'll present how to apply deep learning to predict small-sized protein complexes using multiple sources of biological information and a hybrid deep learning model. We'll describe the background of the problem, what kinds of biological information are useful for accurately predicting small-sized protein complexes, and how to improve prediction accuracy by using hybrid deep learning models for the different information sources, and we'll compare the performance of multiple deep learning models on this problem.
 
Keywords:
Computational Biology and Chemistry, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8333
Streaming:
Download:
 
ORNL Summit: Petascale Molecular Dynamics Simulations on the Summit POWER9/Volta Supercomputer
James Phillips (University of Illinois)
Learn the opportunities and pitfalls of running billion-atom science at scale on a next-generation, pre-exascale GPU-accelerated supercomputer. The highly parallel molecular dynamics code NAMD has long been used on the GPU-accelerated Cray XK7 Blue Waters and ORNL Titan machines to perform petascale biomolecular simulations, including a 64-million-atom model of the HIV virus capsid. In 2007, NAMD was one of the first codes to run on a GPU cluster, and it is now one of the first on the new ORNL Summit supercomputer, which features IBM POWER9 CPUs, NVIDIA Volta GPUs, and the NVLink CPU-GPU interconnect. This talk will cover the latest NAMD performance improvements and scaling results on Summit and other leading supercomputers.
 
Keywords:
Computational Biology and Chemistry, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8747
Streaming:
 
Porting Quantum ESPRESSO's PWscf Solver to GPUs with CUDA Fortran
Everett Phillips (NVIDIA), Joshua Romero (NVIDIA), Filippo Spiga (Cambridge University)
Learn how to effectively leverage CUDA Fortran to port scientific applications written in Fortran to GPUs. We'll present in detail the porting effort of Quantum ESPRESSO's Plane-Wave Self-Consistent Field (PWscf) solver, from profiling and identifying time-consuming procedures to performance analysis of the GPU-accelerated solver on several benchmark problems on systems ranging in size from small workstations to large distributed GPU clusters. We'll highlight several tools available in CUDA Fortran to accomplish this, from high-level CUF kernel directives to lower level kernel programming, and provide guidance and best practices in several use cases with detailed examples.
 
Keywords:
Computational Biology and Chemistry, Performance Optimization, HPC and AI, GTC Silicon Valley 2018 - ID S8446
Streaming:
 
Scaling Molecular Dynamics Across 25,000 GPUs on Sierra & Summit
Tomas Oppelstrup (Lawrence Livermore National Laboratory), Shiv Sundram (Lawrence Livermore National Laboratory)
As a part of the Department of Energy/National Cancer Institute pilot programs and the Sierra Institutional Center of Excellence, Lawrence Livermore National Laboratory has developed strong-scaling molecular dynamics codes for atomic-level simulation in physics, materials science, and biology. Our implementation is portable from tablets and laptops to supercomputers, and can efficiently scale up to tens of thousands of GPUs. In particular, we target the Department of Energy leadership computing facilities, Sierra and Summit, at the Livermore and Oak Ridge National Laboratories. These are over-100-petaflops supercomputers powered by IBM and NVIDIA hardware. We'll discuss the performance and scaling of our code, and its application to cancer biology research, materials science, and high-energy physics.
 
Keywords:
Computational Biology and Chemistry, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8489
Streaming:
Download:
 
Application of OpenACC to the Computer-Aided Drug Discovery Software Suite "Sanjeevini"
Bharatkumar Sharma (NVIDIA)
We will demonstrate the features and capabilities of OpenACC for porting and optimizing the ParDOCK docking module of the Sanjeevini suite for computer-aided drug discovery, developed at the HPC and Supercomputing Facility for Bioinformatics and Computational Biology at the Indian Institute of Technology Delhi. We used OpenACC to efficiently port the existing C++ code of the ParDOCK software, with minimal modifications, to run on the latest NVIDIA P100 GPU. These code modifications and tuning resulted in an average six-times speedup in turnaround time. With OpenACC, the code is now able to sample ten times more ligand conformations, leading to an increase in accuracy. The OpenACC-ported ParDOCK code now predicts a correct pose of a protein-ligand interaction 96.8 percent of the time, compared to 94.3 percent earlier (for poses under 1 Å), and 89.9 percent of the time, compared to 86.7 percent earlier (for poses under 0.5 Å).
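The appeal of the directive approach is that the original C++ loops stay intact. A hypothetical sketch of the pattern, with invented names and placeholder arithmetic standing in for ParDOCK's actual scoring code:

    // Evaluate every ligand pose on the GPU; one parallel iteration per pose.
    void score_all_poses(const float* poses, float* energies,
                         int nposes, int natoms)
    {
        #pragma acc parallel loop copyin(poses[0:nposes*natoms*3]) \
                                  copyout(energies[0:nposes])
        for (int p = 0; p < nposes; ++p) {
            float e = 0.0f;
            // A real scoring loop would accumulate pairwise interaction
            // terms here; this sum is only a placeholder.
            for (int a = 0; a < natoms; ++a)
                e += poses[(p * natoms + a) * 3];
            energies[p] = e;
        }
    }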
 
Keywords:
Computational Biology and Chemistry, Performance Optimization, Bioinformatics & Genomics, GTC Silicon Valley 2018 - ID S8188
Download:
 
Porting VASP to GPUs with OpenACC
Stefan Maintz (NVIDIA), Markus Wetzstein (NVIDIA)
VASP is a software package for atomic-scale materials modeling. It's one of the most widely used codes for electronic-structure calculations and first-principles molecular dynamics. We'll give an overview and status of porting VASP to GPUs with OpenACC. Parts of VASP were previously ported to CUDA C with good speed-ups on GPUs, but also with an increase in the maintenance workload, as VASP is otherwise written wholly in Fortran. We'll discuss OpenACC performance relative to CUDA, the impact of OpenACC on VASP code maintenance, and challenges encountered in the port related to management of aggregate data structures. Finally, we'll discuss possible future solutions for data management that would simplify both new development and maintenance of VASP and similar large production applications on GPUs.
 
Keywords:
Computational Biology and Chemistry, GTC Silicon Valley 2018 - ID S8750
Streaming:
Download:
 
Accelerating Molecular Modeling Tasks on Desktop and Pre-Exascale Supercomputers
John Stone (University of Illinois at Urbana Champaign)
We'll showcase recent successes in the use of GPUs to accelerate challenging molecular simulation analysis tasks on the latest Volta-based Tesla V100 GPUs on both Intel and IBM/OpenPOWER hardware platforms, and with large scale runs on petascale computers such as ORNL Summit. We'll highlight the performance benefits obtained from die-stacked memory on Tesla V100, the NVLink interconnect on the IBM OpenPOWER platforms, and the use of advanced features of CUDA, Volta's new Tensor units, and just-in-time compilation to increase the performance of key analysis algorithms. We'll present results obtained with OpenACC parallel programming directives, current challenges, and future opportunities. Finally, we'll describe GPU-accelerated machine learning algorithms for tasks such as clustering of structures resulting from molecular dynamics simulations.
 
Keywords:
Computational Biology and Chemistry, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8709
Streaming:
 
Deep Learning for Molecular Docking
David Koes (University of Pittsburgh)
Molecular docking is an important tool for computational drug discovery that aims to predict the binding pose of a ligand (drug) to a target protein. Identifying a correctly oriented pose requires a scoring function that has a global optimum close to the experimentally observed pose. It should also be differentiable with respect to atomic positions so that it can be used for gradient-based pose optimization. We'll describe a differentiable, grid-based convolutional neural network scoring function and explore its application in an end-to-end GPU-optimized molecular docking workflow. We'll show that convolutional neural networks trained on experimental data can successfully identify correct binding modes and meaningfully rank and score compounds. We'll also describe several visualization approaches that map the CNN score back to the atomic inputs to help guide medicinal chemistry optimization and provide insight into the functioning of the neural network. The entirety of our approach is available under an open-source license as part of our gnina package (https://github.com/gnina).
 
Keywords:
Computational Biology and Chemistry, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8540
Streaming:
Download:
Computational Fluid Dynamics
Presentation
Media
An Optimized GPU Implementation for CFD on Unstructured Grids
Mohammad Zubair (Old Dominion University)
NASA Langley Research Center's FUN3D computational fluid dynamics (CFD) software is used to solve the Navier-Stokes (NS) equations for a broad range of aerodynamics applications across the speed range. Accurate and efficient simulations of complex aerodynamic flows are challenging and require significant computational resources. We'll describe our experiences in developing an optimized GPU implementation for CFD on unstructured grids. The most costly kernels for a large FUN3D simulation are generally the matrix assembly and the block-sparse linear solver. These computations have low arithmetic intensity and require atomic updates, for which GPUs with high memory bandwidth and efficient atomic operations are well suited.
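To see why atomics matter here, consider a hypothetical edge-based assembly kernel (invented for illustration, not FUN3D's code): on an unstructured grid many edges touch the same node concurrently, and atomicAdd keeps the scattered accumulations correct without resorting to graph coloring.

    // Each edge scatters a flux contribution into its two endpoint nodes.
    // Note: atomicAdd on double requires compute capability 6.0 or later.
    __global__ void assemble_edges(const int2* edges, const double* flux,
                                   double* residual, int nedges)
    {
        int e = blockIdx.x * blockDim.x + threadIdx.x;
        if (e >= nedges) return;

        int2 ed = edges[e];
        double f = flux[e];
        atomicAdd(&residual[ed.x],  f);   // concurrent edges may hit the
        atomicAdd(&residual[ed.y], -f);   // same node; atomics serialize it
    }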
 
Keywords:
Computational Fluid Dynamics, Performance Optimization, GTC Silicon Valley 2018 - ID S8411
Streaming:
Download:
 
Tools for Improving Cross-Platform Software Development
Eric Kelmelis (EM Photonics)
Building software for the wide variety of heterogeneous computers often requires writing multiple versions of everything from low-level computational kernels to high-level problem partitioning and communication schemes. Recently, EM Photonics has undertaken several efforts to develop tools to assist developers in this work. These tools have two primary focuses: 1) to ease the process of developing cross-platform, mixed-device software, and 2) to allow application developers to focus more on their specific domain expertise than on the intricacies of building efficient, scalable software. In this talk, we will provide an overview of the tools we have developed and discuss their use in real-world applications. In particular, we will present our work with the climate modeling and computational fluid dynamics teams at NASA.
 
Keywords:
Computational Fluid Dynamics, Tools and Libraries, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8239
Streaming:
Download:
 
Advances in Discrete Element Particle Modelling Using the GPU Based Code Blaze-DEM
Nicolin Govender (RCPE/University of Surrey), Daniel Wilke (Department of Mechanical and Aeronautical Engineering, University of Pretoria)
In this talk we will look at advances in the simulation of particulate systems in computer-aided engineering (CAE) applications. We will focus in particular on the discrete element method (DEM) and the strides made in terms of the number of particles and particle shapes using the GPU-based code Blaze-DEM. A variety of industrial applications, ranging from mining, agriculture, and civil engineering to pharmaceuticals, will be discussed. We will also touch on how we can leverage the next wave of GPU computing, namely half precision and tensor cores, in scientific computing, which is still predominantly double-precision based. Finally, we look at the work being done by various groups to create a multi-physics GPU-based platform using Blaze-DEM.
 
Keywords:
Computational Fluid Dynamics, Computer Aided Engineering, GTC Silicon Valley 2018 - ID S8348
Streaming:
Computational Physics
Presentation
Media
Breakthroughs in Astrophysics Enabled by NVIDIA GPU Technology
Brant Robertson (UC Santa Cruz)
The vast scales and complex physics of the universe pose a significant challenge for understanding how galaxies form and evolve. Theoretical astrophysicists attempt to model the physical processes that drive the formation of galaxies and other structures via supercomputer simulations, but the fidelity of these simulations is limited by computational power. With the advent of supercomputers powered by NVIDIA GPUs, astrophysical simulations have taken giant strides forward in their ability to model and understand the detailed properties of galaxies. I'll review some of our progress enabled by NVIDIA GPUs, including large-scale GPU-powered hydrodynamical simulations and deep learning applied to enormous astronomical surveys of galaxies.  Back
 
Keywords:
Computational Physics, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8677
Streaming:
Download:
 
Using HPC Computational Physics Tools for Advanced Engineering Simulations and Production Deployment (Presented by Amazon Web Services)
David Hinz (Western Digital Technologies, Inc.), David Pellerin (Amazon Web Services)
AWS offers the most powerful GPU-accelerated cloud infrastructure that delivers unparalleled computational efficiency for advanced engineering simulations and analysis, enabling High Performance Computing (HPC) workloads to run in the cloud at scale. This session features a real-world use case from the advanced product engineering team at Western Digital, which is using HPC solutions to model new technologies and capabilities prior to production. Western Digital's computational tools incorporate a description of the physics occurring during the HDD recording process and ultimately produce input to a recording sub-system channel model, which produces an error rate. The length scales involved in the recording model range from a few nanometers in the description of the recording media to microns in the description of the recording head. The power of the current generation of NVIDIA GPUs allows Western Digital to generate enough simulation data that the same recording sub-system channel model used in experiments can be employed in studies that include fabrication process variances.   Back
 
Keywords:
Computational Physics, HPC and AI, GTC Silicon Valley 2018 - ID S81041
Streaming:
Download:
 
Acceleration of a LLNL Production Fortran Application on SIERRA Supercomputer
Aaron Black (Lawrence Livermore National Laboratory)
The U.S. Department of Energy's (DOE) stockpile stewardship mission relies heavily on petascale simulations that have traditionally run on homogeneous-architecture supercomputers. The DOE and Lawrence Livermore National Lab's newest computer, SIERRA, which is scheduled to be the second most powerful supercomputer in the nation, is being installed and employs a heterogeneous architecture leveraging both IBM Power9 CPUs and NVIDIA Volta GPUs. This talk presents performance results for Teton, a mission-critical radiative transport application, as it is re-engineered to leverage heterogeneous computing platforms. The data structure and algorithm optimizations necessary to increase thread-level parallelism by a factor of 1,000 and achieve GPU, CPU, and network concurrency will also be discussed.  Back
 
Keywords:
Computational Physics, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8270
Streaming:
 
Multi GPU Parallel Processing with NVLINK
Wayne Mindle (CertaSIM, LLC)
Multi-GPU processing with the GP100 and NVLINK will be discussed using a hypervelocity impact problem. Multi-GPU processing has always been possible via the PCIe interface, which means communication between GPUs passes through the CPU. The NVLINK connection allows software to bypass this slower path and communicate directly between GPUs to improve performance. An SPH solver, a particle-based method, is used to solve the hypervelocity problem. The SPH solver does all calculations on the GPU, so it is a perfect choice for comparing performance across past and present GPUs. Results for single- and multiple-GPU simulations on the K20, K40, P6000, and GP100 are presented.  Back
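As background on the method, here is a minimal NumPy sketch of the SPH density summation, the kind of per-particle kernel such a solver evaluates on the GPU with one thread per particle; the particle count and smoothing length are arbitrary.

import numpy as np

def w_cubic(r, h):
    # Standard cubic-spline smoothing kernel in 3D, compact support of 2h.
    q = r / h
    sigma = 1.0 / (np.pi * h**3)
    return sigma * np.where(q < 1, 1 - 1.5 * q**2 + 0.75 * q**3,
                   np.where(q < 2, 0.25 * (2 - q)**3, 0.0))

def density(pos, mass, h):
    rho = np.zeros(len(pos))
    for i in range(len(pos)):                  # on the GPU: one thread per particle
        r = np.linalg.norm(pos - pos[i], axis=1)
        rho[i] = np.sum(mass * w_cubic(r, h))  # neighbors beyond 2h contribute zero
    return rho

pos = np.random.rand(500, 3)
rho = density(pos, mass=np.full(500, 1e-3), h=0.1)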
 
Keywords:
Computational Physics, Computer Aided Engineering, GTC Silicon Valley 2018 - ID S8888
Streaming:
 
Breaking Through the Barriers to GPU Accelerated Monte Carlo Particle Transport
Jeremy Sweezy (Los Alamos National Laboratory)
A new method of accelerating Monte Carlo particle transport with GPUs will be presented that can be implemented in modern and legacy Monte Carlo codes with little development cost. Two major barriers exist for accelerating Monte Carlo particle transport with GPUs: high development costs and limited performance. World-class Monte Carlo particle transport codes require decades of development, and completely re-writing such codes for the high-performance computing platform du jour is not practical. A review of seven implementations of Monte Carlo neutron transport on GPUs indicates a performance wall of 4.5 times the speed of 8 CPU cores. The new method, which is based on ray casting, calculates neutron and photon fluence tallies on the GPU while the random walk is maintained on the CPU. This method significantly lowers the software development cost and increases performance. A performance increase of 7 times the performance of 8 CPU cores has been demonstrated for the calculation of neutron fluence in a Pressurized Water Reactor (PWR) fuel assembly. For photons, performance increases of up to 29 times have been demonstrated when simulating both medical and industrial radiography.  Back
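A toy Python rendition of the hybrid idea (all cross sections and geometry invented): the random walk stays in the CPU loop, while each collision "ray casts" a next-event-style uncollided-flux contribution to a tally point. The actual method tallies fluence over mesh cells on the GPU rather than at a single point.

import numpy as np

rng = np.random.default_rng(0)
sigma_t, sigma_s = 1.0, 0.6        # total / scattering macroscopic cross sections
tally_point = np.array([5.0, 0.0, 0.0])
n_hist, fluence = 10_000, 0.0

for _ in range(n_hist):            # particle histories (random walk on the CPU)
    pos = np.zeros(3)
    direction = np.array([1.0, 0.0, 0.0])
    weight = 1.0
    while weight > 1e-3:
        pos = pos + direction * rng.exponential(1.0 / sigma_t)  # free flight
        d = np.linalg.norm(tally_point - pos)
        # Ray-cast tally: attenuated probability of streaming to the point.
        fluence += weight * np.exp(-sigma_t * d) / (4 * np.pi * d**2)
        weight *= sigma_s / sigma_t          # implicit capture
        v = rng.normal(size=3)               # isotropic scatter
        direction = v / np.linalg.norm(v)

print("estimated fluence at tally point:", fluence / n_hist)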
 
Keywords:
Computational Physics, Medical Imaging and Radiology, GTC Silicon Valley 2018 - ID S8427
Streaming:
Download:
 
A Look into the Future of X-Ray Imaging: When Ptychography Meets GPU Acceleration
Pablo Enfedaque (Lawrence Berkeley National Laboratory), Stefano Marchesini (Lawrence Berkeley National Laboratory)
Discover the main benefits and challenges of ptychography imaging, and learn how to overcome its high computational demands using state-of-the-art algorithms and massively parallel GPU solutions. Nowadays, X-ray imaging allows biologists to retrieve the atomic arrangement of proteins and gives doctors the ability to view broken bones in full detail. In this context, ptychography has emerged as a reference technique: resolution of a billionth of a meter, a macroscopic field of view, and the capability to retrieve chemical, orbital, electronic, or magnetic contrast. The frenetic rate of improvement in X-ray light sources has already outpaced Moore's law, enabling novel discoveries that involve a never-ending increase of experimental data: faster frame rates, better resolution, higher dimensionality. By taking advantage of massively parallel computing, GPUs allow us to keep pace with the discovery process and provide scientists with a real-time peek at the nanoworld. We'll explore the present and future of X-ray imaging, exemplified with SHARP, a CUDA/MPI-based solution developed within the CAMERA team at LBNL to produce low-latency, high-throughput ptychography reconstructions.  Back
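For intuition about the reconstruction loop, here is one ePIE-style object update in NumPy, the kind of operation a ptychography solver repeats over millions of diffraction patterns. It is only a sketch: SHARP's CUDA/MPI solver is far more sophisticated, and the probe, scan position, and sizes below are invented.

import numpy as np

def epie_update(obj, probe, pattern, top, left, alpha=1.0):
    view = obj[top:top + probe.shape[0], left:left + probe.shape[1]]
    exit_wave = probe * view                   # exit wave at this scan position
    f = np.fft.fft2(exit_wave)
    f = np.sqrt(pattern) * np.exp(1j * np.angle(f))   # keep phase, impose modulus
    corrected = np.fft.ifft2(f)
    # Gradient-style object update, weighted by the probe intensity.
    view += alpha * np.conj(probe) * (corrected - exit_wave) / (np.abs(probe)**2).max()
    return obj

obj = np.ones((256, 256), dtype=complex)
ax = np.linspace(-2, 2, 64)
probe = np.exp(-ax[:, None]**2 - ax[None, :]**2).astype(complex)
pattern = np.abs(np.fft.fft2(probe))**2        # synthetic measured intensity
obj = epie_update(obj, probe, pattern, top=96, left=96)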
 
Keywords:
Computational Physics, Video and Image Processing, GTC Silicon Valley 2018 - ID S8284
Streaming:
Download:
 
Accelerated Deep Learning Discovery in Fusion Energy Science
William Tang (Princeton University)
Deep learning/artificial intelligence methods are increasingly being deployed to enable new avenues of big-data-driven discovery in key scientific application areas, such as the quest to deliver fusion energy, identified by the 2015 CNN "Moonshots for the 21st Century" series as one of five prominent modern grand challenges. Princeton University's associated R&D methods have been successfully applied to accelerate progress in reliably predicting and avoiding large-scale losses (called "disruptions") of the thermonuclear plasma fuel in magnetically confined devices, the largest of which is the $25B international ITER device, a burning plasma experiment under construction with the potential to exceed "breakeven" fusion power (i.e., power out = power in) by a factor of 10 or more.  Back
 
Keywords:
Computational Physics, GTC Silicon Valley 2018 - ID S81002
Streaming:
Download:
 
Solar Storm Modeling using OpenACC: From HPC Cluster to "In-House"
Ronald Caplan (Predictive Science Inc.)
We explore using OpenACC to migrate applications required for modeling solar storms from CPU HPC clusters to an "in-house" multi-GPU system. We describe the software pipeline and the use of OpenACC in the computationally heavy codes. A major step forward is the initial implementation of OpenACC in our magnetohydrodynamics code MAS. Strategies for overcoming some of the difficulties encountered are discussed, including handling Fortran derived types, array reductions, and performance tuning. Production-level time-to-solution results will be shown for multi-CPU and multi-GPU systems of various sizes. The timings show that it is possible to achieve acceptable times to solution on a single multi-GPU server or workstation for problems that previously required multiple HPC CPU nodes.  Back
 
Keywords:
Computational Physics, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8847
Streaming:
Download:
Computer Aided Engineering
Presentation
Media
Leveraging NVIDIA Quadro Virtual Data Center Workstation (Quadro vDWS) to Provide Horsepower to Virtual CAD Workstations
Wesley Struble (DENSO International America, Inc.), Varick Teller (DENSO International North America)
Learn about the requirements gathering, solution analysis, benchmarking, and user testing techniques we implemented to decide on a virtual workstation configuration that met all the requirements of our CAD users while meeting the density requirements to remain cost-effective. DENSO International America began investigating the feasibility of implementing CAD on VDI to simplify its workstation ecosystem in 2013. Several options were explored, but it wasn't until NVIDIA Quadro vDWS was released that we found a solution that met our performance and density requirements. We'll discuss our journey from vSGA/vDGA to vGPU. Starting with Kepler and continuing through the latest Pascal architecture, we've performed benchmarking and user testing of each generation to demonstrate its benefits.

  Back
 
Keywords:
Computer Aided Engineering, Product & Building Design, GPU Virtualization, GTC Silicon Valley 2018 - ID S8435
Streaming:
Download:
 
Disrupting 3D Design - GPU Based Real-Time Simulation for Rapid Concepting
Justin Hendrickson (ANSYS)
Join us for an exciting presentation unveiling the latest use of GPU technology to aid real-time engineering simulation. You will see a new technology, called ANSYS Discovery Live, that provides instant, invaluable feedback, promoting engineering designs that are more optimized and better understood than previously possible. Rather than consuming time on non-value-added tasks, engineers can turn the design process into an interactive, educational experience. The marriage of simulation technology with the technological advances of NVIDIA graphics is fundamentally changing the way products are designed and developed. The possibilities with this technology are endless.  Back
 
Keywords:
Computer Aided Engineering, Real-Time Graphics, GTC Silicon Valley 2018 - ID S8438
Streaming:
Computer Vision
Presentation
Media
SmartSense: Real-Time, Field-Deployed CV Traffic Analysis System
Justin Eichel (Miovision)
Miovision presents a video-based traffic analytics system, capable of tracking and classifying vehicles in real time throughout cities. The system leverages Jetson TX2 modules and inferencing to accurately classify vehicles at over 50 frames per second using single-shot multibox detection and DAC, a VGG-based network. We'll cover many of the issues our teams went through to design and implement the system, including data collection, annotation, training, incorporating continuous training, and deep learning iteration. We'll also illustrate how the measured traffic trends were used to reduce congestion and evaluate the health of traffic corridors.

  Back
 
Keywords:
Computer Vision, Intelligent Video Analytics and Smart Cities, Autonomous Machines, GTC Silicon Valley 2018 - ID S8383
Streaming:
Download:
 
Learning-Free Universal Style Transformer
Chen Fang (Adobe Research)
Universal style transfer aims to transfer any arbitrary visual style to content images. Existing feed-forward methods, while enjoying inference efficiency, are mainly limited by an inability to generalize to unseen styles or by compromised visual quality. We'll present a simple yet effective method that tackles these limitations without training on any predefined styles. The key ingredient of our method is a pair of feature transforms -- whitening and coloring -- embedded in an image reconstruction network. The whitening and coloring transforms reflect a direct matching of the feature covariance of the content image to that of a given style image, which shares a similar spirit with the optimization of the Gram-matrix-based cost in neural style transfer. We demonstrate the effectiveness of our algorithm by generating high-quality stylized images, with comparisons to a number of recent methods. We also analyze our method by visualizing the whitened features and synthesizing textures via simple feature coloring.  Back
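The whitening-coloring transform itself is compact enough to sketch in NumPy. Here it operates on flattened feature maps of shape channels x positions; in the actual method these would be VGG encoder features, and the random arrays below are stand-ins.

import numpy as np

def wct(content_feat, style_feat, eps=1e-5):
    fc = content_feat - content_feat.mean(axis=1, keepdims=True)
    ms = style_feat.mean(axis=1, keepdims=True)
    fs = style_feat - ms

    # Whitening: remove the content feature correlations.
    ec, dc = np.linalg.eigh(fc @ fc.T / (fc.shape[1] - 1) + eps * np.eye(len(fc)))
    whitened = dc @ np.diag(ec.clip(min=eps) ** -0.5) @ dc.T @ fc

    # Coloring: impose the style covariance, then restore the style mean.
    es, ds = np.linalg.eigh(fs @ fs.T / (fs.shape[1] - 1) + eps * np.eye(len(fs)))
    return ds @ np.diag(es.clip(min=eps) ** 0.5) @ ds.T @ whitened + ms

content = np.random.rand(64, 32 * 32)   # e.g., flattened relu3_1 features
style = np.random.rand(64, 32 * 32)
stylized = wct(content, style)          # fed to the image reconstruction decoder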
 
Keywords:
Computer Vision, Graphics and AI, Real-Time Graphics, GTC Silicon Valley 2018 - ID S8117
Streaming:
Download:
 
Using Multimodal Learning for TV Show Summarization
Yonghua Lin (IBM Research China), Qing Wang (IBM Research China)
We'll explore new techniques for TV show summarization using multimodal deep learning for saliency detection and fusion. For TV show summarization, the goal is to produce a compact visual summary that is informative and enjoyable enough to attract an audience. In our work, we propose a multimodal summarization platform that integrates the multimodal saliences learned from video, audio, and text. Our work focuses on three aspects: 1) saliency extraction for video, audio, and text using deep learning networks; 2) fusion framework design for multimodal information integration; and 3) developing tools to speed up video processing. Using AI Vision, a public cloud-based AI service, we summarize an 11-hour TV show in one minute.  Back
 
Keywords:
Computer Vision, Intelligent Video Analytics and Smart Cities, Video and Image Processing, GTC Silicon Valley 2018 - ID S8221
Streaming:
 
Advancing Representation Learning for Language and Vision
Hideki Nakayama (The University of Tokyo)
As an NVAIL partner, the Machine Perception Group at the University of Tokyo focuses on various research areas of AI, particularly natural language processing, computer vision, and their cross-disciplinary domain. Since deep learning has revolutionized all these fields, one of the core issues has been how to effectively extract powerful semantic representations from low-level inputs in an end-to-end manner. Indeed, remarkable progress has been made on this point in recent years, enabling many spectacular cross-modal applications. In this talk, we will introduce several research projects in our group related to representation learning for language and vision, and discuss future directions.  Back
 
Keywords:
Computer Vision, Speech and Language Processing, GTC Silicon Valley 2018 - ID S8683
Streaming:
 
Displaying and Interacting with Desktop Apps in VR
Rouslan Dimitrov (VR Toolbox)
Displaying traditional desktop applications in virtual reality requires techniques to overcome the limited resolution of current displays while simultaneously taking advantage of the 360-degree real estate. Interaction with these applications is aided by gestures made with the controllers and hands. We'll go over the use of mixed reality for easier keyboard typing when necessary, general safety, and finding things nearby, such as cables, chairs, and coffee. All techniques described are implemented in the commercially available software VR Toolbox.  Back
 
Keywords:
Computer Vision, Virtual Reality and Augmented Reality, Performance Optimization, GTC Silicon Valley 2018 - ID S8215
Streaming:
Download:
 
3D Convolutional Neural Networks (CNNs) with Fast and Memory Efficient Cross-Hair Filters
Marie Piraud (Technical University of Munich)
Over the years, state-of-the-art architectures have been built with convolutional layers and employed successfully on 2D image processing and classification tasks. This success naturally calls for the extension of 2D convolutional layers to 3D convolutional layers to handle higher-dimensional tasks in the form of video and 3D volume processing. However, this extension comes with an exponential increase in the number of computations and parameters in each convolutional layer. Because of these problems, 2D convolutional layers are still widely used to handle 3D images, at the cost of discarding 3D context information. In view of this, we'll present a 3D fully convolutional neural network (FCNN) with 2D orthogonal cross-hair filters that makes use of 3D context information while avoiding the exponential scaling described above. By replacing 3D filters with 2D orthogonal cross-hair filters, we achieve over 20% improvement in execution time and a 40% reduction in the overall number of parameters while accuracy is preserved.  Back
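A hedged PyTorch sketch of the idea: replace a k x k x k 3D convolution with three orthogonal planar convolutions whose outputs are summed, so parameters scale with 3k^2 instead of k^3. The channel counts and volume size are arbitrary.

import torch
import torch.nn as nn

class CrossHair3d(nn.Module):
    # Three 2D filters through orthogonal planes of the volume, summed.
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        p = k // 2
        self.xy = nn.Conv3d(c_in, c_out, (1, k, k), padding=(0, p, p))
        self.xz = nn.Conv3d(c_in, c_out, (k, 1, k), padding=(p, 0, p))
        self.yz = nn.Conv3d(c_in, c_out, (k, k, 1), padding=(p, p, 0))

    def forward(self, x):                 # x: (batch, c_in, depth, height, width)
        return self.xy(x) + self.xz(x) + self.yz(x)

layer = CrossHair3d(1, 8, k=5)               # 3*25 weights per filter instead of 125
out = layer(torch.randn(2, 1, 32, 32, 32))   # -> (2, 8, 32, 32, 32)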
 
Keywords:
Computer Vision, Graphics and AI, Deep Learning and AI Frameworks, Video and Image Processing, GTC Silicon Valley 2018 - ID S8318
Streaming:
 
Computational Zoom: A Framework to Manipulate Image Composition in Post-Capture
Orazio Gallo (NVIDIA)
Telling the right story with a picture requires the ability to create the right composition. Two critical parameters controlling composition are the camera position and the focal length of the lens. The traditional paradigm to capture a picture is for a photographer to mentally visualize the desired result, select the capture parameters to produce it, and finally take the photograph, thus committing to a particular composition. To break this paradigm, we introduce computational zoom, a framework that allows a photographer to manipulate several aspects of composition in post-capture. Our approach also defines a multi-perspective camera that can generate compositions that are not attainable with a physical lens. Our framework requires a high-quality estimation of the scene's depth. Existing methods to estimate 3D information generally fail to produce dense maps, or sacrifice depth uncertainty to avoid missing estimates. We propose a novel GPU-based depth estimation technique that outperforms the state of the art in terms of quality, while ensuring that each pixel is associated with a depth value.  Back
 
Keywords:
Computer Vision, Video and Image Processing, GTC Silicon Valley 2018 - ID S8253
Streaming:
 
Accelerated Functional Mapping of World with NVIDIA GPUs and Deep Learning
Christopher Layton (Oak Ridge National Laboratory), Dalton Lunga (Oak Ridge National Laboratory), H. Lexie Yang (Oak Ridge National Laboratory)
The functional mapping of man-made facilities from high-resolution remote sensing images provides timely, high-fidelity land-use information and population distribution estimates, which helps federal and non-governmental agencies and industry operate more efficiently. We'll share our journey to deliver functional maps of the world, including building extraction, human settlement maps, mobile home parks, and facility mapping using a variety of remote sensing imagery. Our research addresses three frontier challenges: 1) distinct characteristics of remote sensing data for deep learning, including the model distribution shifts encountered with remote sensing images, multisensor sources, and data multimodality; 2) training very large deep learning models using multi-GPU and multi-node HPC platforms; and 3) large-scale inference using ORNL's Titan and Summit with NVIDIA TensorRT. We'll also talk about developing workflows to minimize I/O inefficiency, performing parallel gradient-descent learning, and managing remote sensing data in an HPC environment.  Back
 
Keywords:
Computer Vision, GIS, HPC and AI, GTC Silicon Valley 2018 - ID S8420
Streaming:
Download:
 
Teaching Machines to See, Communicate, and Act
Sanja Fidler (University of Toronto)
A successful autonomous system needs to not only understand the visual world but also communicate its understanding with humans. To make this possible, language can serve as a natural link between high level semantic concepts and low level visual perception. We'll discuss recent work in the domain of vision and language, covering topics such as image/video captioning and retrieval, and question-answering. We'll also talk about our recent work on task execution via language instructions.

  Back
 
Keywords:
Computer Vision, GTC Silicon Valley 2018 - ID S8238
Streaming:
Download:
 
A Deep Neural Network for Estimating Depth from Stereo
Alexey Kamenev (NVIDIA), Nikolai Smolyanskiy (NVIDIA)
We present a deep neural network architecture for estimating 3D depth from stereo images. The network is modeled after computer vision stereo matching pipelines to simplify the training process. Our loss function consists of a photometric loss term and lidar-based loss terms. This combination makes it possible to train our DNN in a supervised, semi-supervised, or completely unsupervised way. Our DNN produces depth maps with accuracy similar to lidar-based depth. We also compare our stereo DNN architecture to other stereo architectures, as well as to a monocular depth DNN architecture. We demonstrate qualitative and quantitative test results.  Back
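To make the photometric term concrete, here is an illustrative PyTorch version: warp the right image into the left view using the predicted disparity and penalize the reconstruction error. The tensor shapes and constant disparity are placeholders, and the full loss described in the talk adds the lidar-supervised terms.

import torch
import torch.nn.functional as F

def photometric_loss(left, right, disparity):
    b, _, h, w = left.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).expand(b, h, w, 2).clone()
    grid[..., 0] = grid[..., 0] - 2 * disparity / w   # shift x by disparity (pixels)
    warped = F.grid_sample(right, grid, align_corners=True)
    return (left - warped).abs().mean()               # L1 reconstruction error

left = torch.rand(1, 3, 64, 128)
right = torch.rand(1, 3, 64, 128)
disparity = torch.full((1, 64, 128), 4.0)   # would come from the stereo DNN
loss = photometric_loss(left, right, disparity)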
 
Keywords:
Computer Vision, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8660
Streaming:
Consumer Engagement and Personalization
Presentation
Media
Cadillac in VR
Mike Konchalski (All Things Media)
"Cadillac in VR" is the premiere VR showroom experience. In our presentation we want to highlight the needs Cadillac came to us with, our approach for creating this experience, key challenges we faced during development, our final results, ...Read More
"Cadillac in VR" is the premiere VR showroom experience. In our presentation we want to highlight the needs Cadillac came to us with, our approach for creating this experience, key challenges we faced during development, our final results, and what this might mean for the future of car buying. The needs we discuss will involve key points of change in the automotive industry and how Cadillac wanted to adapt to those changes. Our approach will touch on how we established our underlying philosophy which guided our decision making process throughout development. Following that, we will dive deeper into the technical challenges we faced while developing the experience. The environment, level of detail, lighting, UX/UI, and hardware are all key areas of discussion. We hope to have someone on stage at this point with the experience running to further add emphasis and clarification. Finally, we'll cover how all this came together in our final product and where we think it might take the future of buying a car.  Back
 
Keywords:
Consumer Engagement and Personalization, Virtual Reality and Augmented Reality, GTC Silicon Valley 2018 - ID S8650
Streaming:
Download:
 
Juicing Up Ye Olde GPU Monte Carlo Code
Richard Hayden (JP Morgan Chase), Oleg Rasskazov (JP Morgan Chase)
We'll discuss the GPU-accelerated Monte Carlo compute at JP Morgan, which was architected for C1060 cards and revamped a few times as new architectures were released. The key features of the code are exclusive use of double precision, data caching, and a code structure in which a significant amount of CPU pre-compute is followed by running multiple GPU kernels. On the latest devices, memory per flop is a throughput-limiting factor for a class of our GPU-accelerated models. As the byte/flop ratio continues to fall from one GPU generation to the next, we are exploring ways to re-architect the Monte Carlo simulation code to decrease memory requirements and improve the TCO of the GPU-enabled compute. Obvious next steps are to store less, re-calculate more, and use unified memory.

  Back
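As a toy illustration of the store-less/re-calculate-more trade-off (not JPMorgan code; the model and numbers are invented), compare materializing every Monte Carlo path against streaming only the running state and the running statistic:

import numpy as np

n_paths, n_steps = 1_000_000, 252
s0, mu, sigma, dt = 100.0, 0.01, 0.2, 1.0 / 252
rng = np.random.default_rng(42)

# Memory-heavy alternative: keep every step of every path (~2 GB in double
# precision) just to extract a path-wise maximum at the end.

# Memory-light version: carry only the current state and the running maximum,
# which is all a lookback payoff actually needs.
s = np.full(n_paths, s0)
running_max = s.copy()
for _ in range(n_steps):
    z = rng.standard_normal(n_paths)
    s *= np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
    np.maximum(running_max, s, out=running_max)

payoff = np.maximum(running_max - 110.0, 0.0).mean()   # lookback call, strike 110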
 
Keywords:
Consumer Engagement and Personalization, Finance - Quantitative Risk Management, GTC Silicon Valley 2018 - ID S8802
Download:
 
Finding the Right Dress at Scale at Rent The Runway
Saurabh Bhatnagar (Rent The Runway)
Rent The Runway gets millions of visitors every day. We serve them personalized recommendations based on browser behavior and some explicit feedback. To add to the complexity, we have multiple membership programs. Fashion poses unique challenges around seasonality, fit, and feedback, and our business model ties order fulfillment and reservations together in an unusual way. We have moved to a GPU-first infrastructure to scale, instead of Spark clusters, and will discuss how we are moving to power all our algorithms this way.

  Back
 
Keywords:
Consumer Engagement and Personalization, GTC Silicon Valley 2018 - ID S8724
Streaming:
Download:
Cyber Security
Presentation
Media
Analyzing Sequences of Time Series Security Data with Recurrent Residual Networks
Ivko Cvejic (US Bank), Leon DeFrance (US Bank)
Analyzing time series data from security controls for signs of malicious activity is a common challenge in financial networks. We show how one tool, a recurrent residual deep learning (DL) model, can be used to rapidly analyze variable-length time series data and achieve meaningful analysis. Recurrent networks have long been a popular choice in DL for analyzing data with multiple time steps, where the meaning of data at one point in time depends on data at other time steps. For example, natural language processing solutions frequently utilize recurrent DL models to achieve state-of-the-art results in classification tasks. However, recurrent models are often plagued by training difficulties that grow with model depth, and these issues are exacerbated by the desire to create very deep models for particularly difficult tasks. Applying the ResNet concept developed by Microsoft Research to a recurrent model, we show how models analyzing long sequences can achieve state-of-the-art results with fewer parameters and faster training times.  Back
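A minimal sketch of a recurrent residual cell in PyTorch, assuming the simplest formulation h_{t+1} = h_t + f([x_t, h_t]); the layer sizes, sequence length, and two-class head are invented for illustration.

import torch
import torch.nn as nn

class ResidualRNNCell(nn.Module):
    # The hidden state is updated by adding a learned delta; the identity
    # skip eases gradient flow through long unrolls, as in ResNet.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(input_size + hidden_size, hidden_size),
                               nn.Tanh())

    def forward(self, x, h):
        return h + self.f(torch.cat([x, h], dim=-1))

cell = ResidualRNNCell(input_size=16, hidden_size=32)
h = torch.zeros(8, 32)                   # batch of 8 event sequences
for x_t in torch.rand(100, 8, 16):       # variable-length time series
    h = cell(x_t, h)
logits = nn.Linear(32, 2)(h)             # e.g., benign vs. malicious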
 
Keywords:
Cyber Security, Finance, GTC Silicon Valley 2018 - ID S8656
Streaming:
 
Network Security with Machine Learning
Ashrith Barthur (H2O.ai)
Connections have behavioral patterns that are unique to protocols, loads, window sizes, and the type of traffic. A CDN enterprise behaves completely differently from a cloud service company, and both behave differently from a corporation. This also means that attack vectors and attack landscapes differ across all these environments. We'll speak about modeling different kinds of attacks and building a model that is able to identify these different kinds of attacks using machine learning. The ability to bring in the expertise of a network domain expert in Driverless AI allows for quickly iterating through valuable features across the data space. The ability to harness NVIDIA's powerful GPU cores and the extremely optimized CUDA libraries changes the rate at which new, accurate models are built for identifying attacks across the internet or a corporate network. This is truly valuable for anyone defending against attacks on a variable attack surface.  Back
 
Keywords:
Cyber Security, Telecom Industry Solutions, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8145
Streaming:
Download:
Data Center and Cloud Infrastructure
Presentation
Media
Autodesk BIM Cloud Workspace on Azure and Citrix Customer Panel Discussion
Allen Furmanski (Citrix Systems), Adam Jull (IMSCAD Global), Marc Sleegers (Autodesk), Frank Wolbertus (TBI)
GPU virtualization in the cloud has ushered in a new era for architects, builders, designers, and engineers. In this case study session you will learn how TBI personnel are now using Autodesk applications, including BIM 360, Stingray, Revit, and Navisworks, through a digital workspace hosted on Citrix XenDesktop HDX 3D Pro running on Microsoft Azure NV-series virtual machines with NVIDIA Quadro workstation technology. This technology stack enables TBI employees to work together in real time, from any location, while enjoying a highly optimized 3D user experience on any device, even the low-cost Raspberry Pi. In their technology journey, TBI progressed from an age of 2D flatland, to the more advanced age of optimized 3D digital data, to the present-day era of interoperability and collaboration in a new age where connectivity is key. This session will also include a Citrix customer panel discussion. Hear from customers who have implemented virtualized 3D workloads to solve complex business challenges. Bring your questions along and join in on the knowledge sharing in an interactive setting.

  Back
 
Keywords:
Data Center and Cloud Infrastructure, Product & Building Design, GTC Silicon Valley 2018 - ID S8646
Streaming:
 
Spectre/Meltdown Impact on High Performance Workloads
Jeremy Eder (Red Hat)
The impact of the recent Spectre and Meltdown security vulnerabilities has reached every corner of the compute ecosystem. Red Hat's Performance Engineering team has a keen interest in quantifying a wide variety of workloads in order to provide feedback to the upstream developers working on these problems. This presentation will detail our team's involvement over the last several months, share selected performance impacts for a variety of common enterprise and HPC workloads, explain how to potentially mitigate the overheads, and describe what's being done to reduce impacts going forward.  Back
 
Keywords:
Data Center and Cloud Infrastructure, Performance Optimization, GTC Silicon Valley 2018 - ID S81017
Streaming:
Download:
 
Using Containers for GPU Workloads
Christian Brauner (Canonical Ltd.), Serge Hallyn (Cisco)
Learn how to use containers for efficient GPU utilization to achieve bare-metal performance for computationally intensive workloads. We'll show how NVIDIA tools and libraries can be used to achieve drop-in GPU support and efficient GPU feature integration for container runtimes, and illustrate how to leverage system containers to run complex statistical models on NVIDIA GPUs.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, GTC Silicon Valley 2018 - ID S8338
Streaming:
Download:
 
Building and Optimizing AI Cloud: Better Leveraging GPU in Container Cloud Infrastructure
Yubo Li (IBM Research China)
Recent years have witnessed the rapid growth of AI cloud and AI-as-a-service, along with this year's AI explosion. We'll report our continuing effort and progress on bringing NVIDIA GPUs to container clouds, optimizing GPU scheduling, and running AI workloads on a container cloud. First, building on the work we reported at GTC 2017, we will update our latest progress on new GPU features added to Kubernetes, including two advanced GPU schedulers and GPU resource namespace control. This year, we have brought GPU-enabled Kubernetes to IBM Cloud Private, the IBM commercial on-premise container cloud, and several other important IBM products, including our own IBM AI product PowerAI Vision. Meanwhile, we continue to share our technology with the open source community. Second, we want to share our lessons learned about how to design, manage, optimize, and operate an AI cloud, drawing on two years of product experience and user feedback.  Back
 
Keywords:
Data Center and Cloud Infrastructure, Deep Learning and AI Frameworks, HPC and AI, GTC Silicon Valley 2018 - ID S8287
Streaming:
 
Deep-Learning Inferencing on IBM Cloud with NVIDIA TensorRT
Larry Brown (IBM), Khoa Huynh (IBM)
We'll focus on deep learning neural network model deployment and inference on the IBM Cloud, and how well NVIDIA GPUs perform in this area compared to FPGAs that have been tuned for deep learning primitives. We believe this topic is very relevant today because, with the emergence of powerful new NVIDIA GPUs, more and more artificial intelligence has become part of our daily lives, from Siri, Alexa, language translation, and image recognition to self-driving cars. The cognitive era has truly begun. Toward this end, IBM has formed a close partnership with NVIDIA to offer GPU-enabled systems - both dedicated servers and on the cloud - to our customers and developers to run their cognitive workloads.  Back
 
Keywords:
Data Center and Cloud Infrastructure, Performance Optimization, GTC Silicon Valley 2018 - ID S8760
Streaming:
Download:
 
Introducing Krylov: AI Platform that Empowers eBay Data Science and Engineering Teams
Henry Saputra (eBay)
The Krylov Project is the key component in eBay's AI Platform initiative, providing an easy-to-use, open, and fast AI orchestration engine deployed as managed services in the eBay cloud. The main goals of the project are: every AI and machine learning algorithm should be shareable and easily implementable with a choice of frameworks; machine learning engineers should be able to build end-to-end training pipelines that distribute and parallelize over many machines; model training should be automated and allow easy access to vast eBay datasets; and engineers should be able to search past job submissions, view results, and share them with others. We have built Krylov from the ground up, leveraging the JVM, Python, and Go as the main technologies for the Krylov components, while standing on the shoulders of technology giants such as Docker, Kubernetes, and Apache Hadoop. Using Krylov, AI scientists can access eBay's massive datasets; build and train AI models; spin up powerful compute (high-memory or GPU instances) on the Krylov HPC cluster; and set up machine learning pipelines, for example using declarative constructs that stitch together the pipeline lifecycle.  Back
 
Keywords:
Data Center and Cloud Infrastructure, HPC and AI, GTC Silicon Valley 2018 - ID S8277
Streaming:
Download:
 
Impact of Storage System Performance on TensorFlow Data Ingestion
Mark Whitney (Rescale)
As multi-GPU deep learning performance improves, the performance of the storage system hosting a dataset becomes critical in keeping these GPUs fully utilized. We survey the different methods for providing training data to a TensorFlow application on a GPU, and benchmark data throughput for a variety of popular neural network architectures. We look at performance and potential bottlenecks for local storage technologies (SCSI SSD and NVMe), high performance network-attached file systems, TensorFlow native connectors (HDFS and S3), and FUSE-connected object storage.  Back
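As a concrete example of one such method, here is a typical tf.data ingestion pipeline over TFRecord shards (the file pattern and feature names are placeholders): the parallel interleave, parallel map, and prefetch stages are the knobs that determine how well storage keeps the GPUs fed.

import tensorflow as tf

files = tf.data.Dataset.list_files("/data/train-*.tfrecord")   # hypothetical path
ds = files.interleave(tf.data.TFRecordDataset,
                      cycle_length=8, num_parallel_calls=tf.data.AUTOTUNE)

def parse(example):
    feats = tf.io.parse_single_example(example, {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    })
    img = tf.io.decode_jpeg(feats["image"], channels=3)   # CPU-side decode
    return tf.image.resize(img, [224, 224]), feats["label"]

ds = (ds.map(parse, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(256)
        .prefetch(tf.data.AUTOTUNE))    # overlap host-side I/O with GPU compute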
 
Keywords:
Data Center and Cloud Infrastructure, NVIDIA Inception Program, HPC and AI, GTC Silicon Valley 2018 - ID S8544
Streaming:
 
An Architectural Design Firm's Journey through Virtual GPU Technology for Global Collaboration
Jimmy Rotella (CannonDesign), Andrew Schilling (CannonDesign)
Learn the benefits that virtualization provides for an architecture and engineering design firm, along with the journey through the advancements in virtualization technology it took to finally meet the graphics-intensive needs of our design software. We'll share our experiences in how virtualization allows a large company, with over 15 offices and 1,000 people worldwide, to collaborate and work as a single firm. We'll show some cost comparisons for virtualization, along with its management benefits and requirements. We'll also look at the methods we used to set and test metrics specific to our requirements, and follow the results of those metrics through the changes in graphics virtualization technology.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, GTC Silicon Valley 2018 - ID S8240
Streaming:
 
High-Performance Input Pipelines for Scalable Deep Learning
Brian Gold (Pure Storage)
Learn how to keep your GPUs fed with data as you train the next-generation of deep learning architectures. As GPU technology continues to advance, the demand for faster data continues to grow. In deep learning, input pipelines are responsible for a complex chain of actions that ultimately feed data into GPU memory: defining how files are read from storage, deserializing them into data structures, pre-processing on a CPU, and copying to the GPU. These pipelines bring together complex hardware systems--including cluster networks, peripheral interconnects, modern CPUs, and storage devices--along with sophisticated software systems to drive the data movement and transformation. In this talk, we present a new benchmark suite for evaluating and tuning input pipelines. We will examine results with TensorFlow's DataSets API on a DGX-1 with V100 and provide guidance on key tuning parameters and diagnostic techniques for improving performance.  Back
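In the same spirit, a minimal throughput probe can iterate an input pipeline with no model attached and report examples per second, isolating storage and CPU preprocessing from GPU work; the synthetic dataset below touches no real storage and therefore gives an upper bound.

import time
import tensorflow as tf

def measure(ds, batch_size, n_batches=100):
    it = iter(ds)
    next(it)                          # warm up file handles and caches
    start = time.perf_counter()
    for _ in range(n_batches):
        next(it)
    return n_batches * batch_size / (time.perf_counter() - start)

synthetic = tf.data.Dataset.from_tensors(
    (tf.zeros([224, 224, 3]), tf.zeros([], tf.int64))).repeat().batch(256)
print(f"{measure(synthetic, 256):.0f} examples/sec")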
 
Keywords:
Data Center and Cloud Infrastructure, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8948
Streaming:
 
Taking Virtual Graphics to Eleven
Rachel O'Gorman (Autodesk)
Autodesk has many well-known, large products. Our engineering and development teams run development and testing across a wide variety of operating systems and across versions of current and older products. Many developers have secondary systems, or in some cases three, four, or more. Most systems had high-end NVIDIA graphics cards. As is the nature of development, these secondary systems were used in cycles, often lying idle for weeks at a time. And despite the need for graphics in our products, we found that developers used only 50% of their graphics resources in their development cycle. By virtualizing those additional workstations, the graphics, CPU, and other resources were used more efficiently. And by not replacing the physical systems every three years, our cost avoidance rose rapidly. By early 2017, we were hearing from our customers that they wanted Autodesk to provide the same kind of service to them. We are piloting a platform that will provide this service to our enterprise customers. Behind the successful virtualization of desktops is a people story: mindsets, culture, and skill sets had to evolve and change.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, GTC Silicon Valley 2018 - ID S8916
Streaming:
 
SmartIO: Dynamic Sharing of GPUs and IO in a PCIe Cluster
Haakon Stensland (Simula Research Laboratory)
Learn how GPUs, NVMe drives, and other IO devices can be efficiently shared in a PCI Express cluster using SmartIO from Dolphin Interconnect Solutions. Traditionally, IO devices have been statically assigned to a single root complex (host machine), and features such as hot-add, device migration, and remote access are not supported in a flexible way without complex software frameworks. Dolphin SmartIO eliminates these restrictions and provides a flexible framework for handling PCIe devices and systems. Devices such as GPUs and NVMe drives can be flexibly accessed from remote systems. We demonstrate how SmartIO is implemented using standard PCIe and non-transparent bridging, show that our system achieves near-native performance when moving data from local GPUs to remote NVMe drives, and show how we can dynamically add more GPUs to scale performance.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8511
Streaming:
 
Using Tesla Data Center GPUs for Mixed Deployments of vGPU-Enabled HPC and Virtual Desktops
Friederich Devoir (NVIDIA), Steven Forsyth (NVIDIA), Douglas Holt (NVIDIA)
Discussion and demonstration of the potential of running HPC and VDI workloads on common clusters in a modern datacenter: a Dr. Jekyll and Mr. Hyde scenario. Explore the coexistence of CUDA-based HPC job engines with both Linux and Windows machines used for virtual desktop infrastructure. The demonstration will focus on a minimal VMware cluster deployment using VSAN storage to host both a Linux HPC multi-node cluster for CUDA workloads and a VMware Horizon View deployment of Linux and Windows virtual desktops performing DirectX, OpenGL, and CUDA-based visualization workloads as used by engineering and analysis power users.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, HPC and AI, GTC Silicon Valley 2018 - ID S8209
Streaming:
 
GPUs for Everyone: Why Optimize Windows 10 and Every Application with GRID
Jon Kelley (University of Arkansas)
With the switch to Windows 10, more applications are being developed with the assumption of a GPU being present. GPUs are in our desktops, laptops, tablets, and even in the mobile phones in our pockets. Why should VDI be any different? Come see how the University of Arkansas is giving everyone the fastest possible experience and opening doors to new ways of learning by serving up VDI desktops and applications with pervasive GPU access. When every app has GPU acceleration, the user experience is better than ever.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, GTC Silicon Valley 2018 - ID S8543
Streaming:
 
Transforming the AEC business with Cloud Workstations in Azure
Brad Peterson (Workspot), Gavin Quamme (Mead & Hunt), Tariq Sharif (Microsoft Azure)
AEC firms are turning to GPU cloud workstations for greater agility, security, and competitive advantage. Mead & Hunt achieved these benefits using Workspot's Workstation Cloud on Microsoft Azure. Attend this session to hear about their journey and how they quickly met the needs of both IT and their power users. Mead & Hunt provides innovative engineering, construction, and energy solutions. To sustain its leadership as one of the top design-build firms in the U.S., Mead & Hunt needs the best talent in the world, but opening multiple offices to support personnel across geographies is prohibitively complex. Mead & Hunt was unable to find a viable solution that would allow users to remotely access their BIM tools, including Autodesk Revit and Navisworks. Workspot Workstation Cloud on Microsoft Azure, powered by NVIDIA GPUs, was the answer. Now Mead & Hunt designers, engineers, and finishers can collaborate in real time - whether from the project site, the office, or home. These power users now securely access BIM tools from anywhere, and IT can provision a new user within minutes. Learn how Mead & Hunt transformed its business with Workspot, Microsoft Azure, and NVIDIA.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, GTC Silicon Valley 2018 - ID S8774
Streaming:
Download:
 
HPC in Containers - Why Containers, Why HPC, How and Why NVIDIA
Christopher Newburn (NVIDIA)
Are you wondering whether the cloud is relevant to HPC and how it works? Increasingly, applications in high-performance computing are using containers to ease deployment. In this talk, you'll learn what containers are, how they are orchestrated to run together in the cloud, and how communication among containers works. You'll get a snapshot of current support from the ecosystem, and gain insight into why NVIDIA is leading the charge to provide the best performance and usability.  Back
 
Keywords:
Data Center and Cloud Infrastructure, HPC and AI, GTC Silicon Valley 2018 - ID S8642
Streaming:
Download:
 
Pooling and Orchestrating NVIDIA Jetson for AI and Deep Learning on the Edge
Sumit Puri (Liqid Inc.)
Attendees will learn how NVIDIA's Jetson TX-series processors can be scaled out to create an adaptive HPC and supercomputing platform for bespoke deployments and edge computing environments. Advancements in composable infrastructure technology now make it possible to pool and orchestrate Jetson processors for deployments with specialized parallel computing requirements. Use cases include Jetson deployments in non-embedded environments for edge computing where traditional HPC architectures are not hospitable. Clusters of NVIDIA Jetson TX-series devices can be deployed in edge compute environments connected to arrays of sensors for neural net training, pattern recognition, and deep learning. Applications for autonomous transportation can also benefit from clustering massive numbers of Jetson TX-series devices to simulate fleets of vehicles and train machine learning algorithms in parallel. Jetson use cases can be expanded well beyond embedded applications when deployed with PCIe-based composable fabric infrastructure technology, permitting a 16x networking performance improvement over the embedded 1Gb Ethernet interface.  Back
 
Keywords:
Data Center and Cloud Infrastructure, Graphics and AI, GTC Silicon Valley 2018 - ID S8539
Streaming:
Download:
 
Empowering CUDA Developers with Virtual Desktops
Tony Foster (Dell EMC)
You've just been tasked with deploying the NVIDIA CUDA Toolkit to a group of developers. Wouldn't it be great if you could save time deploying it, protect the developers' work, reduce the amount of unique workstation hardware needed, and get more out of your hardware investment? This session will show how this can be done with VMware Horizon virtual desktops leveraging vGPUs and the CUDA Toolkit. The CUDA Toolkit is a core component of most developers' desktops and provides the underpinnings for many development operations that take advantage of GPU technology. It can be, and often is, difficult to install on virtual machines. We will walk through its deployment on Linux virtual machines, ensuring the requirements for both the CUDA Toolkit and VMware Horizon with vGPU are met.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, GTC Silicon Valley 2018 - ID S8483
Streaming:
Download:
 
Advantages of a Bare-Metal Cloud for CUDA Workloads (Presented by Oracle)
Karan Batta (Oracle)
With traditional performance-intensive workloads transitioning to the cloud and new workloads such as deep learning relying on cloud resources, it's imperative to optimize the environment to squeeze every ounce of performance possible from the NVIDIA GPUs. Learn how levers like bare-metal servers, a true flat network, and high-performance storage can really accelerate workloads utilizing NVIDIA GPUs in the cloud. See live demos and a walkthrough of how easy it is to launch your very own GPU cluster in Oracle Cloud Infrastructure. Additionally, learn about new announcements on Oracle Cloud Infrastructure in partnership with NVIDIA. This is a session not to be missed!  Back
 
Keywords:
Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8978
Streaming:
Download:
 
GPUs for Every Workload in Microsoft Azure (Presented by Microsoft Azure)
Tariq Sharif (Microsoft Azure)
Azure N-series VMs powered by NVIDIA GPUs enable a range of new accelerated scenarios. Learn how you can take advantage of GPUs using CUDA or OpenCL for scenarios like ray-traced rendering, machine learning, and artificial intelligence. Stream or remotely access engineering designs, digital media, and graphics-rich applications utilizing DirectX or OpenGL, along with workstation-in-the-cloud capabilities.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8965
Streaming:
 
Linux Virtual Desktops with NVIDIA Virtual GPUs for Chip-Design Applications
Shailesh Deshmukh (NVIDIA)
We'll focus on the key requirements and benefits of using Linux VDI for chip-design applications such as Cadence Allegro and PTC Creo. There is currently huge interest in GPU-accelerated VDI in this space, especially with Linux as the OS, since Exceed onDemand and similar remote-display applications are outdated and quite expensive to operate. We will discuss use cases with both VMware and Citrix and, with their permission, name a few customers who are current users. A live demo and Q&A will follow at the end.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, GTC Silicon Valley 2018 - ID S8175
Streaming:
Download:
 
The Path to GPU as a Service in Kubernetes
Viraj Chavan (NVIDIA), Renaud Gaubert (NVIDIA)
We'll cover modern Kubernetes production patterns for deep learning applications and take a deep dive into the Kubernetes GPU subsystem and its challenges (performance, scheduling, monitoring). Autonomous vehicles, face recognition, high performance computing, virtual reality: NVIDIA GPUs are enabling a new computing era with cloud computing at its center. With Kubernetes being the next iteration in cloud technologies, the NVIDIA container team, together with the Kubernetes community, is driving advances in GPU integration. During this talk we will review how to deploy a GPU-enabled Kubernetes cluster and the modern production patterns for deploying GPU-enabled services and applications. We will also dive into the details of the Kubernetes device plugin (its GPU subsystem), the NVIDIA container stack, and the limitations imposed by the Kubernetes infrastructure. We will finally discuss the latest improvements in the device plugin subsystem of Kubernetes, and the challenges ahead of it such as NUMA, sharing, and scheduling.  Back
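As a concrete illustration of the device-plugin pattern described above, here is a minimal sketch using the official kubernetes Python client; the pod name, image tag, and namespace are illustrative, and it assumes the NVIDIA device plugin is already running on the cluster so that nvidia.com/gpu is advertised as a schedulable resource.

    # Minimal sketch: request one GPU through the Kubernetes device plugin.
    # Assumes a reachable cluster and the NVIDIA device plugin deployed.
    from kubernetes import client, config

    config.load_kube_config()  # use load_incluster_config() inside a pod

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="cuda-smoke-test"),  # illustrative name
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[client.V1Container(
                name="cuda-smoke-test",
                image="nvidia/cuda:9.0-base",       # illustrative image tag
                command=["nvidia-smi"],
                # GPUs are extended resources and can only be set as limits.
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}),
            )],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)

The scheduler will then only place the pod on nodes whose device plugin reports a free nvidia.com/gpu resource.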
 
Keywords:
Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8893
Streaming:
Download:
 
Architecting a Complete Data Infrastructure for AI and Deep Learning (Presented by NetApp)
Kesari Mishra (NetApp), Santosh Rao (NetApp)
Enterprises are eager to take advantage of artificial intelligence technologies such as deep learning to introduce new services and enhance insights from company data. As data science teams move past proof of concept and begin to operationalize deep learning, it becomes necessary to focus on the creation of a complete data architecture that eliminates bottlenecks to facilitate faster model iteration. Designing a data architecture involves thinking holistically about the deep learning pipeline, from data ingest and edge analytics, to data prep and training in the core data center, to archiving in the cloud. It is necessary to understand performance requirements and the data services needed, but one should also consider future extensibility and supportability as deep learning hardware and cloud approaches evolve over time. This session will examine all the factors involved in the architecture of a deep learning pipeline, focusing on data management and the hybrid cloud. Careful infrastructure planning can smooth the flow of data through your deep learning pipeline, lead to faster time to deployment, and thus maximize competitive differentiation.  Back
 
Keywords:
Data Center and Cloud Infrastructure, Performance Optimization, GTC Silicon Valley 2018 - ID S8974
Streaming:
 
Microsoft AI and Research - Infrastructure Overview for Deep Learning and Other Research
Jim Jernigan (Microsoft Research)
Microsoft Research leverages a wide variety of open-source, free, and custom tools to manage a complex infrastructure for doing research. We are in a unique position at Microsoft and in the industry, where we serve academic experts who expect access to the latest open source tools, in an environment where Microsoft solutions should also be considered. See examples of how we manage popular, constrained assets and enforce fairness across many systems. Linux/Docker, Windows, on-site, Azure, or a hybrid of all of the above: we see it all. In this session, you will learn which tools can be easily leveraged to manage your own on-site and cloud GPU infrastructure. We touch on cluster management fabrics, scheduling, authentication, hot storage, configuration management, software portability/container management, and high-performance hardware selection.  Back
 
Keywords:
Data Center and Cloud Infrastructure, HPC and AI, GTC Silicon Valley 2018 - ID S8663
Streaming:
Download:
 
Commoditizing GPU-as-a-Service Providers with Red Hat OpenShift Container Platform
Andre Beausoleil (Red Hat), Jeremy Eder (Red Hat)
Red Hat OpenShift Container Platform, with Kubernetes at its core, can play an important role in building flexible hybrid cloud infrastructure. By abstracting infrastructure away from developers, workloads become portable across any cloud. With NVIDIA Volta GPUs now available in every public cloud, as well as from every computer maker, an abstraction layer like OpenShift becomes even more valuable. Through demonstrations, this session will introduce you to declarative models for consuming GPUs via OpenShift, as well as the two-level scheduling decisions that provide fast placement and stability.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8769
Streaming:
Download:
 
Bare-Metal Abstractions for Modern Hardware
Cyprien Noel (UC Berkeley)
Hardware is getting smarter every day. GPUs, hardware accelerated networks, and non-volatile memories are increasingly replacing capabilities offered today by operating systems and software libraries. They are becoming available on-premise and in clouds. Leveraging them in your application can yield orders of magnitude improvements in latency and throughput, and much smaller code bases. We present simple abstractions exposing hardware capabilities, and work-in-progress demos: data storage using hardware erasure codes present in recent network adapters, streaming data from storage to GPUs using RDMA, and executing a deep learning distributed compute graph entirely in hardware using GPUDirect Async. Our demos are attempts to replace large code bases with few lines of Python, using interchangeable and unified hardware abstractions, so data and control events can flow directly device-to-device.  Back
 
Keywords:
Data Center and Cloud Infrastructure, Deep Learning and AI Frameworks, HPC and AI, GTC Silicon Valley 2018 - ID S8154
Streaming:
Download:
 
Maximizing The Power of GPU For Diverse Workloads of Enterprise Digital Workspaces On VMware vSphere
Uday Kurkure (VMware), Hari Sivaraman (VMware)
Enterprise digital workspaces support diverse workloads, including virtual desktops, deep learning, and big data. NVIDIA GPUs bring high performance computing (HPC) to graphics, GPGPU, and especially machine learning workloads. They also provide hardware encode and decode to accelerate the processing of video content. In this session, we will explore performance and resource utilization of various workloads that leverage different capabilities of the GPU, such as graphics, compute, and H.264 hardware encode/decode. NVIDIA virtualized GPUs and VMware vSphere bring tremendous combined benefits for both GPU-based workloads and data center management via virtualization. We will present results of our research on running diverse workloads on the vSphere platform using NVIDIA GRID GPUs. We explore the vSphere features of suspend/resume and vMotion for vGPU-based virtual machines. We will quantify the benefits of vGPU for data center management using VMware vSphere and describe techniques for efficient management of workloads and datacenter resources.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GPU Virtualization, HPC and AI, GTC Silicon Valley 2018 - ID S8250
Streaming:
Download:
 
GPU Monitoring and Management with NVIDIA Data Center GPU Manager (DCGM)
David Beer (NVIDIA), Brent Stolle (NVIDIA)
NVIDIA DCGM is a monitoring and management daemon, GPU Diagnostic, and SDK geared towards managing GPUs in a cluster environment. DCGM is widely deployed both internally at NVIDIA and externally at large HPC labs and Cloud Service Providers. We will go over the core features of DCGM and features that have been added in the last year. We will also demonstrate how DCGM can be used to monitor GPU health and alert on GPU errors using both the dcgmi command-line tools and the DCGM SDK.  Back
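As a hedged sketch of the dcgmi command-line side mentioned above, the snippet below scripts two common subcommands from Python; the exact flags can differ between DCGM releases, so verify them with dcgmi --help on your installation.

    # Minimal sketch: drive the dcgmi CLI from Python and capture its output.
    import subprocess

    def dcgmi(*args):
        """Run a dcgmi subcommand and return its stdout as text."""
        result = subprocess.run(["dcgmi", *args],
                                capture_output=True, text=True, check=True)
        return result.stdout

    print(dcgmi("discovery", "-l"))  # list GPUs visible to the DCGM daemon
    print(dcgmi("diag", "-r", "1"))  # quick (level 1) diagnostic pass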
 
Keywords:
Data Center and Cloud Infrastructure, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8505
Streaming:
Download:
 
Compute Engineering Simulation Processing in Oracle Cloud Infrastructure (Presented by Oracle)
Taylor Newill (Oracle HPC)
Pre- and post-process CAE data near your cloud compute to save time, money, and IT headaches. Whether you're building the next supercar or visualizing a medical dataset, you can now eliminate the need for data transfer to and from on-premises systems by running professional design and engineering applications in the cloud. See new Oracle Cloud Infrastructure GPUs in live demonstrations of data transfer, CAD pre-processing, and CAE post-processing.  Back
 
Keywords:
Data Center and Cloud Infrastructure, GTC Silicon Valley 2018 - ID S8988
Streaming:
 
Considerations in Architecting an AI Ready Data Platform (Presented by DDN Storage)
James Coomer (DDN Storage)
Analytics and AI present a serious challenge to businesses in developing new expertise and transforming data architectures from enterprise-class to AI-ready. AI workloads demand a different approach to managing the data lifecycle. The new AI datacenter must be optimized for ingesting, storing, transforming, and optimizing data, feeding that data through hyper-intensive analytics workflows and, ultimately, extracting value. Failing fast during experimentation and scaling successful models quickly to production are vital. Learn how to architect and deploy data platforms with robust and balanced performance for all I/O patterns.  Back
 
Keywords:
Data Center and Cloud Infrastructure, Performance Optimization, GTC Silicon Valley 2018 - ID S8975
Streaming:
 
How to Use NGC Containers on AWS
Scott Ellis (NVIDIA), Jeffrey Weiss (NVIDIA)
We'll discuss how to use the NVIDIA GPU Cloud to easily run containerized Deep Learning applications. Come in only knowing the name NVIDIA GPU Cloud (NGC), and leave having successfully kicked off multiple Deep Learning containers. In this session we'll use the WebUI to log into NGC, run jobs based on those NVIDIA containers using Volta-powered AWS instances, and explore how to customize and integrate NGC containers into a Deep Learning workflow.  Back
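For readers who want the command-line equivalent of the workflow above, here is a hedged sketch of pulling and running an NGC container from Python on a GPU instance; it assumes nvidia-docker 2 is installed, and the image tag is illustrative (pick a current one from the NGC registry).

    # Minimal sketch: pull an NGC container and run nvidia-smi inside it.
    import subprocess

    IMAGE = "nvcr.io/nvidia/tensorflow:18.03-py3"  # assumed, illustrative tag

    subprocess.run(["docker", "pull", IMAGE], check=True)
    # --runtime=nvidia (nvidia-docker 2) exposes the host GPUs to the container.
    subprocess.run(["docker", "run", "--rm", "--runtime=nvidia",
                    IMAGE, "nvidia-smi"], check=True)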
 
Keywords:
Data Center and Cloud Infrastructure, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8276
Streaming:
Download:
Deep Learning and AI
Presentation
Media
Healthcare AI Startup Pitches
Jensen Huang (NVIDIA)
Watch leading Healthcare AI Startups compete for cash prizes at the NVIDIA GTC 2018 Inception Awards Finale.

  Back
 
Keywords:
Deep Learning and AI, GPU Virtualization, GTC Silicon Valley 2018 - ID SE0008A
Streaming:
 
Enterprise AI Startup Pitches
Jensen Huang (NVIDIA)
Watch leading Enterprise AI Startups compete for cash prizes at the NVIDIA GTC 2018 Inception Awards Finale.

  Back
 
Keywords:
Deep Learning and AI, GPU Virtualization, GTC Silicon Valley 2018 - ID SE0008B
Streaming:
 
Autonomous Systems AI Startup Pitches
Jensen Huang (NVIDIA)
Watch leading Autonomous Systems AI Startups compete for cash prizes at the NVIDIA GTC 2018 Inception Awards Finale.

  Back
 
Keywords:
Deep Learning and AI, GPU Virtualization, GTC Silicon Valley 2018 - ID SE0008C
Streaming:
Deep Learning and AI Frameworks
Presentation
Media
Multi-GPU Accelerated Methods in Deep Reinforcement Learning
Adam Stooke (UC Berkeley)
Deep reinforcement learning (RL) has achieved many recent successes, yet experiment turn-around time remains a key bottleneck in research and in practice. We investigate how to optimize existing deep RL algorithms for modern computers, specifically for a combination of CPUs and GPUs. We confirm that both policy gradient and Q-value learning algorithms can be adapted to learn using many parallel simulator instances. We further find it possible to train using batch sizes considerably larger than are standard, without negatively affecting sample complexity or final performance. We leverage these facts to build a unified framework for parallelization that dramatically hastens experiments in both classes of algorithm. All neural network computations use GPUs, accelerating both data collection and training. Our results include using an entire NVIDIA DGX-1 to learn successful strategies in Atari games in single-digit minutes, using both synchronous and asynchronous algorithms.  Back
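To make the many-simulator idea concrete, here is a minimal sketch (not the authors' framework) that steps a pool of classic gym environments in lockstep and batches their observations for a single GPU forward pass; the environment, the stand-in policy network, and the counts are all illustrative.

    # Minimal sketch: batched action selection over parallel simulators.
    import gym
    import numpy as np
    import torch

    envs = [gym.make("CartPole-v1") for _ in range(64)]  # parallel instances
    obs = np.stack([env.reset() for env in envs])
    policy = torch.nn.Linear(4, 2).cuda()                # stand-in policy net

    for _ in range(100):
        with torch.no_grad():
            logits = policy(torch.as_tensor(obs, dtype=torch.float32).cuda())
            actions = torch.distributions.Categorical(logits=logits).sample()
        results = [env.step(int(a)) for env, a in zip(envs, actions.cpu())]
        # reset finished episodes so all simulators keep producing samples
        obs = np.stack([env.reset() if done else o
                        for env, (o, r, done, info) in zip(envs, results)])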
 
Keywords:
Deep Learning and AI Frameworks, Tools and Libraries, Performance Optimization, GTC Silicon Valley 2018 - ID S8272
Streaming:
Download:
 
AI and Deep Learning in R
Jared Lander (Lander Analytics)
We'll discuss use cases for machine learning on GPUs and how to implement them easily in the R programming language by walking through the ideas behind several modern techniques, including penalized regression, boosted trees, and deep nets. Along with briefly introducing the concepts, we'll discuss some of the math behind the models and look at code examples for running the models on GPUs in R.  Back
 
Keywords:
Deep Learning and AI Frameworks, Programming Languages, GTC Silicon Valley 2018 - ID S8138
Streaming:
Download:
 
Deep Learning for Surface Reconstruction
Shafaatunnur Hasan (UTM Big Data Centre, Universiti Teknologi Malaysia), Siti Mariyam Shamsuddin (UTM Big Data Centre, Universiti Teknologi Malaysia)
We'll present a deep learning algorithm for reconstructing surfaces from massive point sets. The network consists of multiple layers that organize the neurons for optimal neighborhood representations. The implementation slices the standard self-organizing map (SOM) network in half to form multiple layers. The Z-axis distance is omitted when computing the neighborhood distance for updating the weighted neurons, to avoid surface-point discontinuity due to the layer depth. In this scenario, the distance determining the winning node is computed in 2D from four directions. As the number of layers increases, the computational complexity rises, and the processing power should increase as well. Thus, we use CUDA to update the weights and the distance of the winning node. Reduction techniques are implemented to obtain the smallest distance for the winning node. For the weight-updating process, each thread is given several nodes to calculate the distance between the winning node and the current node. Two parts are involved in designing and developing the algorithms: point reduction and point optimization for surface reconstruction.  Back
 
Keywords:
Deep Learning and AI Frameworks, Graphics and AI, GIS, GTC Silicon Valley 2018 - ID S8425
Streaming:
Download:
 
Node-Level Deep Learning Input Pipeline Optimization on GPGPU-Accelerated HPC Systems
Justin Fletcher (Maui High Performance Computing Center)
Learn how to implement and analyze a simple deep learning input pipeline pattern that prevents slowdowns from input queue exhaustion on accelerated HPC systems with limited impact to model performance. Queue exhaustion occurs because the throughput-driven dequeue rate is greater than the enqueue rate, which is bound by storage access bandwidth. In this session we will describe a technique that prevents queue exhaustion by artificially slowing the effective dequeue rate, without sacrificing substantial validation set performance. An example using TensorFlow is presented, and the resultant optimization step speedup and model performance are analyzed across several HPC resource configurations.  Back
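The core trick is easier to see in a toy form. The sketch below (a simplification, with invented rates) throttles the consumer so the storage-bound producer can keep the queue from draining; in practice the cap would sit just below the measured enqueue rate.

    # Toy sketch: cap the dequeue rate to prevent input-queue exhaustion.
    import queue, threading, time

    q = queue.Queue(maxsize=256)

    def producer():                       # stand-in for storage-bound reads
        while True:
            time.sleep(0.01)              # ~100 samples/s enqueue rate
            q.put("sample")

    threading.Thread(target=producer, daemon=True).start()

    MAX_DEQUEUE_RATE = 90.0               # deliberately below GPU-driven demand
    for step in range(1000):
        batch = q.get()                   # the training step would consume this
        time.sleep(1.0 / MAX_DEQUEUE_RATE)  # throttle the effective dequeue rate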
 
Keywords:
Deep Learning and AI Frameworks, HPC and Supercomputing, GTC Silicon Valley 2018 - ID S8674
Streaming:
Download:
 
Deep Learning Hyperparameter Optimization with Competing Objectives via Bayesian Optimization
Scott Clark (SigOpt)
Bayesian optimization is an efficient way to optimize machine learning model parameters, especially when evaluating different parameters is time-consuming or expensive. Deep learning pipelines like MXNet are notoriously expensive to train, even on GPUs, and often have many tunable parameters, including hyperparameters, the architecture, and feature transformations, that can have a large impact on the efficacy of the model. In traditional optimization, a single metric like accuracy is optimized over a potentially large set of configurations with the goal of producing a single, best configuration. We'll explore real-world extensions where multiple competing objectives need to be optimized, a portfolio of multiple solutions may be required, constraints on the underlying system make certain configurations not viable, and more. We'll present work from recent ICML and NIPS workshop papers and detailed examples, with code, for each extension.  Back
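The session is built around SigOpt's service, whose API is not reproduced here; as a stand-in, the loop below sketches the same idea with the open-source scikit-optimize package, with a stubbed objective and an illustrative search space.

    # Illustrative sketch: Bayesian optimization of two hyperparameters.
    from skopt import gp_minimize
    from skopt.space import Real

    def objective(params):
        lr, dropout = params
        # Train the model with these hyperparameters and return the metric
        # to minimize (e.g., validation loss); stubbed here for brevity.
        return (lr - 0.01) ** 2 + (dropout - 0.3) ** 2

    result = gp_minimize(
        objective,
        dimensions=[Real(1e-5, 1e-1, prior="log-uniform"),  # learning rate
                    Real(0.0, 0.9)],                        # dropout
        n_calls=30,               # evaluations are expensive, so keep few
    )
    print(result.x, result.fun)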
 
Keywords:
Deep Learning and AI Frameworks, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8136
Streaming:
Download:
 
Doing Bayesian Deep Learning with ZhuSuan
Jiaxin Shi (Tsinghua University)
We'll introduce the basic concepts of Bayesian deep learning in a hands-on tutorial that walks through several example applications using ZhuSuan (https://github.com/thu-ml/zhusuan). We'll start with simpler models like Bayesian logistic regression, and then proceed to deeper ones like Bayesian neural networks (BNN) and variational autoencoders (VAE). Learn how to use Bayesian methods to capture uncertainty in deep learning, including modeling the data distribution, calibrating the confidence of outputs, and smoothing predictions to prevent overfitting. Real problems (e.g., regression, image generation, semi-supervised classification) will be used to illustrate the models.  Back
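ZhuSuan's own API is not reproduced here; as a stand-in for the central idea of capturing predictive uncertainty, the sketch below uses Monte Carlo dropout (a different but related technique) in PyTorch, with an invented toy model.

    # Sketch: predictive mean and spread via Monte Carlo dropout.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                          nn.Dropout(p=0.5), nn.Linear(64, 1))

    def predict_with_uncertainty(x, n_samples=50):
        model.train()                 # keep dropout active at prediction time
        with torch.no_grad():
            samples = torch.stack([model(x) for _ in range(n_samples)])
        return samples.mean(0), samples.std(0)  # mean and uncertainty proxy

    mean, std = predict_with_uncertainty(torch.randn(8, 10))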
 
Keywords:
Deep Learning and AI Frameworks, Tools and Libraries, GTC Silicon Valley 2018 - ID S8593
Streaming:
Download:
 
State-of-the-Art Large Scale Language Modeling in 12 Hours With a Single GPU
Nitish Shirish Keskar (Salesforce Research), Stephen Merity (Salesforce Research)
For sequence learning tasks that utilize recurrent neural networks, scale is both the key to accuracy and the bane of speed. We'll take existing state-of-the-art language modeling techniques and speed them up by orders of magnitude without losing accuracy. The tactics include injecting flexibility into NVIDIA's black box cuDNN LSTM; replacing the LSTM with the more parallelized and customizable Quasi-Recurrent Neural Network; reducing the softmax bottleneck using the adaptive softmax; and investigating individual function efficiency on the GPU using the NVIDIA Visual Profiler. The end result is a general and scalable language model framework that can achieve state-of-the-art quality on the WikiText-103 dataset (103 million words) in under 12 hours using a single NVIDIA Volta V100. The resulting PyTorch codebase is open source for experimentation and extension.  Back
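One of the tactics named above, the adaptive softmax, is exposed directly in PyTorch; the hidden size, vocabulary size, and cutoffs below are illustrative rather than the paper's settings.

    # Sketch: adaptive softmax to relax the output-layer bottleneck.
    import torch
    import torch.nn as nn

    hidden, vocab = 512, 100_000
    adaptive = nn.AdaptiveLogSoftmaxWithLoss(
        in_features=hidden,
        n_classes=vocab,
        cutoffs=[2_000, 20_000],   # frequent words keep the full capacity
    )

    rnn_outputs = torch.randn(32, hidden)      # one hidden state per token
    targets = torch.randint(0, vocab, (32,))
    out = adaptive(rnn_outputs, targets)
    out.loss.backward()                        # mean NLL over the batch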
 
Keywords:
Deep Learning and AI Frameworks, Performance Optimization, GTC Silicon Valley 2018 - ID S8654
Streaming:
Download:
 
A Low-Latency Inference System for Recurrent Neural Networks
Jinyang Li (New York University)
We'll present cellular batching, which is a new way of performing batching on GPUs to accelerate model inference for recurrent neural networks (RNNs). Existing deep learning systems perform batching by collecting a fixed set of input samples and fusing their underlying dataflow graphs together for execution. This approach does not perform well for RNNs with input-dependent dataflow graphs. We propose cellular batching, which can significantly improve both the latency and throughput of RNN inference. Cellular batching performs batching at the granularity of an RNN "cell" -- a subgraph with shared weights -- and dynamically assembles a batched block for execution as requests join and leave the system. We show that this new way of batching can reduce the inference latency by 50 to 90 percent, while also increasing the throughput by 10 to 200 percent.  Back
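A toy rendition of the idea, with invented request bookkeeping, may help: batching happens at the granularity of one LSTM-cell step, and the batch is re-formed every step as requests join and finish.

    # Toy sketch of cell-granularity batching for RNN inference.
    import torch
    import torch.nn as nn

    cell = nn.LSTMCell(input_size=32, hidden_size=64)
    # Each request: {"tokens": (T, 32) tensor, "pos": 0,
    #                "h": torch.zeros(64), "c": torch.zeros(64)}
    active = []

    def step(new_requests):
        active.extend(new_requests)          # admit requests at a cell boundary
        x = torch.stack([r["tokens"][r["pos"]] for r in active])
        h = torch.stack([r["h"] for r in active])
        c = torch.stack([r["c"] for r in active])
        h, c = cell(x, (h, c))               # one batched cell execution
        for i, r in enumerate(active):
            r["h"], r["c"], r["pos"] = h[i], c[i], r["pos"] + 1
        done = [r for r in active if r["pos"] == len(r["tokens"])]
        active[:] = [r for r in active if r["pos"] < len(r["tokens"])]
        return done                          # finished requests leave the batch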
 
Keywords:
Deep Learning and AI Frameworks, Tools and Libraries, Performance Optimization, GTC Silicon Valley 2018 - ID S8608
Streaming:
Download:
 
ONNX: Interoperable Deep learning (Presented by Facebook)
Dmytro Dzhulgakov (Facebook)
We'll discuss how to transfer models seamlessly from one framework to another using the Open Neural Network Exchange (ONNX) from the project's lead developer. ONNX is an open specification to provide a common intermediate representation for deep learning models. This specification and set of tools, launched by Facebook, Microsoft, and Amazon, is now supported by a community of partners that includes hardware vendors, startups, and a growing number of deep learning frameworks. The ONNX ecosystem also includes support from hardware-optimized libraries such as NVIDIA's TensorRT. ONNX is the crucial first step toward an open ecosystem that empowers AI developers to choose the most effective tools for each project and accelerate AI research to production scale.   Back
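A minimal sketch of the round trip looks like the following; the torchvision model is just a convenient stand-in for any trained network.

    # Sketch: export a PyTorch model to ONNX and validate the result.
    import torch
    import torchvision
    import onnx

    model = torchvision.models.resnet18(pretrained=True).eval()
    dummy = torch.randn(1, 3, 224, 224)     # example input fixes tensor shapes
    torch.onnx.export(model, dummy, "resnet18.onnx")

    graph = onnx.load("resnet18.onnx")
    onnx.checker.check_model(graph)         # the model is now framework-neutral

From here the same file can be consumed by ONNX-aware backends such as TensorRT.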
 
Keywords:
Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8818
Streaming:
Download:
 
Research To Production: How Facebook does AI at Scale (Presented by Facebook)
Sarah Bird (Facebook), Howard Mansell (Facebook AI Research)
Facebook's strength in AI innovation comes from the ability to quickly bring cutting-edge research into large scale production using a multi-faceted toolset. We'll discuss how Facebook leverages open source software to perform truly iterative AI research, scale it seamlessly for inference, and deploy it across the data center and mobile environments with ONNX. 

  Back
 
Keywords:
Deep Learning and AI Frameworks, Deep Learning and AI, GTC Silicon Valley 2018 - ID S8795
Streaming:
 
Multi-GPU Training with NCCL
Sylvain Jeaugey (NVIDIA)
We'll cover recent features and performance improvements in the NVIDIA Collective Communications Library (NCCL). NCCL is designed to make computing on multiple GPUs easy and is integrated into most deep learning frameworks to accelerate training times. NCCL supports communication over shared memory, PCIe, NVLink, sockets, and InfiniBand Verbs, to support both multi-GPU machines and multi-node clusters.
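NCCL itself is a C library; a common way to exercise it from Python is through a framework integration such as torch.distributed. The sketch below is a minimal all-reduce under that assumption, launched with one process per GPU (for example via a distributed launcher that sets the rendezvous environment variables).

    # Sketch: NCCL-backed all-reduce via torch.distributed.
    import torch
    import torch.distributed as dist

    dist.init_process_group(backend="nccl")   # NCCL collectives underneath
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    t = torch.ones(4, device="cuda") * rank
    dist.all_reduce(t, op=dist.ReduceOp.SUM)  # e.g., summing gradients across GPUs
    print(f"rank {rank}: {t}")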

  Back
 
Keywords:
Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8462
Streaming:
Download:
 
Extracting Data from Tables and Charts in Natural Document Formats
Philipp Meerkamp (Bloomberg), David Rosenberg (Bloomberg)
Financial analysis depends on accurate financial data, and these data are often distributed via PDF and other "natural document" formats. While these formats are optimized for easy human comprehension, automatically extracting the data can be quite challenging. We'll describe our work using a deep learning pipeline to extract data from tables and charts in PDF documents. We'll also show some of our latest research, inspired by image captioning models, for directly going from images of tables to a markup language (LaTeX) representation.  Back
 
Keywords:
Deep Learning and AI Frameworks, Finance, GTC Silicon Valley 2018 - ID S8651
Streaming:
 
The Journey from a Small Development Lab Environment to a Production Datacenter for Deep Learning Applications
Ryan Olson (NVIDIA), Markus Weber (NVIDIA)
We'll take a deep dive into best practices and real-world examples of leveraging the power and flexibility of local GPU workstations, such as the DGX Station, to rapidly develop and prototype deep learning applications. We'll demonstrate the setup of our small lab, which is capable of supporting a team of several developers and researchers, and our journey as we moved from lab to data center. Specifically, we'll walk through our experience of building the TensorRT Inference Demo, aka Flowers, used by Jensen to demonstrate the value of GPU computing at GTCs worldwide. As an added bonus, get first-hand insights into the latest advancements coming to AI workstations this year. The flexibility for fast prototyping provided by our lab was an invaluable asset as we experimented with different software and hardware components. As the models and applications stabilized and we moved from lab to data center, we were able to run fully load-balanced over 64 V100s serving video inference, demonstrating Software-in-the-Loop (SIL) ReSim capabilities for Autonomous Vehicles at GTC EU. Real, live examples will be given.  Back
 
Keywords:
Deep Learning and AI Frameworks, HPC and AI, GTC Silicon Valley 2018 - ID S8263
Streaming:
 
Experiences of End2end Deep Learning Optimization on Alibaba PAI Deep Learning Platform
Minmin Sun (Alibaba), Jun Yang (Alibaba)
We'll share experiences of end-to-end deep learning optimization on Alibaba's Platform of Artificial Intelligence (PAI), covering both offline training and online inference. For offline training, dedicated optimization is made for local and distributed environments. For online inference, optimization is done from both algorithm and system perspectives. Both the methodology and benchmark numbers are shared during this session. We'll also describe several business applications driven by these optimizations, bridging the gap between low-level optimization and real business scenarios.  Back
 
Keywords:
Deep Learning and AI Frameworks, Performance Optimization, GTC Silicon Valley 2018 - ID S8113
Streaming:
 
How To Train and Execute a Deep Learning Model Able to Re-identify and Extract Attributes from Humans
Matthieu Ospici (Atos)
We'll present a deep learning system able to decide whether two people are similar or not. This system uses the global appearance of a person, not just the face, to perform the re-identification. Our system also provides attributes (top color, bottom color, gender, length of the clothes, and the hair). We'll describe how to train the system with TensorFlow on a GPU cluster and how to use it in a global video analysis system running on GPU devices.

  Back
 
Keywords:
Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8355
Streaming:
Download:
 
Performance Optimization for Deep-Learning on the latest OpenPOWER systems
Khoa Huynh (IBM), Jonathan Samn (IBM), Brian Wan (IBM)
We'll discuss how cognitive workloads could leverage the latest OpenPOWER systems with NVIDIA Volta V100 GPUs and fast NVLink 2.0 CPU-GPU interconnects. IBM has formed a close partnership with NVIDIA to offer GPU-enabled OpenPOWER systems and PowerAI software to our customers and developers. We'll focus on the latest OpenPOWER systems and how large-scale deep-learning neural network training could leverage the unique capabilities of these systems with PowerAI Release 4. Also discussed is the new IBM distributed deep learning (DDL) technology that allows neural network model training to scale almost linearly across hundreds of NVIDIA GPUs.  Back
 
Keywords:
Deep Learning and AI Frameworks, Performance Optimization, GTC Silicon Valley 2018 - ID S8765
Streaming:
Download:
 
Flavors: Library of AI Powered Trie Structures for Fast Parallel Lookup
Krzysztof Kaczmarski (Warsaw University of Technology), Albert Wolant (Warsaw University of Technology)
Learn how to use deep learning to build highly optimized data structures matching your needs exactly. During this session, you will find out what can be accomplished if you combine massively parallel data structures and modern AI techniques to achieve the best performance for data lookup. We will present results on real-life data, from both academia and industry, that show just how flexible the presented method is. We will also share insights into the optimization process gained during the project.  Back
 
Keywords:
Deep Learning and AI Frameworks, Tools and Libraries, Performance Optimization, GTC Silicon Valley 2018 - ID S8401
Streaming:
Download:
 
Beyond What is Being Said: Toward Understanding Human-to-Human Conversations
Samuel Kim (Gridspace)
Gridspace is researching human conversations using various GPU-accelerated deep learning algorithms. Our GPU-based software stack provides a novel way to process large-scale speech data; it is capable of understanding both what is being said and how it is being said. We'll introduce several commercial features, including speech recognition, emotion recognition, sentiment analysis, and call grading. Compared to conventional ways of dealing with large-scale speech data, the proposed method provides a platform that analyzes whole datasets rather than a fraction of sampled data.  Back
 
Keywords:
Deep Learning and AI Frameworks, Speech and Language Processing, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8615
Streaming:
 
GPU Coder: Automatic CUDA and TensorRT Code Generation from MATLAB
Jaya Shankar (MathWorks), Girish Venkataramani (MathWorks)
Learn how GPU Coder produces high-performance CUDA code automatically from a high-level algorithm description in MATLAB. Write your deep learning application with the expressive power of MATLAB, which allows you to describe not just the use of your trained deep learning model in inference mode but also to perform data augmentation and post-processing of the results to create a complete deployment-ready application. GPU Coder can then generate optimized inference code for the whole application. The deep learning inference model is compiled down to TensorRT, while the rest of the application logic is parallelized through creation of CUDA kernels and integration with other CUDA-optimized libraries like cuBLAS, cuFFT, etc. The generated code can be cross-compiled to any NVIDIA GPU device that supports TensorRT. This allows engineers and scientists to unlock the expressive ease of use of the MATLAB programming language while unleashing deep learning performance by leveraging TensorRT.  Back
 
Keywords:
Deep Learning and AI Frameworks, Tools and Libraries, GTC Silicon Valley 2018 - ID S8480
Streaming:
Download:
 
GPU Accelerated Machine Learning for Bond Price Prediction
Venkat Bala (RN Financial Corporation), Rafael Nicolas Fermin Cota (RN Financial Corporation)
We'll discuss our application of deep learning and classical machine learning (ML) to the prediction of bond prices, including the performance gains obtained from using GPUs over conventional high-performance CPUs for the model training process.  Back
 
Keywords:
Deep Learning and AI Frameworks, NVIDIA Inception Program, Finance, GTC Silicon Valley 2018 - ID S8655
Streaming:
Download:
 
Predictive Learning of Factor Based Strategies Using Deep Neural Networks for Investment and Risk Management
Yigal Jhirad (Cohen & Steers), Blay Tarnoff (Cohen & Steers)
We develop and implement an approach using deep neural networks to process financial and macroeconomic signals to help identify key inflection points in equity market-based factor performance such as momentum and volatility. The model may be used to calibrate factor rotation strategies and better assess portfolio risks associated with factor-based exposures. The machine learning algorithm relies on the GPU for high-performance computations to drive an optimization framework within a deep neural network.  Back
 
Keywords:
Deep Learning and AI Frameworks, Finance, GTC Silicon Valley 2018 - ID S8520
Streaming:
Download:
 
Myia: A Differentiable Language for Deep Learning
Olivier Breuleux (MILA)
Myia is a new, experimental deep learning framework that aims to provide deep learning researchers with both the expressive power and the performance that they need. Symbolic frameworks such as TensorFlow only cover a curated subset of programming language features and do not support second-order gradients very well. Dynamic frameworks such as PyTorch, while very powerful, use an operator-overloading approach for automatic differentiation, which does not lend itself well to optimization. With Myia, we attempt to have the best of both worlds: we implement a general and composable approach to automatic differentiation over a functional abstraction of a subset of the Python programming language. That subset includes if, while, for, and recursion, providing plenty of expressive power, and yet it can also be analyzed statically to provide the best possible performance. We'll present the Myia language from a high-level technical perspective, including a short primer on functional programming and automatic differentiation. It is of special interest to deep learning framework or library implementers.  Back
 
Keywords:
Deep Learning and AI Frameworks, Programming Languages, GTC Silicon Valley 2018 - ID S8441
Streaming:
Download:
 
PyTorch: A Fast and Flexible Deep Learning Framework (Presented by Facebook)
Soumith Chintala (Facebook)
We'll discuss how to get started with PyTorch from the creator of the project, Soumith Chintala. PyTorch is a fast and flexible deep learning framework that has been called a 'breath of fresh air' by researchers and developers alike for its ease of use, flexibility, and similarity to Python programming. It consists of an ndarray library that natively supports GPU execution, an automatic differentiation engine that is flexible and fast, and an optimization package for gradient-based optimization methods.
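The ingredients listed above fit in a few lines; this is a minimal sketch with a toy objective, not a tutorial excerpt from the session.

    # Sketch: GPU tensors + autograd + the optim package in PyTorch.
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    w = torch.randn(3, requires_grad=True, device=device)
    opt = torch.optim.SGD([w], lr=0.1)

    for _ in range(100):
        x = torch.randn(16, 3, device=device)
        loss = ((x @ w - 1.0) ** 2).mean()   # toy least-squares objective
        opt.zero_grad()
        loss.backward()                      # automatic differentiation
        opt.step()                           # gradient-based update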

  Back
 
Keywords:
Deep Learning and AI Frameworks, Deep Learning and AI, GTC Silicon Valley 2018 - ID S8817
Streaming:
Download:
 
How AI Can Help You Find the Tattoo of Your Dreams
Dennis Micky Jensen (Tattoodo)
Tattoodo, a Copenhagen-based startup, is the No. 1 destination for tattoo lovers around the world, covering all aspects of the global, ever-growing tattoo culture. Our platform for the cultivation and appreciation of tattoo art currently has almost 1 million users, among them 50,000 tattoo artists. The platform has over 400,000 uploaded pictures and renders around 1.5 billion monthly views across all platforms. We spent a lot of time and effort on classifying the tattoo pictures that are uploaded. A community member is able to provide a textual description and tag the tattoo with arbitrary hashtags, which obviously is a lot of responsibility to put in the hands of one member. To build a more useful tattoo platform, we decided to train a convolutional neural network to recognize and classify different tattoo styles. To train it, we used 100,000 images that are already associated with tattoo styles. We are using Caffe, a deep learning framework developed by Berkeley AI Research, which we found suitable for our needs. Training was done on the NVIDIA DIGITS deep learning GPU training system backed by a g2.2xlarge AWS instance.  Back
 
Keywords:
Deep Learning and AI Frameworks, Graphics and AI, GTC Silicon Valley 2018 - ID S8329
Streaming:
Download:
 
Crash - Practical Applications of Deep Learning in the Insurance Claims Industry
Nigel Cannings (Intelligent Voice)
Deep learning, assisted with GPU acceleration, is pervading many sectors, and the insurance space is no exception. We'll illustrate how deep learning applications in image and speech recognition are forming the backbone of innovative applications in the insurance industry. Real-world examples of image and speech deep learning technology are presented, demonstrating how ground-breaking applications have been engineered in the industry to automate decision support, assist humans, improve customer experiences, and reduce costs.  Back
 
Keywords:
Deep Learning and AI Frameworks, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8720
Streaming:
Download:
 
Aural2: Doing What the User Wants, Before they Finish Speaking (Presented by IBM)
Glen Darling (IBM), Isaac Leonard (IBM)
Many speech-based human-machine interaction systems are based on transcribing incoming audio and then applying natural language parsing to the resulting list of words. Such word-level speech understanding systems have difficulty solving the use-mention distinction: the difference between mentioning a word and using it, a task which humans usually have no difficulty performing. We describe Aural2, an LSTM model and the training infrastructure capable of training it to directly transform an audio stream into the user's intents, slightly before they have finished speaking. This model is small and cheap to run, making it suitable for use on resource-constrained edge and IoT devices, and it demonstrates the benefits that GPUs bring to edge devices.   Back
 
Keywords:
Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S81037
Streaming:
Download:
 
Sony's Deep Learning Software. Neural Network Libraries/Console
Yoshiyuki Kobayashi (Sony Corporation)
Neural Network Libraries is Sony's latest deep learning framework, combining features such as high-speed training and inference with CUDA, ease of use, high portability, and high scalability. Neural Network Console is an integrated development environment for deep learning that enables full-scale research and development through a GUI. This software can be used in a wide range of settings, such as improving the productivity of deep learning research and development and making human resource development more efficient. In this session, we will introduce its features and functions by following an actual workflow.  Back
 
Keywords:
Deep Learning and AI Frameworks, Tools and Libraries, GTC Silicon Valley 2018 - ID S8912
Streaming:
Download:
 
HPE Deep Learning Cookbook: Recipes to Run Deep Learning Workloads
Sergey Serebryakov (Hewlett Packard Labs), Natalia Vassilieva (Hewlett Packard Enterprise)
HPE Deep Learning Cookbook is a set of open source tools to guide the choice of the best hardware/software environment for a given deep learning task, based on extensive benchmarks of reference deep learning workloads and performance modelling. Deep learning is a key enabling technology behind the recent revival of artificial intelligence. It is already embedded in different products we use on a daily basis and has the potential to disrupt many industries. There is a vibrant and fast-growing ecosystem of software and hardware for deep learning, and various deep learning frameworks are available for anyone who wants to try out this technology. With the variety of choices in hardware configurations and software packages, it is hard to pick the optimal tools: the effectiveness of a hardware/software environment varies depending on the deep learning workload.  Back
 
Keywords:
Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8555
Streaming:
Download:
 
Fast Data Pipelines for Deep Learning Training
Trevor Gale (Northeastern University), Simon Layton (NVIDIA), Przemyslaw Tredak (NVIDIA)
With every generation of GPU, it becomes increasingly difficult to keep the data pipeline full so that the GPU can be fully utilized. We'll propose a method for offloading the CPU and using the GPU to process image data to increase throughput.  Back
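A hand-rolled sketch of the idea (not the talk's actual implementation): overlap host-side loading with training, and push the per-image preprocessing onto the GPU. The shapes and the fake "dataset" below are illustrative.

    # Sketch: prefetch batches on a thread, preprocess them on the GPU.
    import queue, threading
    import torch

    def loader(batches, q):
        for batch in batches:                 # raw uint8 images from storage
            q.put(batch.pin_memory())         # pinned memory speeds H2D copies

    def gpu_preprocess(batch):
        x = batch.cuda(non_blocking=True).float() / 255.0  # normalize on GPU
        return torch.flip(x, dims=[-1])       # e.g., horizontal flip on GPU

    batches = [torch.randint(0, 256, (64, 3, 224, 224), dtype=torch.uint8)
               for _ in range(100)]           # stand-in for decoded images
    q = queue.Queue(maxsize=4)
    threading.Thread(target=loader, args=(batches, q), daemon=True).start()

    for _ in range(len(batches)):
        x = gpu_preprocess(q.get())           # the training step would follow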
 
Keywords:
Deep Learning and AI Frameworks, Performance Optimization, GTC Silicon Valley 2018 - ID S8906
Streaming:
Download:
 
AI Embodiment at the Edge: Leveraging Deep Learning with the Intu OSS Project (Presented by IBM)
Glen Darling (IBM), Christopher Dye (IBM)
Augmented intelligence in human-machine experiences relies on creating presence with users. We call this cognitive embodiment. To achieve this, the AI system must gather intelligence about its spatial and human environment and possess a semantically rich context of the environment around it. Deep learning workloads are a valuable element in the composition of cognitive embodiment. Using Intu's 'Self' open source AI middleware project on NVIDIA Jetson TX2 GPU-enabled hardware, we present a pattern for AI at the edge, leveraging insights gathered from local voice recognition, image classification, and video deep learning workloads. A full-bodied AI also reaches out to cloud services for powerful capabilities, including natural language processing and speech tonal analysis, extending embodiment beyond a single device. Self brings all these capabilities to bear for user interaction, face emotion detection, voice command, and speech interaction. Have a conversation with an embodied 'Self'.   Back
 
Keywords:
Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S81035
Streaming:
Download:
 
CuLE : A Companion Library for Accelerated RL Training
Iuri Frosio (NVIDIA)
Traditional RL training is dominated by experience collection processes executing on the CPU. However, this CPU-oriented design pattern limits the utility of DL accelerators such as GPUs. In this talk we present CuLE (CUDA Learning Environment), an experimental deep RL companion library, to facilitate the generation of RL updates directly on the GPU. CuLE provides an implementation of ALE (Atari Learning Environment), a challenging RL benchmark for discrete episodic tasks, executing directly on the GPU with the number of environments ranging from a few hundred to several thousand. Although traditional deep RL implementations use 12-16 agents coupled with replay memory to achieve training efficiency, CuLE can generate a massive number of samples per step and supports new training scenarios that minimize expensive data movement operations. With 1,024 agents, CuLE achieves an 8-10x performance improvement by executing directly on the GPU compared to 1,024 agents running in parallel on a 12-core CPU. We plan to extend CuLE to support a new set of GPU-centric deep RL training schemes and new challenging training environments through integration with GFN.  Back
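CuLE's own API is not reproduced here; the sketch below only illustrates the underlying idea with an invented toy transition function: keep thousands of environment states resident on the GPU and step them all with vectorized tensor ops, so no per-step host/device copies occur.

    # Toy sketch: thousands of GPU-resident environments stepped in one call.
    import torch

    n_envs = 4096
    states = torch.zeros(n_envs, 4, device="cuda")   # toy environment state

    def step(actions):
        global states
        states = states + actions.unsqueeze(1) * 0.01   # fake dynamics
        rewards = -states.abs().sum(dim=1)
        done = rewards < -10.0
        states[done] = 0.0                              # vectorized resets
        return states, rewards, done

    actions = torch.randint(0, 2, (n_envs,), device="cuda").float()
    obs, rew, done = step(actions)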
 
Keywords:
Deep Learning and AI Frameworks, Tools and Libraries, GTC Silicon Valley 2018 - ID S8440
Streaming:
Download:
Finance
Presentation
Media
Deep Thinking: The Challenges of Deep Learning and GPU Acceleration of Financial Data
Erind Brahimi (Wells Fargo)
Accelerated analytics offer massive upside over conventional computing in the financial industry. Deep learning and AI-accelerated analytics have many applications in finance, such as fraud detection, risk management, and loss forecasting. GPUs are leveraged to provide high-performance computing on a scalable platform for quantitative analysis of big data, providing agile methods for ingesting data, performing automated data mining, and implementing robust deep learning architectures. By applying deep learning methods to complex financial data, we can exploit non-linear relationships in financial data that lead to critical risk events.  Back
 
Keywords:
Finance, GTC Silicon Valley 2018 - ID S8754
Streaming:
Download:
 
Geometric Deep Learning For Long-Term Value Investing
Jonathan Masci (NNAISENSE)
We'll introduce recent work done by NNAISENSE that harnesses deep learning to automatically build custom portfolios for long-term investing from company fundamentals. NVIDIA GPUs are the cornerstone of our deep learning architecture development, enabling the testing of financial models in a walk-forward fashion, where retraining the entire system can be done monthly. The main focus of the talk is the portfolio construction algorithm, which is purely data driven and optimized over criteria such as the Sharpe and Information ratios. The central challenge we face is in the design of deep learning systems that can work on sets of observations that are represented in unstructured or structured form (e.g., graphs). We'll introduce the concepts behind geometric deep learning and show how techniques from this emerging field can help our portfolio construction stage.  Back
 
Keywords:
Finance, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8767
Streaming:
 
Mergers & Acquisitions using Deep Learning
Jonathan Bailey (EDM Consultancy), Chris Ryan (EDM Consultancy)
We'll present a case study of how a bank used machine learning to perform due diligence during company acquisitions, covering the techniques, strategy, and decision-making mechanisms that ensured potential risks were illuminated and mitigated. Technical details of the machine learning will be briefly discussed. We'll also discuss how to employ cutting-edge compute to slash costs and raise your ROI, using the NVIDIA DGX-1 to run deep learning in real time on millions of documents.  Back
 
Keywords:
Finance, NVIDIA Inception Program, GTC Silicon Valley 2018 - ID S8763
Streaming:
Download:
 
The New Era of Investments
Hyung Sik (Qraft Technologies)
We'll discuss what Qraft Technologies plans to deliver: 1) the remarkable performance Qraft's AI engines have achieved in the financial industry; and 2) the concept of technology used in the AI engines to generate strategic investment portfolios. Qraft provides materials that include actual examples of a robo-fund, where AI is used to create a mutual fund; a robo-advisor, where AI recommends an optimal portfolio consisting of mutual funds that fully reflects an investor's propensity; and other important achievements that Qraft has obtained in the financial industry. Qraft is constructing an ecosystem of AI in investment that includes well-known institutions and researchers from around the world.   Back
 
Keywords:
Finance, NVIDIA Inception Program, Finance, GTC Silicon Valley 2018 - ID S8836
Streaming:
Download:
 