Hundreds of talks and competing events crammed into a few days can be daunting. Get an overview of GTC's programs and events and how to make best use of them from Greg Estes, NVIDIA's VP of developer programs. Addressing both first-timers and returning alums, Greg will cover how to get the most from your time here, including can't-miss talks and never-before-seen tech demos. He'll also cover NVIDIA's resources for developers, startups, and larger organizations, as well as training courses and networking opportunities.
A fireside chat with U.S. Rep. Jerry McNerney (D-Calif.), co-chair of the congressional AI caucus, and Ned Finkle, VP of Govt. Affairs, NVIDIA. Artificial Intelligence has become a front-and-center issue for policymakers. Legislative proposals to encourage AI development and head off possible harms are gaining traction, and the Administration is working to build a national strategy. This fireside chat will give enterprises and researchers a first-hand look at how key Members of Congress are approaching AI, as well as what policies they're advocating for and expect.
This customer panel brings together AI implementers who have deployed deep learning at scale. The discussion will focus on specific technical challenges they faced, solution design considerations, and best practices learned from implementing their respective solutions.
Artificial Intelligence has the potential to profoundly affect our world and lives. In this era of constant change, how do organizations keep up? We'll discuss the forces that drive technology forward and the technology trends, including AI, that can help organizations remain relevant in a world of constant transformation.
I will introduce a game developed at Johns Hopkins University/Applied Physics Laboratory called Reconnaissance Blind Chess (RBC), a chess variant where the players do not see their opponent's moves, but they can gain information about the ground-truth board position through the use of an (imperfect) sensor. RBC incorporates key aspects of active sensing and planning: players have to decide where to sense, use the information gained through sensing to update their board estimates, and use that world model to decide where to move. Thus, just as chess and Go have been challenge problems for decision making with complete information, RBC is intended to be a common challenge problem for decision making under uncertainty. After motivating the game concept and its relationship to other chess variants, I will describe the current rules of RBC as well as other potential rulesets, give a short introduction to the game implementation and bot API, and discuss some of our initial research on the complexity of RBC as well as bot algorithm
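As a rough illustration of the sense-then-update step described above, the following Python sketch filters a set of candidate boards against one sensing result. It is not the RBC bot API: the function names, the board representation, and the 3x3 window size are all assumptions made for this example, and the sensor is treated as exact within its window.

```python
# Toy belief update for a hidden-board game.
# All names here are hypothetical and are not part of the RBC bot API.

def sense_window(center, size=3):
    """Return the on-board squares (row, col) in a size x size window."""
    r0, c0 = center
    half = size // 2
    return [(r, c)
            for r in range(r0 - half, r0 + half + 1)
            for c in range(c0 - half, c0 + half + 1)
            if 0 <= r < 8 and 0 <= c < 8]

def update_belief(hypotheses, center, observation):
    """Keep only the candidate boards that agree with what the sensor saw.

    hypotheses:  list of dicts mapping (row, col) -> piece symbol
    observation: dict mapping each sensed square -> piece symbol or None
    """
    squares = sense_window(center)
    return [board for board in hypotheses
            if all(board.get(sq) == observation.get(sq) for sq in squares)]

# Two candidate positions for an unseen opposing knight.
candidates = [{(5, 2): "n"}, {(5, 5): "n"}]

# Sensing around (5, 2) reveals a knight there and nothing else nearby.
seen = {sq: None for sq in sense_window((5, 2))}
seen[(5, 2)] = "n"

remaining = update_belief(candidates, (5, 2), seen)
```

Only the first candidate survives the update. A real bot would also have to expand each hypothesis by the opponent's possible moves before sensing again.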
We'll discuss an implementation of GPU convolution that favors coalesced accesses without requiring prior data transformations. Convolutions are the core operation of deep learning applications based on convolutional neural networks. Current GPU architectures are typically used for training deep CNNs, but some state-of-the-art implementations are inefficient for some commonly used network configurations. We'll discuss experiments that used our new implementation, which yielded notable performance improvements including up to 2.29X speedups in a wide range of common CNN configurations.
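For readers who want a reference point for the operation being optimized, here is a minimal Python sketch of a direct 2D convolution in the form CNN layers use (no kernel flip, no padding, stride 1). It is a plain CPU reference and not the GPU implementation from the talk; the function name is an assumption for this example.

```python
def conv2d_valid(image, kernel):
    """Direct 2D convolution as used in CNN layers.

    No kernel flip, no padding, stride 1. Inputs are lists of rows.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            # The innermost loop walks neighbouring elements of one image row.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = conv2d_valid(image, kernel)
```

In a GPU version, adjacent threads would typically be assigned adjacent output columns so that their reads fall on neighbouring addresses. How the talk's implementation arranges this is not stated here.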
Exploring the Best Server for AI
Speaker: Samuel D. Matzek, Sr. Software Engineer
Speaker: Maria Ward, IBM Accelerated Server Offering Manager
Explore the server at the heart of the Summit and Sierra supercomputers, and the best server for AI. We will discuss the technical details that set this server apart and why it matters for your machine learning and deep learning workloads.
IBM Cloud for AI at Scale
Speaker: Alex Hudak, IBM Cloud Offering Manager
AI is fast changing the modern enterprise with new applications that are resource demanding, but provide new capabilities to drive insight from customer data. IBM Cloud is partnering with NVIDIA to provide a world class and customized cloud environment to meet the needs of these new applications. Learn about the wide range of NVIDIA GPU solutions inside the IBM Cloud virtual and bare metal server portfolio, and how customers are using them across Deep Learning, Analytics, HPC workloads, and more.
IBM Spectrum LSF Family Overview & GPU Support
Speaker: Larry Adams, Global Architect - Cross Sector, Developer, Consultant, IBM Systems
How to Fuel the Data Pipeline
Speaker: Kent Koeninger, IBM
IBM Storage Reference Architecture for AI with Autonomous Driving
Speaker: Kent Koeninger, IBM
Take a journey through the TensorFlow container provided by the NVIDIA GPU Cloud. We'll start with how to launch and navigate inside the container, and stop along the way to explore the included demo scripts, extend the container with extra software, and examine best practices for how to take advantage of all the benefits bundled inside the NGC TensorFlow container. This session will help NGC beginners get the most out of the TensorFlow container and become productive as quickly as possible.
We'll describe our work at Intelligent Voice on explainable AI. We are working to separate AI technology into smaller components so it can be more easily explained, build explainability into AI architecture design, and make it possible for AI to progress within the confines of current regulation. New GDPR regulations in Europe, which affect any company with European consumers, give people a right to challenge computer-aided decisions and to have these decisions explained. We'll discuss how existing technology can make it difficult to provide an explanation and how that inhibits AI adoption in customer-facing fields such as insurance, health, and financial services.
We'll discuss Project MagLev, NVIDIA's internal end-to-end AI platform for developing its self-driving car software, DRIVE. We'll explore the platform that supports continuous data ingest from multiple cars producing TB of data per hour. We'll also cover how the platform enables autonomous AI designers to iterate training of new neural network designs across thousands of GPU systems and validate the behavior of these designs over multi PB-scale data sets. We will talk about our overall architecture for everything from data center deployment to AI pipeline automation, as well as large-scale AI dataset management, AI training, and testing.
BMW’s logistics and industry 4.0 research team, X Works, will present how GPU computing power is being leveraged in an end-to-end pipeline for object labelling and detection, developed in house and deployed to a wide range of applications throughout BMW Group. The talk will include a description of how GPU computing is being used to support the creation of photorealistic meshes, and how our 3D pipeline helps BMW associates efficiently create large datasets to train 2D/3D detection models for industrial use cases in robotics, autonomous transport, interactive layout planning, virtual reality visualization, and smart three-dimensional maps.
Intelligent buildings help create and maintain a safer, more secure, productive, and comfortable environment. A modern-day commercial building may have hundreds of air conditioners, thermostats, and other devices connected to thousands of sensors for measuring critical parameters like temperature, airflow, and pressure. As a result, configuring or commissioning new buildings can take weeks or months. During equipment installation, installers typically embed contextual information around the sensors in the sensor name. We'll discuss the neural net-based models we developed to recognize and extract patterns in the sensor-naming convention and its time-series data to predict tags necessary for fault detection and diagnostics. We'll describe our findings showing that our context-discovery process accurately recommends important information to system integrators with 80-90 percent confidence.
12:45-1:00 NVIDIA Business Development Ecosystem Update
1:00-1:45 How the Latest Generation of GPUs will Redefine Visual Computing
1:45-2:30 The Role of GPUs in the Future of Data Science
2:30-3:30 Fireside Chat with Jensen Huang, NVIDIA Founder, President and CEO
3:30-4:30 Inception Showcase
NVIDIA Inception is the leading AI startup accelerator with over 3,600 members around the world. Inception members are tackling hard problems ranging from medical imaging to robotics to seamless retail and have raised over $14B collectively. Several Inception members have already been acquired by the world's leading corporations and many are continuing to develop innovative businesses. The Inception Showcase will highlight some of the most innovative and exciting companies in the program. Inception startups highlighted at previous GTC events included Athelas, BIOS, AiFi, Kinema Systems, DeepGram and SubtleMedical. Seven startups will present their company and technology and respond to questions in front of an audience of investors, industry leaders and developers during a fast-paced one-hour event.
The core of RAPIDS is CUDA DataFrame (cuDF), a library that provides Pandas-like DataFrame (a columnar data structure) functionality with GPU acceleration. cuDF provides a Python interface for use in existing data science workflows, and underneath cuDF is libcuDF, an open-source CUDA C++ library that provides a column data structure and algorithms to operate on these columns, such as filtering, selection, sorting, joining, and groupby. In this talk you will learn about some of the C++ and CUDA internals of libcuDF. This talk will cover how we perform run-time type dispatch on type-erased data structures to enable operating on a variety of data types and interface with dynamic languages like Python. We'll describe how and why we built a pool allocator for CUDA device memory to massively improve performance on multi-GPU systems. And we'll dive into GPU algorithms we use for multi-column database operations like groupby and join. If you are interested in using GPU DataFrames via libcuDF's C/C++ interface, or if you are interested in contributing to the cuDF / libcuDF open source project, then this talk is for you.
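To make the idea of run-time type dispatch on a type-erased column concrete, here is a small Python sketch. It is only an analogy: libcuDF is a CUDA C++ library, and the `Column`, `DType`, and `column_sum` names below are hypothetical and do not reflect its actual interface. The column stores raw bytes plus a type tag, and a dispatch table picks the typed routine at run time.

```python
import math
import struct
from enum import Enum

class DType(Enum):
    """Runtime type tags; values are struct format codes (an assumption)."""
    INT32 = "i"
    FLOAT64 = "d"

class Column:
    """Type-erased column: raw little-endian bytes plus a runtime type tag."""
    def __init__(self, dtype, values):
        self.dtype = dtype
        self.size = len(values)
        self.data = struct.pack(f"<{self.size}{dtype.value}", *values)

def _typed_view(col):
    """Reinterpret the raw bytes according to the column's type tag."""
    return struct.unpack(f"<{col.size}{col.dtype.value}", col.data)

def _sum_int32(col):
    return sum(_typed_view(col))

def _sum_float64(col):
    # fsum keeps floating-point accumulation error low.
    return math.fsum(_typed_view(col))

# The dispatch table maps each type tag to its typed implementation.
_SUM_DISPATCH = {
    DType.INT32: _sum_int32,
    DType.FLOAT64: _sum_float64,
}

def column_sum(col):
    """Select the typed routine at run time from the column's tag."""
    return _SUM_DISPATCH[col.dtype](col)

int_total = column_sum(Column(DType.INT32, [1, 2, 3]))
float_total = column_sum(Column(DType.FLOAT64, [0.5, 1.5]))
```

Callers such as a Python front end only ever handle the untyped `Column`, which is the property that makes this pattern convenient for dynamic languages.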
See how RAPIDS and the open source ecosystem are advancing data science. In this session, we will explore RAPIDS, the new open source data science platform from NVIDIA. Come learn how to get started leveraging these open-source libraries for faster performance and easier development on GPUs. See the latest engineering work and new release features, including benchmarks, roadmaps, and demos. Finally, hear how customers are leveraging RAPIDS in production, benefiting from early adoption, and outperforming CPU equivalents.
Graphs are a ubiquitous part of technology we use daily. In systems like GPS, graphs help find the shortest path between two points, and social networks use them to help users find friends. We'll explain why analyzing these vast networks with possibly billions of entries requires the computing power of GPUs. We'll then discuss the performance of graph algorithms on the GPU and show benchmarking results from several graph frameworks. We'll also cover the RAPIDS roadmap that will help unify these frameworks and make them easy to use and simple to deploy.
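As a small CPU-side illustration of the shortest-path problem mentioned above, here is Dijkstra's algorithm in plain Python. It is a textbook sketch, not one of the GPU graph frameworks the talk benchmarks; the graph and place names are made up for the example.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm for non-negative edge weights.

    graph: dict mapping node -> list of (neighbor, weight) pairs
    Returns a dict of shortest distances from source to reachable nodes.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist

# Hypothetical road network: edge weights are travel times.
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
distances = shortest_paths(roads, "A")
```

The priority queue makes this version inherently sequential, which is one reason GPU frameworks generally use different, more parallel formulations for very large graphs.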
Location intelligence is key to understanding areas such as property insights, environmental monitoring, disaster management and prevention, traffic flows, and customer behavior. We'll discuss our work involving Europe's property insurance sector, which has been disrupted by the growing use of comparison websites that require real-time quotations. To build deep learning models, large volumes of data from satellite images, 3D sensors, GPS-enabled devices, social media, and other sources must be merged using computationally intensive coordinate conversion and matching. We'll outline our solution, which uses 3D CNNs to estimate risk factors from color 3D virtual models of individual properties. We'll describe how we used RAPIDS and cover our entire process, from processing raw data, merging sources, generating and labeling colorized voxel cubes for training, to model building, inference, and final application.
Modern data science demands interactive exploration and analysis of large volumes of data. Learn how NVIDIA and RAPIDS take advantage of GPU acceleration by using libraries such as cuDF, cuIO, and cuString. The computational limits of CPUs are being realized. We'll show how RAPIDS uses GPUs to accelerate existing workflows and enable workflows that were previously impossible. We'll cover cuDF's high-level architecture and its GPU use, and do a technical dive into cuDF internals such as the cuIO and cuString libraries. We'll also share testing and benchmarking results and reveal some of the new features and optimizations we're investigating for the future of RAPIDS and cuDF.
Learn how RAPIDS uses Dask to scale to distributed clusters of machines. Dask, a library for scalable computing in Python, is known for scaling out popular PyData libraries like NumPy, Pandas, and Scikit-Learn. The GPU-accelerated data science software stack RAPIDS also uses Dask to easily scale to multiple GPUs on a single node, and multiple nodes within a cluster. We'll explain how RAPIDS used Dask to scale out, discuss the challenges of integrating GPUs into the existing PyData stack, and describe how this work creates opportunities for Python users.
Learn about BlazingSQL, our new, free GPU SQL engine built on RAPIDS open-source software. We will show multiple demo workflows using BlazingSQL to connect data lakes to RAPIDS tools. We'll explain how we dramatically accelerated our engine and made it substantially more lightweight by integrating Apache Arrow into GPU memory and cuDF into RAPIDS. That made it easy to install and deploy BlazingSQL + RAPIDS in a matter of minutes. More importantly, we built a robust framework to help users bring data from data lakes into GPU-Accelerated workloads without having to ETL on CPU memory or separate GPU clusters. We'll discuss how that makes it possible to keep everything in the GPU while BlazingSQL manages the SQL ETL. RAPIDS can then take these results to continue machine learning, deep learning, and visualization workloads.
Walmart Labs has been charged with building the next-generation stores forecasting system, replacing the current system from JDA. Due to the size of the forecasting problem (52 weekly forecasts for roughly 500 million store-item combinations, generated every week), we realized early on that we would have to use GPU computing if we wanted to move beyond simple forecasting approaches such as exponential smoothing. We have taken a multi-pronged approach to the problem of improving forecast accuracy while remaining within execution time windows: using NVIDIA-supplied software such as XGBoost for forecasting, developing custom algorithms (some in CUDA) for various forecasting and forecasting-related processes, and moving to a RAPIDS-based feature generation pipeline. At the moment, roughly 20% of our items are being forecasted by the new system, and we expect to have 100% item coverage by the end of the year. In this talk we will outline our forecasting strategy both from an algorithmic and from a computational perspective. We will show how GPU computing has enabled us to significantly improve forecast accuracy, and highlight the key bottlenecks that we have been able to overcome. We will provide runtime comparisons of CPU vs GPU-based algorithms on our real-world problems, and describe how GPU-based development works for us (hint: it's easy to do). We will also describe our collaboration with NVIDIA, who have been extremely helpful, continuously refining their algorithms and tools to better meet the needs of industry, and what tools and capabilities we see being especially useful for our path forward.
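For context on the baseline this abstract calls a simple forecasting approach, here is a minimal Python sketch of simple exponential smoothing. It is the textbook method only, not the production system described in the talk; the function name, the smoothing factor, and the sales figures are assumptions for illustration.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    Each smoothed level blends the newest observation with the previous
    level: level = alpha * value + (1 - alpha) * previous_level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    smoothed = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

# Hypothetical weekly unit sales for one store-item combination.
weekly_sales = [10.0, 12.0, 14.0]
levels = exponential_smoothing(weekly_sales, alpha=0.5)

# The last smoothed level serves as the flat forecast for coming weeks.
forecast = levels[-1]
```

The method is cheap per series, but it ignores features such as price, promotions, and holidays, which is the kind of limitation that motivates the heavier models mentioned in the abstract.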
RAPIDS is an open-source platform for GPU data science, incubated by NVIDIA. Built to look and feel like popular tools in the Python Data Science ecosystem, RAPIDS is easy to use and dramatically speeds up execution of all steps of a typical data science workflow. Intended for working data scientists, this session will be an in-depth walk through of all the stages of a model data science workflow using RAPIDS. The presentation will cover ingesting and cleaning data, feature engineering, working with strings, user-defined functions, and applying machine learning. The session will discuss the community and ecosystem around RAPIDS and future plans for the cuML library. Additionally, the session will cover how users can contribute to RAPIDS. At the end of the session, attendees will have learned RAPIDS benefits for data science, how to get started installing RAPIDS, and how to build their own workflows using RAPIDS.
Network defense and cybersecurity applications traditionally rely on heuristics and signatures to protect networks and detect anomalies. Large companies may generate over 10TB of data daily, spread across different sensors and heterogeneous data types. The difficulty of providing timely ingest, feature engineering, feature exploration, and model generation has made signature-based detection the only option. We'll show how to use RAPIDS and GPU acceleration to overcome these obstacles. We'll walk through data engineering steps involving large amounts of heterogeneous data (both source and format) and explore how GPUs can accelerate feature exploration and hyperparameter selection. This enables more in-house data scientists and information security experts to use internally collected data to generate predictive models for anomaly detection rather than rely on signature-based detection.
We'll discuss cuML, a GPU-Accelerated library of machine learning algorithms within the RAPIDS data science ecosystem. The cuML library allows data scientists, researchers, and software engineers to run traditional ML tasks on GPUs without going into the details of CUDA programming. We'll show you how to get tremendous speed-up for traditional machine learning workloads by using APIs like Scikit-Learn with Python. We'll also provide code examples, benchmarks, and the latest news.
NVIDIA and HP are partnering to accelerate data science workloads. Hear from customers about how they are using Z by HP's data science workstation with NVIDIA RAPIDS technology to transform their workflows and learn how you could do the same.
Learn how the OmniSci GPU-Accelerated SQL engine fits into the overall RAPIDS partner ecosystem for open source GPU analytics. Using open data, we'll show how to ingest data that's from both streaming and standing sources, perform descriptive statistics and feature engineering using SQL and cuDF, and return the results as a GPU DataFrame. We'll also describe how data science workflow can be accomplished using tools from the RAPIDS ecosystem, all without the data ever leaving the GPU.
Learn about using Tensor Cores to perform very fast matrix multiply-accumulate steps like those required in AI training. The key to Tensor Core performance is the use of 16-bit floating point arithmetic, but that causes significant rounding errors. Although algorithms like binomial correction or Karatsuba can reduce rounding errors considerably, they require additional calculations. We'll detail performance of these algorithms based on the Warp Matrix Multiply Accumulate API.
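To give a feel for the rounding problem and for why corrections cost extra work, here is a CPU-only Python sketch. It emulates half-precision rounding with the standard `struct` module and uses a simple split-and-correct scheme: each input is represented as a half-precision value plus a half-precision residual. This is not necessarily the binomial or Karatsuba variant covered in the talk, and it does not use the Warp Matrix Multiply Accumulate API; the function names are assumptions for this example.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def split_half(x):
    """Represent x as a half-precision value plus a half-precision residual."""
    hi = to_half(x)
    lo = to_half(x - hi)
    return hi, lo

def dot_half(a, b):
    """Dot product with inputs rounded to half precision.

    Products are accumulated in a Python float, standing in for a
    higher-precision accumulator.
    """
    return sum(to_half(x) * to_half(y) for x, y in zip(a, b))

def dot_split(a, b):
    """Dot product using the high parts plus first-order residual terms."""
    total = 0.0
    for x, y in zip(a, b):
        xh, xl = split_half(x)
        yh, yl = split_half(y)
        # Three products instead of one; the tiny xl * yl term is dropped.
        total += xh * yh + xh * yl + xl * yh
    return total

a = [0.1, 0.2, 0.3]
b = [0.7, 0.11, 0.13]
exact = sum(x * y for x, y in zip(a, b))
err_plain = abs(dot_half(a, b) - exact)
err_split = abs(dot_split(a, b) - exact)
```

On this input the corrected result is closer to the double-precision reference, at the price of three multiplications per element pair instead of one.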
Legacy production methods can't keep up with the global nature of content creation. Studios need to operate where tax incentives are offered, and artist talent may be located anywhere in the world. A Virtual Studio lets you deploy resources where and when you need them in a matter of minutes, rather than weeks, so you can ramp up and down as production ebbs and flows. Virtual Studios are fueled by GPUs, which provide artists and engineers with both a powerful virtual workstation and the ability to accelerate renders and simulations, both locally and distributed across clusters. On the cloud, you're able to visualize and manipulate massive datasets that would be difficult or even impossible to achieve on traditional hardware. This session examines the benefits, strategies, and challenges of building a Virtual Studio on Google Cloud Platform, powered by NVIDIA GPUs.
We'll examine the potential for spatial computing and machine learning to reintroduce people to the physical potential of their bodies by focusing on Embody, MAP Lab's 2019 Sundance premiere. Inspired by movement traditions such as aikido, yoga, and dance, Embody is a social VR experience that uses visual metaphor and encouragement from teachers and friends to bring about coordinated body movement. We'll explain how this experience, which is piloted entirely by body movement and position, reclaims the body's potential inside the digital landscape. Users prompt each other with conversation, mirroring, and environmental channeling to step together through physical sequences designed to center, balance, extend, and strengthen. We hope players who experience Embody will be reminded of their deep physical potential and remember that the body is a flexible tool and able to change (http://www.sundance.org/projects/embody).
Generative methods allow a computer to automatically distill the essence of a dataset and then produce novel examples that are indistinguishable from the original data. That's the promise, but getting there has been difficult. This talk focuses on recent advances in generative adversarial networks (GAN), describing ideas that have finally enabled the synthesis of credible high-resolution images. It also covers recent work by NVIDIA (StyleGAN) that makes the image generation more controllable by borrowing ideas from style transfer literature, and also leads to an interesting, unsupervised separation of high-level attributes (e.g. pose or identity in case of human faces) and inconsequential variation in the images (exact placement of hair, etc.).
We'll examine the challenges for telecommunications companies of harvesting the considerable computational capacity of modern GPU architectures. One issue is that low latency inference requires small batch sizes, which are inherently detrimental to Tensor Core performance. Another involves efficient coefficient reuse, which demands very large matrix-matrix multiplications, while feedforward DNNs typically used for telecommunications ML have relatively small vector-matrix multiplications. We'll discuss our approach, which aims to provide low latency with significantly higher performance by improving use of computation capacity available in Tensor Cores.
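The batch-size issue can be made concrete with a back-of-the-envelope calculation of arithmetic intensity (floating-point operations per byte moved) for one fully connected layer. The Python sketch below assumes 16-bit values, counts each weight, input, and output as moved exactly once, and uses a layer size chosen only for illustration; none of these figures come from the talk.

```python
def arithmetic_intensity(batch, n_in, n_out, bytes_per_value=2):
    """FLOPs per byte for a dense layer: (batch x n_in) times (n_in x n_out).

    Assumes each weight, input, and output value is moved once.
    """
    flops = 2 * batch * n_in * n_out  # one multiply and one add per term
    bytes_moved = bytes_per_value * (
        n_in * n_out        # weight matrix
        + batch * n_in      # input activations
        + batch * n_out     # output activations
    )
    return flops / bytes_moved

# Hypothetical 1024 x 1024 layer at two batch sizes.
single = arithmetic_intensity(batch=1, n_in=1024, n_out=1024)
batched = arithmetic_intensity(batch=256, n_in=1024, n_out=1024)
```

With a batch of one, every weight fetched is used for a single multiply-add, so the layer performs about one operation per byte. With a batch of 256, each weight is reused across the batch and the ratio rises by more than two orders of magnitude, which is the coefficient reuse the abstract refers to.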
Meeting the latency requirements of 5G networks requires massive parallelization. We'll discuss how to parallelize and map certain radio access network (RAN) functions to GPU architectures to achieve orders-of-magnitude acceleration. We'll describe how to realize selected RAN functions using online machine learning methods. We'll also explore the possibility of a machine learning function orchestrator (MLFO) in the context of end-to-end network slicing where deep neural networks are an interesting option. Our talk will use findings of the ITU-T focus group on machine learning for 5G to explore the challenge of implementing MLFO, leading to new mobile network architectures.
Learn how GPUs are pushing the limits of the largest astronomical telescopes on Earth and how they'll be used to image life-bearing planets outside our solar system. Thanks to hardware features such as Tensor Cores and mixed-precision support, plus optimized AI frameworks, GPU technology is changing how large data streams from optical sensors are digested in real time. We'll discuss how real-time AI made possible by GPUs opens up new means to optimally control the system and calibrate images, which will help scientists get the most out of the largest optical telescopes. GPUs will also benefit future extreme-size facilities like the European Extremely Large Telescope because the complexity of maintaining exquisite image quality increases with the square of its diameter size. We'll present on-sky results obtained on the 8.2-meter Subaru Telescope and explain why these techniques will be essential to future giant telescopes.
Leaders from the mapping technology companies will discuss the advantages of various algorithms to create and maintain maps, followed by a short Q&A session.
HERE: Vladimir Shestak, Lead Software Engineer Automated Driving Edge Perception for HD Map Maintenance: We start this talk by presenting a brief overview of HD Live Map created by HERE and its use for connected ADAS or automated driving solutions. Although building such a map with a required centimeter level precision is technically hard, the instant the HD Live Map is built, changes in the real world can occur causing the map to no longer reflect reality. Hence, a proper maintenance strategy must be in place with the goal to identify discrepancies between the HD Live Map and the real world and heal the HD Live Map as quickly as possible. We discuss a spectrum of techniques developed by HERE to address the map-healing process and then focus on our low-cost solutions for in-vehicle change detection. The example system employs a consumer-grade Android-based sensing system streaming imagery and telemetry in real-time into HERE Edge Perception software stack. We present the high-level software architecture of the stack, its main components, i.e., feature detection, object tracking and triangulation, RWO and Maplet generation, as well as in-vehicle deployment options. The real-time performance evaluation of the system concludes our talk.
NavInfo Europe: Geetank Raipuria, Computer Vision Engineer Real-Time Object Detection and Semantic Segmentation: This session will discuss how NavInfo uses computer vision and deep learning to build high-definition maps that cover China's highways and large city streets. This involves performing object detection and semantic segmentation on visual imagery collected from vehicle sensors. The NavInfo Europe Advanced Research Lab creates processes that extract information from this data, both in real-time onboard vehicles using the NVIDIA DRIVE platform, and faster than real-time, processing offline gathered video material through NVIDIA DeepStream.
It's time to separate the signal from the noise when it comes to autonomous driving. And as self-driving trucks near commercial reality, the stakes are high for safe operation on our highways. Join Dr. Xiaodi Hou, Founder, President and CTO of TuSimple, the largest self-driving truck company worldwide, for a discussion of what it takes to design, test and deploy a fully autonomous truck. Dr. Hou will lay it on the line in terms of what's working and what's not in the design and testing of today's self-driving trucks.
We will elaborate on how our holistic approach to design and validation creates a single environment to engineer and experience the autonomous vehicle. From Cognitive Augmented Design & Model Based System Engineering to realistic validation at scale, AI is enabling AV developers to increase safety while managing the costs of ever-increasing complexity.
Learn how a successful implementation of a low memory footprint, multi-GPU iterative method makes it possible to efficiently resolve localization of spontaneous nonlinear flow in deforming porous media. Grasping this physical process is essential to ensure safe underground waste storage and understand natural fluid migration in reservoirs. We'll describe our parallel, matrix-free solver design, which provides a short time to solution and can solve a variety of coupled and nonlinear systems of partial differential equations in 3D. We will unveil the key algorithmic and optimization concepts that enable our stencil-based solvers to converge in few iterations, while tackling the hardware limit on the most recent NVIDIA high-bandwidth GPU accelerators. Also, we will explain how we achieved 98 percent parallel efficiency on 5000 NVIDIA Tesla P100 GPUs on the hybrid Cray XC-50 Piz Daint supercomputer at the Swiss National Supercomputing Centre, CSCS.
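As a toy illustration of what "matrix-free" and "stencil-based" mean, the Python sketch below solves a 1D Poisson problem with plain Jacobi iteration. The system matrix is never stored; each update reads only a point's two neighbours. This is a generic textbook scheme chosen for brevity, not the solver from the talk: plain Jacobi converges slowly, and the problem, function name, and grid size are assumptions for this example.

```python
def jacobi_poisson_1d(rhs, left, right, iterations):
    """Solve -u'' = rhs on (0, 1) with fixed end values, matrix-free.

    rhs holds the right-hand side at the interior grid points.
    """
    n = len(rhs)
    h = 1.0 / (n + 1)  # uniform grid spacing
    u = [0.0] * n      # initial guess
    for _ in range(iterations):
        new_u = []
        for i in range(n):
            # Three-point stencil: only the two neighbours are needed.
            west = u[i - 1] if i > 0 else left
            east = u[i + 1] if i < n - 1 else right
            new_u.append(0.5 * (west + east + h * h * rhs[i]))
        u = new_u
    return u

# With zero right-hand side the exact solution is the straight line
# between the boundary values 0 and 1.
solution = jacobi_poisson_1d([0.0] * 9, left=0.0, right=1.0, iterations=2000)
```

Because every point is updated independently from the previous iterate, the inner loop maps naturally onto one GPU thread per grid point, and the memory footprint stays at a few arrays the size of the grid.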