This talk will describe the process of developing autonomous driving directly from the virtual environment TRONIS, a high-resolution virtual environment for prototyping and safeguarding highly automated and autonomous driving functions that exploits a state-of-the-art gaming engine as introduced by UNREAL. We'll showcase this process on a real RC model with high-end NVIDIA hardware, targeting self-driving capabilities on a real-world truck. With the help of TRONIS, we make early decisions on sensor configurations (e.g. camera), sensor positions, and deployed algorithms. The development team works on independent instances of the virtual car, which build the foundation for multiple experimental setups.
Learn how to adopt a MATLAB-centric workflow to design, develop, and deploy computer vision and deep learning applications onto GPUs, whether on your desktop, a cluster, or embedded Tegra platforms. The workflow starts with algorithm design in MATLAB. The deep learning network is defined in MATLAB and is trained using MATLAB's GPU and parallel computing support. Then, the trained network is augmented with traditional computer vision techniques and the application can be verified in MATLAB. Finally, a compiler auto-generates portable and optimized CUDA code from the MATLAB algorithm, which can be cross-compiled to Tegra. A performance benchmark for AlexNet inference shows that the auto-generated CUDA code is ~2.5x faster than MXNet, ~5x faster than Caffe2, and ~7x faster than TensorFlow.
A key technology challenge in computer vision for autonomous driving is semantic segmentation of images in a video stream, for which fully convolutional neural networks (FCNNs) are the state of the art. In this research, we explore the functional and non-functional performance of using a hierarchical classifier head for the FCNN versus using a single flat classifier head. Our experiments are conducted and evaluated on the Cityscapes dataset. On the basis of the results, we argue that using a hierarchical classifier head for the FCNN can have specific advantages for autonomous driving. Furthermore, we show real-time usage of our network on the DRIVE PX 2 platform.
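As a rough illustration of the difference between a flat and a hierarchical classifier head, the sketch below computes per-pixel class probabilities as P(leaf) = P(group) * P(leaf | group). The two-level label tree, the function names, and the logit values are all hypothetical and chosen only for illustration; the abstract does not specify the hierarchy or the network used.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical two-level label tree (not the taxonomy used in the talk).
HIERARCHY = {
    "vehicle": ["car", "truck"],
    "human": ["pedestrian", "rider"],
    "flat": ["road", "sidewalk"],
}

def hierarchical_probs(group_logits, child_logits):
    """Return P(leaf) = P(group) * P(leaf | group) for a single pixel."""
    groups = list(HIERARCHY)
    p_group = softmax([group_logits[g] for g in groups])
    probs = {}
    for g, pg in zip(groups, p_group):
        p_child = softmax(child_logits[g])
        for leaf, pc in zip(HIERARCHY[g], p_child):
            probs[leaf] = pg * pc
    return probs

# Made-up logits for one pixel: one score per group, one per leaf in each group.
group_logits = {"vehicle": 2.0, "human": 0.5, "flat": 0.1}
child_logits = {"vehicle": [1.0, 0.0], "human": [0.0, 0.0], "flat": [0.3, 0.2]}
probs = hierarchical_probs(group_logits, child_logits)
best = max(probs, key=probs.get)
```

One property of this layout is that a coarse decision (for example "vehicle") remains available even when the fine-grained choice within the group is uncertain.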
Learn how combining machine learning and computer vision with GPU computing helps to create a next-generation informational ADAS experience. This talk will present a real-time software solution that encompasses a set of advanced algorithms to create an augmented reality for the driver, utilizing vehicle sensors, map data, telematics, and navigation guidance. The broad range of features includes augmented navigation, visualization for advanced parking assistance, adaptive cruise control and lane keeping, driver infographics, driver health monitoring, and support in low visibility. Our approach augments the driver's visual reality with supplementary objects in real time, and works with various output devices such as head unit displays, digital clusters, and head-up displays.
The growing range of functions of ADAS and automated systems in vehicles, as well as the progressive change towards agile development processes, requires efficient testing. Testing and validation within simulation are indispensable for this, as real prototypes are not available at all times and the test catalog can be driven repeatedly and reproducibly. This paper presents different approaches to be used in simulation in order to increase the efficiency of development and testing for different areas of application. This comprises the use of virtual prototypes, the utilization of sensor models, and the reuse of test scenarios throughout the entire development process, which may also be applied to train artificial intelligence.
This talk details how a team of 17 Udacity Self-Driving Car students attempted to apply deep learning algorithms to win an autonomous vehicle race. At the 2017 Self Racing Cars event held at Thunderhill Raceway in California, the team received a car and had two days before the start of the event to work on it. In this time, we developed a neural network using Keras and TensorFlow that steered the car based on the input from just one front-facing camera in order to navigate all turns on the racetrack. We will discuss the events leading up to the race, the development methods used, and future plans, including the use of ROS and semantic segmentation.
This presentation shows how driving simulators together with DNN algorithms can be used to streamline and facilitate the development of ADAS and autonomous vehicle systems. Driving simulators provide an excellent tool to develop, test, and validate control systems for the automotive industry. Testing ADAS systems on a driving simulator makes the process safer, more affordable, and repeatable. This session will focus on a special application in which an NVIDIA DRIVE PX 2 has been interfaced with a camera and put in the loop on a driving simulator. Object recognition algorithms have been developed in order to develop and test a Lane Keeping Assist (LKA) system. The robustness of the system can be tested on the simulator by altering the environmental conditions and vehicle parameters.
Thanks to recent breakthroughs in AI, vehicles will learn and collaborate with humans. There will be a steering wheel in the majority of vehicles for a long time. Therefore, a human-centric approach is needed in order to save more lives in traffic: a safe combination of AI and UI.
TomTom is leading in HD maps, both in coverage and in the number of OEMs working with our HD Map. Our multi-source, multi-sensor approach leads to HD maps that have greater coverage, are more richly attributed, and have higher quality than single-source, single-sensor maps. Hear how we're weaving more and more sources, such as AI-intensive video processing, into our map making to accelerate towards our goal of real-time and highly precise maps for safer and more comfortable driving.
We present our experience of running computationally intensive camera-based perception algorithms on NVIDIA GPUs. Geometric (depth) and semantic (classification) information is fused in the form of semantic stixels, which provide a rich and compact representation of the traffic scene. We present some strategies to reduce the computational complexity of the algorithms. Using synthetic data generated by the SYNTHIA tool, including slanted roads from a simulation of San Francisco city, we evaluate performance latencies and frame rates on a DrivePX2-based platform.
Learn how deep learning is used to process video streams to analyse human behaviour in real time. We will detail our solution for recognising fine-grained movement patterns of people as they perform everyday actions such as walking, eating, shaking hands, and talking to each other. The novelty of our technical solution is that our system learns these capabilities from watching lots of video snippets showing such actions. This is exciting because very different applications can be realised with the same algorithms, as we follow a purely data-driven, machine learning approach. We will explain what new types of deep neural networks we created and how we employ our Crowd Acting (tm) platform to cost-efficiently acquire hundreds of thousands of videos for that purpose.
2017 is the year when the first driver monitoring systems go into series production with global automotive OEMs. Driver monitoring will be a mainstay as a vital part of most level 3 automated cars, but it also has unique stand-alone applications such as drowsiness and attention monitoring, functions that address approximately half of all traffic accidents. Starting in 2019, more advanced systems will come to market based on improvements in hardware such as high-resolution cameras and GPUs. Around 2022, a third generation of in-car AI is to be expected, as the hardware will consist of multiple HD cameras running on the latest GPUs.
In our NVIDIA lab in New Jersey we taught a deep convolutional neural network (DNN) to drive a car by observing human drivers and emulating their behavior. We found that these networks can learn more aspects of the driving task than is commonly learned today. We present examples of learned lane keeping, lane changes, and turns. We also introduce tools to visualize the internal information processing of the neural network and discuss the results.
GPUs can significantly enhance the capabilities of military ground vehicles. In this session, we will discuss the challenges facing the integrator of real-time vision systems in military applications, from video streaming and military streaming protocols through to deploying vision systems for 360-degree situational awareness with AI capabilities. GPUs are being used for enhanced autonomy across the defence sector, from ground vehicles through to naval and air applications, with each application space presenting its own challenges through to deployment. Come and find out how the defence industry is addressing these challenges and where the future potential of GPU-enabled platforms lies.
The autonomous electric car revolution is here and a bright, clean future awaits. Yet as we shift to this fundamentally different technology, it becomes clear that perhaps the entire vehicle deserves a rethink. This means not just adding powerful computers to outdated vehicle platforms, but instead redesigning the agile device for this very different future. This process doesn't start with the mechanical structure of yesteryear; instead, it starts with the GPU.
Deep Learning has emerged as the most successful field of machine learning, with overwhelming success in industrial speech, language, and vision benchmarks. Consequently, it has evolved into the central field of research for IT giants like Google, Facebook, Microsoft, Baidu, and Amazon. Deep Learning is founded on novel neural network techniques, the recent availability of very fast computers, and massive data sets. At its core, Deep Learning discovers multiple levels of abstract representations of the input. Currently, the development of self-driving cars is one of the major technological challenges across automotive companies. We apply Deep Learning to improve real-time video data analysis for autonomous vehicles, in particular semantic segmentation.
Polymatica is an OLAP and data mining server with a hybrid CPU+GPU architecture that turns any analytical work on data volumes of billions of records into a proactive process with no waiting. The Polymatica architecture uses NVIDIA multi-GPU systems (as in the DGX-1) in critical operations on billions of raw business data records. This eliminates pauses and accelerates analytical operations by up to a hundred times. You'll see the performance difference on the example of a real analytical process in retail on different hardware: 1) CPU-only calculations on 2*Intel Xeon, no GPU; 2) 2*Intel Xeon + single Tesla P100; 3) DGX-1: 2*Intel Xeon + 8*Tesla P100. Polymatica on the DGX-1 becomes the fastest OLAP and data mining engine, allowing advanced analytics on datasets of billions of records.
New deep learning frameworks are being developed on a monthly basis. For most of them, the inventors did not have scale-out parallelisation in mind. ApacheSpark and other data-parallel frameworks, on the other hand, are becoming the de facto standard for BigData analysis. In this talk, we will have a look at different deep learning frameworks and their parallelisation strategies on GPUs and ApacheSpark. We'll start with DeepLearning4J and ApacheSystemML as first-class citizens. We will then have a look at TensorSpark and TensorFrames and finish with CaffeOnSpark to explain concepts like inter- and intra-model parallelism, distributed cross-validation, and Jeff Dean style parameter averaging.
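To make the idea of parameter averaging concrete, here is a minimal single-process sketch of synchronous parameter averaging in data-parallel training: each simulated worker trains a copy of the model on its own shard, and the copies are averaged element-wise after every round. The toy quadratic loss, the function names, and the learning rate are assumptions made for illustration; none of this reproduces the API of the frameworks named above.

```python
def average_parameters(worker_params):
    """Element-wise mean of the parameter vectors returned by the workers."""
    n = len(worker_params)
    return [sum(values) / n for values in zip(*worker_params)]

def sgd_step(params, grads, lr=0.1):
    """One plain gradient-descent update."""
    return [p - lr * g for p, g in zip(params, grads)]

def local_training(params, target, steps=5):
    """Toy worker: minimise sum((w - target)^2) on its own data shard."""
    for _ in range(steps):
        grads = [2.0 * (p - target) for p in params]
        params = sgd_step(params, grads)
    return params

global_params = [0.0, 0.0]
shard_targets = [1.0, 3.0]  # each simulated worker sees a different optimum

for _ in range(20):  # communication rounds
    # Every worker starts the round from the current global parameters.
    replicas = [local_training(list(global_params), t) for t in shard_targets]
    # Synchronisation point: replace the global model by the average.
    global_params = average_parameters(replicas)
```

In this toy setup the averaged parameters settle at the mean of the per-shard optima, which is the behaviour one would hope for when the shards are samples of the same underlying data.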
We utilize a MapR converged data platform as the data layer, providing a distributed file system, key-value storage, and streams to store data and build the data pipeline. On top of that, we use Kubernetes as an orchestration layer to manage the containers that train and deploy deep learning models, as well as to serve the deep learning models in the form of containers.
This session will present an overview on how we recently applied modern deep learning techniques to the wide area of nanoscience. We will focus on deep convolutional neural network training to classify Scanning Electron Microscope (SEM) images at the nanoscale, discussing first the issues we faced, and then how we solved them by improving the standard deep learning tools. This session aims to introduce a new promising and stimulating field of research that implements deep learning techniques in the nanoscience domain, with the final aim to provide researchers with advanced and innovative tools. These will contribute to improve the scientific research in the boosting field of experimental and computational nanoscience.
In the world of analytics and AI, for many, GPU-accelerated analytics is equivalent to speeding up training time. The question remains, however: how does one interpret such highly complex black box models, and how can these models help decision-making? We'll discuss and present a GPU-based architecture that not only accelerates model training but also uses GPU-based databases and visual analytics to render billions of rows, addressing the challenges of interpreting these black box models. With the advent of algorithms, databases, and visualization tools all based on a GPU architecture, a solution like this has become more accessible. Interactive visualization of the model, based on partial dependence analysis, is one approach to interpreting these opaque models and is our focus here.
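For readers unfamiliar with partial dependence analysis, the following sketch shows the basic computation on the CPU: one feature is forced to each value of a grid across all rows, and the model's predictions are averaged. The `black_box` function is a stand-in invented for this example, not a model from the talk, and the GPU database and rendering layers described above are not represented.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)            # copy so the data set is untouched
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

def black_box(row):
    """Hypothetical opaque model: linear in x0, quadratic in x1."""
    x0, x1 = row
    return 3.0 * x0 + x1 * x1

rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
curve = partial_dependence(black_box, rows, feature_index=0, grid=[0.0, 1.0, 2.0])
```

Plotting `curve` against the grid gives the partial dependence plot for the chosen feature; here its constant slope reflects the model's linear dependence on that feature.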
Learn how large requests on big datasets, like production or finance data, can benefit from hybrid engine approaches for calculating on in-memory databases. While hybrid architectures are state-of-the-art in specialized calculation scenarios (e.g., linear algebra), multi-GPU or even multicore usage in database servers is still far from everyday use. In general, the approach to handle requests on large datasets would be scaling the database resources by adding new hardware nodes to the compute cluster. We use intelligent request planning and load balancing to distribute the calculations to multi-GPU and multicore engines in one node. These calculation engines are specifically designed for handling hundreds of millions of cells in parallel with minimal merging overhead.
Discover how Credit Suisse has implemented Deep Learning in eCommunications Surveillance, and how moving to GPU-accelerated models has yielded significant business value. The solution works on unstructured data and leverages bleeding-edge Natural Language Processing techniques, and will be enhanced with emotion analysis running on GPU-farms. This talk will include a demo of the functionality.
Deep learning optimization in real world applications is often limited by the lack of valuable data, either due to missing labels or the sparseness of relevant events (e.g. failures, anomalies) in the dataset. We face this problem when we optimize dispatching and rerouting decisions in the Swiss railway network, where the recorded data is variable over time and only contains a few valuable events. To overcome this deficiency we use the high computational power of modern GPUs to simulate millions of physically plausible scenarios. We use this artificial data to train our deep reinforcement learning algorithms to find and evaluate novel and optimal dispatching and rerouting strategies.
A key driver for pushing high-performance computing is the enablement of new research. One of the biggest and most exciting scientific challenges requiring high-performance computing is to decode the human brain. Many of the research topics in this field require scalable compute resources or the use of advanced data analytics methods (including deep learning) for processing extreme-scale data volumes. GPUs are a key enabling technology, and we will thus focus on the opportunities for using them for computing, data analytics, and visualisation. GPU-accelerated servers based on POWER processors are of particular interest here due to the tight integration of CPU and GPU using NVLink and the enhanced data transport capabilities.
Using the latest advancements from TensorFlow, including the Accelerated Linear Algebra (XLA) framework, JIT/AOT compiler, and Graph Transform Tool, I'll demonstrate how to optimize, profile, and deploy TensorFlow models in a GPU-based production environment. This talk is 100% demo-based, using open source tools, and is completely reproducible through Docker on your own GPU cluster. In addition, I spin up a GPU cloud instance for every attendee in the audience. We go through the notebooks together as I demonstrate the process of continuously training, optimizing, deploying, and serving a TensorFlow model on a large, distributed cluster of NVIDIA GPUs managed by the attendees.
NVIDIA DGX Systems powered by Volta deliver breakthrough performance for today's most popular deep learning frameworks. Attend this session to hear from DGX product experts and gain insights that will help researchers, developers, and data science practitioners accelerate training and iterate faster than ever. Learn (1) best practices for deploying an end-to-end deep learning practice, (2) how the newest DGX systems, including DGX Station, address the bottlenecks impacting your data science, and (3) how DGX software, including optimized deep learning frameworks, gives your environment a performance advantage over GPU hardware alone.
Caffe2 is a lightweight, modular, and scalable deep learning framework refactored from the previous Caffe. Caffe2 has been widely used at Facebook to enable new AI & AR experiences. This talk will be divided into two parts. In the first part, we will explain some framework basics, the strengths of Caffe2, and its large-scale training support, and will walk you through several product use cases at Facebook, including computer vision, machine translation, speech recognition, and content ranking. The second part will explain how users benefit from Caffe2's built-in neural network model compression, fast convolution for mobile CPUs, and GPU acceleration.
Come and learn about new fast low-rank matrix computations on GPUs! By exploiting the low-rank off-diagonal block structure, we design and implement fast linear algebra operations on massively parallel hardware architectures. The main idea is to refactor the numerical algorithms and the corresponding implementations by aggregating similar numerical operations in terms of highly optimized batched kernels. Applications in weather prediction, seismic imaging, and material science are employed to assess the trade-off between numerical accuracy and parallel performance of these fast matrix computations compared to more traditional approaches.
Attendees can learn how the behavior of the human brain is simulated using current computers, and about the different challenges the implementation has to deal with. We cover the main steps of the simulation and the methodologies behind it. In particular, we focus on the transformations and optimizations carried out to achieve good performance on NVIDIA GPUs.
The talk will cover two related topics: first, how AI is disrupting the creative industries in which Happy Finish works, and second, a specific project example in which Happy Finish created a hero campaign image for the Unilever Baby Dove brand using a Generative Adversarial Network, gaining widespread media attention. Happy Finish is working on new solutions relating to the machine generation of content, using AI to create campaign content for their clients. Marco and Daniel will outline their vision of how machines can collaborate with humans in the creative process.
The goal of the session is to dive deep into the key technical building blocks of interactive Computer Aided Engineering (CAE) and to understand, along specific prototypes, how GPU computing will impact it. Considering the example of interactive design assistants, we will explain the ingredients of future GPU-based simulation codes: (i) multi-level voxel geometry representation from integration to finite elements, (ii) indirect (weak) realization of boundary conditions, (iii) (non-linear) geometric multi-grid methods. By streamlining all algorithms with respect to the GPU, state-of-the-art industrial solutions are outperformed by orders of magnitude in computational efficiency while conserving accuracy. This is shown along a few prototypes towards the vision of a virtual maker space.
This session will give the audience a quick overview of recent developments in the field of 3D surface analysis with deep learning techniques and an insight into our approach for 3D surface repair. In recent years, deep learning methods have proven able to tackle many vision-related problems with astonishing success. Compared to the application of deep learning to image processing, applications for geometry processing in 3D are still rare. The main reason for this is the lack of a suitable 3D representation. We present a method for 3D surface analysis in which we use different data representations and machine learning methods to repair defective or damaged 3D surfaces. After this session, you should have an idea of how to approach 3D-related problems with deep learning.
This session will cover how the PreScan simulation platform can be used to generate virtual sensor data for all sensor technologies relevant to automated driving, such as camera, radar, lidar, ultrasonic, and DSRC. By generating synthetic sensor data as input for deep neural networks, training for driving applications can be automated. We will cover the special requirements that virtual sensor data needs to meet in order to be suitable for training algorithms that will eventually be deployed in the real world. In addition, we will highlight the value of injecting synthetic sensor data directly into platforms such as the NVIDIA DRIVE PX 2 for virtual validation of automated driving applications by means of Hardware-in-the-Loop (HiL) simulation.
Learn a simple strategy guideline to optimize application runtime. The strategy is based on four steps and illustrated on a two-dimensional Discontinuous Galerkin solver for computational fluid dynamics on structured meshes. Starting from a sequential CPU code, we guide the audience through the different steps that allowed us to speed up the code on a GPU by a factor of around 149 relative to the original runtime (performance evaluated on a K20Xm). The same optimization strategy applied to the CPU code yields a speedup of around 35 over the original runtime (performance evaluated on an E5-1650v3 processor). Based on this methodology, we finally end up with an optimized, unified version of the code that can run simultaneously on both GPU and CPU architectures.
Learn how to use configurable radix trees to perform fast parallel lookup operations on the GPU. In this session, you will find out how to use our library to create a radix tree specifically tailored to your needs. You will also see examples of how to squeeze the most performance out of the presented data structures in real-world applications. We will also show our results on using deep learning as a tool to customize data structures.
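As a rough CPU-side illustration of what "configurable" can mean for a radix tree, the sketch below builds a fixed-depth tree over unsigned integer keys in which the number of key bits consumed per level is a parameter. The class and its parameters are hypothetical and are not the API of the library presented in the session; a GPU version would typically flatten the nodes into arrays and run one lookup per thread.

```python
class RadixTree:
    """Fixed-depth radix tree over unsigned integer keys (illustrative only)."""

    def __init__(self, key_bits=16, bits_per_level=4):
        if key_bits % bits_per_level != 0:
            raise ValueError("key_bits must be a multiple of bits_per_level")
        self.key_bits = key_bits
        self.bits_per_level = bits_per_level
        self.levels = key_bits // bits_per_level   # depth of the tree
        self.fanout = 1 << bits_per_level          # children per node
        self.root = [None] * self.fanout

    def _digits(self, key):
        """Split the key into per-level digits, most significant first."""
        if not 0 <= key < (1 << self.key_bits):
            raise ValueError("key out of range")
        mask = self.fanout - 1
        return [(key >> (level * self.bits_per_level)) & mask
                for level in range(self.levels - 1, -1, -1)]

    def insert(self, key, value):
        digits = self._digits(key)
        node = self.root
        for d in digits[:-1]:
            if node[d] is None:
                node[d] = [None] * self.fanout     # allocate child on demand
            node = node[d]
        node[digits[-1]] = value                   # leaf slot holds the value

    def lookup(self, key):
        node = self.root
        for d in self._digits(key):
            node = node[d]
            if node is None:
                return None                        # path does not exist
        return node

tree = RadixTree(key_bits=16, bits_per_level=4)
tree.insert(0x1234, "a")
tree.insert(0x1235, "b")
batch = [tree.lookup(k) for k in (0x1234, 0x1235, 0xFFFF)]
```

Choosing a larger `bits_per_level` gives a shallower tree with fewer memory accesses per lookup at the cost of wider, sparser nodes, which is the kind of trade-off a tailored configuration would tune.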
Online shopping is nothing if not efficient. Walmart, together with New Jersey startup Jet, takes things a step further, using AI and deep learning to optimize its entire e-commerce business. The first AI application we discuss is Jet’s unique smart merchant selection: the platform finds the best merchant and warehouse combination in real time so that the total order cost is as low as possible. Then we show how to efficiently pack fresh and frozen orders with deep reinforcement learning. The value of this approach is not just to find the best boxes and the tightest packing, but also the least amount of coolant and its placement, so that the temperature of all items stays within the required limits during shipment.
Calculation of surface normals can be crucial to the process of extracting useful information from point clouds. Surface normals give an estimate of the objects in the scene, which might be of importance for more complex algorithms like feature extraction using machine learning techniques. In this poster, we present our implementation of normal estimation on a GPU and a CPU and show results for both platforms. Through our implementation, we show that a GPU implementation on a Quadro M4000 graphics card can be an order of magnitude or more faster than a CPU implementation on a rather modest desktop Xeon workstation. To substantiate our finding, we also share profiling information and plots of the distribution of errors in our approach.
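The poster abstract does not say which estimator is used. One common approach, assumed here purely for illustration, fits a plane to a point's neighbourhood and takes the direction of least variance as the normal. The sketch below does this in plain Python by running power iteration on the matrix trace(C)·I − C, whose dominant eigenvector is the eigenvector of the covariance C with the smallest eigenvalue. The function name and the sample neighbourhood are hypothetical.

```python
import math

def estimate_normal(points, iterations=100):
    """Estimate the unit normal of a roughly planar 3D neighbourhood."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]

    # 3x3 covariance matrix of the centred points.
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - centroid[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n

    # trace(C) * I - C shares eigenvectors with C; its largest eigenvalue
    # corresponds to the smallest eigenvalue of C (the normal direction).
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    shifted = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)]
               for i in range(3)]

    # Power iteration from a generic start vector (assumed not orthogonal
    # to the true normal).
    s = 1.0 / math.sqrt(3.0)
    v = [s, s, s]
    for _ in range(iterations):
        w = [sum(shifted[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break                      # degenerate input: all points coincide
        v = [x / norm for x in w]
    return v

# Hypothetical neighbourhood lying in the plane z = 0.
neighbourhood = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 1, 0)]
normal = estimate_normal(neighbourhood)
```

Because each point's neighbourhood is processed independently, this per-point computation is the kind of workload that maps naturally onto one GPU thread per point. The sign of the returned normal is arbitrary and would normally be fixed by orienting it towards the sensor.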
Johann Jungwirth (JJ) provides his insights about the digital transformation of the automotive industry. He describes how automotive companies transform from hardware companies to hardware, software, and services companies. Furthermore, he highlights the increasing capabilities of AI technologies and how they will lead to the reinvention of the automotive industry, e.g. through the realization of self-driving-cars and customer-centered mobility services.
This session will describe how Supercomputing Systems AG addresses the key challenges when optimizing software runtime for self-driving vehicles. Our automotive Tier1 customer has developed the algorithms in C++ on a standard PC. The goal was to build a prototype on the PX2 Platform. In a first step, we profiled the algorithms on PC and PX2 and then started our optimizations on ARM and DENVER cores. As a further optimization, we will offload one of the algorithms to GPU.
Patrick van der Smagt will introduce the open-source AI research model at the Volkswagen Data Lab in Munich. Then he will present the winners of the 2017 VW-NVIDIA Deep Learning and Robotics Challenge. The winning team of this challenge, which best solved a robotics challenge using deep neural networks, will be able to present their solution. Finally, the winners of Jugend Innovativ, Austria's national science competition, will be presented.
Accurate simulation of unsteady turbulent flow is critical for improved design of greener aircraft that are more fuel-efficient. We will demonstrate the application of PyFR to petascale simulation of such flows. The rationale behind algorithmic choices, which offer increased levels of accuracy and enable sustained computation at up to 58% of peak DP-FLOPs on unstructured grids, will be discussed. A range of software innovations will also be detailed, including the use of runtime code generation, which enables PyFR to efficiently target multiple platforms via a single implementation. Finally, results will be presented from full-scale simulations of flow over low-pressure turbine blades, along with scaling results and performance data demonstrating sustained computation at up to 13.7 DP-PFLOPs.
A description of the process of making a tangible industrial product from lab-developed algorithms. This talk is based on a real-world example and explains the common pitfalls and caveats, the process used, and the strategies for fitting large GPU algorithms onto an embedded GPU while maintaining acceptable performance and latency.
There has been an explosion of research on camera-based lane detection using deep learning approaches. We would like to present our lane detection model, which uses a blend of conventional and deep learning-based approaches. It has been tested in a real-life automotive use case, and we would like to provide snapshots of the results so far.
Come and learn how the grand challenge of controlling adaptive optics systems on future Extremely Large Telescopes is being solved using GPUs. As part of Green Flash, an international EU funded joint industrial and academic project, our team is developing solutions based on GPUs for the real-time control of large optical systems operating under tough operating environments. This includes the hard real-time data pipeline, the soft real-time supervisor module as well as a real-time capable numerical simulation to test and verify the proposed solutions. We will discuss how the unprecedented memory bandwidth provided by HBM2 on the new Pascal architecture is changing the game in dimensioning these complex real-time computers crunching up to 200 Gb/s of noisy data.
Starting from 2020, during the Large Hadron Collider Runs 3 and 4, the increased accelerator luminosity with the consequently increased number of simultaneous proton-proton collisions (pile-up) will pose significant new challenges for the CMS experiment. A many-threads-per-event approach would scale with the pileup, by offloading the combinatorics to the number of threads available on the GPU. This would allow a faster execution of track reconstruction and physics selection algorithms in charge of accepting the events with the most interesting physics content.
An overwhelming amount of experimental evidence suggests that elucidations of protein function, interactions, and pathology are incomplete without inclusion of intrinsic protein disorder and structural dynamics. Thus, to expand our understanding of intrinsic protein disorder and provide machine learning and AI solutions for biotechnology, we have created a database of secondary structure propensities for proteins (dSPP) as a reference resource for experimental research and computational biophysics. The Database of Structural Propensities of Proteins (dSPP) is the world’s first interactive repository of structural and dynamic features of proteins with seamless integration for leading machine learning frameworks, Keras and TensorFlow.
Learn how the #1 aviation equipment supplier provides its engineers with a solution that allows them to work wherever they need to and collaborate in real time. The company implemented a centralized and virtualized environment to deliver a secure, flexible, and scalable VDI solution that gives engineers around the world access to tools like Solidworks, SmarTeams, and Catia with a better-than-local user experience.
We'll introduce a novel approach to digital pathology analytics, which brings together a powerful image server and deep learning-based image analysis on a cloud platform. Recent advances in artificial intelligence (AI), and deep learning in particular, show great promise in several fields of medicine, including pathology. Human expert judgment, augmented by deep learning algorithms, has the potential to speed up the diagnostic process and to make diagnostic assessments more reproducible. We will present examples of context-intelligent image analysis applications, including a fully automated epithelial cell proliferation assay and tumor grading. We will also present other examples of complex image analysis algorithms, which all run on demand on our WebMicroscope® Cloud environment.
The WCHG and BDI at the University of Oxford have an established research computing platform for genomics, statistical genetics, and structural biology research. I will outline how we are developing this platform to include a significant GPU infrastructure to support our researchers' great wave of enthusiasm for exploring the potential of deep learning and AI for life sciences research. We are deploying a mixture of GPU architectures and deep learning AI frameworks, and I will report on our current plans and the initial areas of research in the life sciences that show promise for AI.
Learn how GPU-based Computational Fluid Dynamics (CFD) paves the way for affordable high-fidelity simulations of automotive aerodynamics. Highly resolved, transient CFD simulations based on pure CPU systems are computationally expensive and constrained by available computational resources. This has posed a big challenge for automotive OEMs in their aerodynamic design process over many years. To overcome this problem, we present ultraFluidX, a novel CFD solver that was specifically designed to leverage the massively parallel architecture of GPUs. With its multi-GPU implementation based on CUDA-aware MPI, the tool can achieve turnaround times of just a few hours for simulations of fully detailed production-level passenger and heavy-duty vehicles, a breakthrough for simulation-based design.
Learn how one of the leading institutes for global weather predictions, the European Centre for Medium-Range Weather Forecasts (ECMWF), is preparing for exascale supercomputing and the efficient use of future HPC hardware. I will name the main reasons why it is difficult to design efficient weather and climate models and provide an overview of the ongoing community effort to achieve the best possible model performance on existing and future HPC architectures. I will present the EU H2020 projects ESCAPE and ESiWACE and discuss recent approaches to increase computing performance in weather and climate modelling, such as the use of reduced numerical precision and deep learning.
We present our findings on using the NVIDIA OptiX framework to simulate the scattering of electrons as encountered in scanning electron microscope environments. In particular, we discuss how we implemented volume scattering and coplanar material transition boundaries with varying material properties within the framework. The results have been verified against established CPU-based simulation packages. While achieving comparable accuracy, significant speedups are realized.
A promising, exciting journey to enterprise 3D virtualisation. A long-awaited answer to the question: Is it possible to run your high-end graphics engineering applications, such as CATIA and NX, in your enterprise data center and cloud? Can virtual desktops provide the required performance, just like a powerful workstation? Delivering high-performance remote workstations used to be technically inadequate, complicated and costly, but now, with NVIDIA GRID technology, the answer is absolutely YES! Join our session and hear about our exciting journey to 3D desktop virtualisation. Learn the details of our transition story to a successful extended live system; examine the value provided and the technical considerations necessary to properly enable high-performance 3D desktops.
We present an approach that uses real-time path tracing in combination with traditional deferred techniques. This method allows us to use most elements of a traditional rendering pipeline (like direct light and post effects) and keep the BVH ray traversal usage at a minimum. In combination with adaptive filtering, GPU data streaming and mesh preprocessing, this technique allows for real-time frame rates up to Virtual Reality usage on a single GPU. The robust implementation is used for architectural visualization but can also be used in games and other areas with a wide range of direct and indirect lighting phenomena. Finally, we compare our results with our offline path tracer implementation.
Proving that such a complex system as an autonomous car is safe cannot be done using existing standards. A new method needs to be invented that is much more data driven and probability based. Traditional redundant solutions don't apply when trying to optimize a Precision-Recall curve. Getting acceptance from the regulatory bodies and the public will be much easier if the industry converges on what this new method shall be.
We'll discuss the challenges in creating maps from hundreds of millions of street-level images and the solutions we have developed using deep learning and computer vision techniques. Map creation is one of the essential problems for autonomous driving. The two essential components for map creation are object recognition and 3D reconstruction. We'll discuss how we increase our object recognition capacity by combining the deep learning techniques we've developed and the Mapillary Vistas Dataset (the world's largest street-level dataset with instance-aware segmentation). We'll also look into the challenges in large-scale 3D reconstruction, including scalability, semantic understanding integration, and camera self-calibration. Finally, we will demonstrate the map data that we generate.
As AI makes the connected world of cars, robots and buildings more and more intelligent, it is becoming increasingly important for these intelligent entities to interact with humans in natural ways. Human beings are highly visual creatures for whom 80% of communication is non-verbal. In this talk, we present a computer vision AI technology that allows humans to interact naturally with intelligent machines by giving the machines the ability to see and understand human intent. We provide a brief overview of how to apply GPU-based deep learning techniques to extract 3D human motion data capture from standard 2D RGB video. We'll describe in detail the stages of our NVIDIA® CUDA®-based pipeline, from training on DGX-1s to edge-based deployment on Jetson TX2s.
Learn how our world can be understood better and faster by our robotic companions thanks to embedded GPUs. Our current advances in embedding the Jetson TX2 into SoftBank's Pepper, the world's leading affordable humanoid robot, show that along with the newly gained autonomy, confidentiality issues are also addressed thanks to embedded processing power. As no personal data needs to be processed in the cloud, your home privacy is maintained. The enhanced solution that we will present helps the robot to navigate better and recognize more objects more quickly, and allows it to interact fluidly with humans. A live demo of the autonomous Pepper robot embedding the Jetson TX2, navigating and interacting on stage, will be given during the session.
Explore how researchers are using deep learning to uncover stories hidden in photos while connecting them to audiences. With the rise of mobile cameras, the process of capturing good photos has been democratized - and this overload of content has created a challenge in search. One of the important aspects of photography is that every image communicates with a different audience in a different form. The goal of good search and discovery is to connect a target audience with stories that resonate with them. Deep learning models encode rich representations of photographs. In this talk I will explain how our researchers use various combinations of machine learning-driven techniques that help understand various subtleties in photographic data, and match them with a target audience.
Perception is a key component of self-driving technology. The better a car understands its environment, the more reliable the decision-making car control system will be. This session will provide an overview of the key sensor processing and fusion algorithms required for autonomous vehicles, and discuss the challenges on the way towards Level 5 self-driving. Specifically, we address the challenge of processing huge amounts of data from all vehicle sensors in real-time, while achieving good results on public and private benchmarks.
In a post-robot-revolution world we imagine a shiny army of willing workers and boundless leisure time with free dental for all - in reality, getting a robot to do simple household tasks is still beyond us. Why are humanoid robots so hard to build and control, and what can be done to take full advantage of the wealth of new AI tools available? We view the robot as an advanced power tool that can perform a wide range of tasks using only human tools and adaptable software, to the extent that from the user's point of view the task could be accomplished entirely by the robot; a robot is not an appliance. Such a machine is necessarily complicated, and the great promise of GPUs and DL is that we can finally control robots that approach the mechanical capabilities of the human body.
Park Smart is a solution that guides drivers to free parking spaces and helps parking owners and managers to improve their business. We exploit the paradigm of Edge Computing, moving the computational load from servers in the Cloud to embedded devices located on site. Such a solution dramatically reduces the bandwidth consumption, by ~95%. We fine-tune a pre-trained CNN model that classifies empty vs. non-empty parking lots using the NVIDIA Jetson inside our AISee box, and then we stream the result to the Cloud as a JSON file. A DL pipeline gives us a more robust classification than classical CV techniques. We will present our end-to-end architecture together with the results of the benchmark tests on fine-tuning and classification on TXn.
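As an illustration of the edge-computing idea in the Park Smart abstract, the following is a minimal sketch of packing per-space classification results into a JSON message and estimating the bandwidth saved relative to sending a raw frame. The function names, the message fields and the frame size are hypothetical and are not taken from the AISee box; the ~95% figure in the abstract is the authors' own measurement and is not reproduced here.

```python
import json


def build_payload(camera_id, occupied_flags, timestamp):
    """Pack per-space occupancy flags into a compact JSON message.

    The field names are illustrative assumptions, not the real schema.
    """
    return json.dumps({
        "camera_id": camera_id,
        "timestamp": timestamp,
        "spaces": [
            {"id": i, "occupied": bool(flag)}
            for i, flag in enumerate(occupied_flags)
        ],
        "free": sum(1 for flag in occupied_flags if not flag),
    })


def bandwidth_saving(raw_bytes, payload_bytes):
    """Fraction of bandwidth saved by sending the payload instead of raw data."""
    return 1.0 - payload_bytes / raw_bytes


if __name__ == "__main__":
    payload = build_payload("cam-1", [True, False, False], timestamp=0)
    # Hypothetical size of one compressed camera frame, in bytes.
    frame_bytes = 100_000
    print(payload)
    print(bandwidth_saving(frame_bytes, len(payload.encode("utf-8"))))
```

The saving depends entirely on the assumed frame size and frame rate, so the printed value only shows how such a comparison could be computed.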
Color the invisible: This is the potential of hyperspectral imaging. We will dive into the key concepts of molecular spectroscopy and uncover the hidden treasures this amazing new technology offers to industry. Real-world applications will be shown in several showcases from different sectors, ranging from recycling to pharmaceuticals. Solving industrial applications is not always easy, especially with untrained personnel. To overcome these problems, we outline a two-tier approach with a learning phase and an inference phase. The learning phase focuses on the user and reduces the complexity of the learning process and the physics. As the inference phase is solely oriented towards validity and performance, we will illustrate how and why the Tegra excels in the presented applications.
In this session you will discover how GPU processing is used for real-time plant classification and recognition on a field robot performing autonomous weeding. We will talk about the use of embedded GPUs to perform fast image pre-processing and classification using deep learning, in comparison with other methods. A major focus will be on the computation time and power consumption constraints of this real-world application. In addition, we will present the advantages of our completely autonomous robot in terms of cost/labour efficiency and environmental impact.
We present a novel unsupervised method for face identity learning from video sequences. The method exploits Convolutional Neural Networks for face detection and face description together with a smart learning mechanism that exploits the temporal coherence of visual data in video streams. We introduce a novel feature matching solution based on Reverse Nearest Neighbour and a feature forgetting strategy that supports incremental learning with memory size control as time progresses. It is shown that the proposed learning procedure is asymptotically stable and can be effectively applied to relevant applications like multiple face tracking and online open-world face recognition from video streams. The whole system, including the smart incremental learning mechanism, takes advantage of the GPU.
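To make the two ideas named in this abstract concrete, here is a minimal sketch of reverse nearest neighbour matching and of a forgetting step that caps memory size. It is an illustrative assumption, not the authors' method: descriptors are plain tuples, distance is Euclidean, and the forgetting rule simply drops the descriptors that have gone longest without a match. The function names and the `ages` bookkeeping are hypothetical.

```python
import math


def reverse_nearest_neighbours(memory, observed):
    """Map each observed descriptor index to the memory indices that
    have it as their nearest observed descriptor (Euclidean distance)."""
    matches = {j: [] for j in range(len(observed))}
    if not observed:
        return matches
    for i, m in enumerate(memory):
        nearest = min(
            range(len(observed)),
            key=lambda j: math.dist(m, observed[j]),
        )
        matches[nearest].append(i)
    return matches


def forget_oldest(memory, ages, max_size):
    """Keep the max_size descriptors with the smallest age.

    ages[i] is assumed to count frames since memory[i] was last matched.
    """
    if len(memory) <= max_size:
        return memory, ages
    keep = sorted(range(len(memory)), key=lambda i: ages[i])[:max_size]
    keep.sort()  # preserve the original ordering of the survivors
    return [memory[i] for i in keep], [ages[i] for i in keep]


if __name__ == "__main__":
    memory = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
    observed = [(0.1, 0.0), (9.0, 9.0)]
    print(reverse_nearest_neighbours(memory, observed))
    print(forget_oldest(memory, [5, 1, 3], max_size=2))
```

The search direction is the point of the sketch: each stored descriptor looks for its nearest observation, so one observation can collect several stored descriptors, while the forgetting step bounds how many descriptors are stored.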
This session gives an insight into sensor-based environment modeling for autonomous driving at Continental. We show how conventional computer vision algorithms can be combined with Artificial Intelligence to make a large step towards fully autonomous driving. The presentation highlights our latest achievements in challenges like lane marking perception, construction site detection, road topology prediction, and camera-based verification of radar objects. We demonstrate how the online environment model, which serves as an input for the longitudinal and lateral control of self-driving vehicles, is computed on the NVIDIA DRIVE PX 2.
Computer Vision with CNNs performs well for people detection. This is not enough. A step forward can be taken to understand the aspect of people detected in low resolution, or corrupted by occlusions in the crowd; to track them in the wild; to detect saliency and pay attention to details only; to forecast motion and human actions. The next solutions will be provided by new neural architectures based on autoencoders and recurrent architectures, such as Generative Adversarial Networks and Long Short-Term Memories. The session will present how they work, how they can be implemented on GPUs and how they are used in real applications, such as in AI cities from static and moving cameras and in collaborative environments.
The talk will describe the solutions we devised to provide our prototype humanoid service robot, R1, with a visual object detection system which can be trained by people in an interactive, natural way. We previously integrated Deep Learning methods on our robot, equipped with an NVIDIA Jetson TX2. The resulting recognition system could be effectively trained on the fly on board the robot and was presented at the last GTC. In this talk we will discuss how we are extending it to localize and recognize multiple objects in the image, towards a system which can continuously learn to detect new objects presented by users. We will address problems such as ground-truth acquisition, training/inference time, and prediction robustness. We will present the system deployed on R1 and discuss open challenges.
A fundamental component required for safe autonomous driving is highly accurate maps, which contain semantic information regarding the position and content of traffic signs, lane markings, and other road features. At Mapscape, we rely heavily on deep learning to extract information from images to aid with the creation of these maps. In this presentation, we explore two parts of our process: an object detection pipeline running onboard NVIDIA Jetson devices through TensorRT, capable of distinguishing among 167 different traffic signs in real time, and a semantic segmentation pipeline capable of extracting up to 45 different road-level features such as lanes, arrows, and other road surface signs.
This session will show how we combine high-performance GPU processing with Deep Learning (DL). We use automated tomographic imaging microscopes for various studies in physics and biology. These systems have a raw data flow of up to 2GB/s, making real-time (RT) data processing mandatory. To make the system more intelligent, an advanced processing pipeline must be incorporated. So far, DL inference speed doesn't allow us to apply it to all the data. To address the problem, we are designing a hybrid system that allows DL usage for high-throughput microscopy in RT. Concepts and approaches that we use to design the system will be illustrated with examples from high energy physics and biology.
In this talk, I will highlight the main research challenges facing the field of activity detection in untrimmed videos, as well as deep learning-based methods developed at KAUST to address them. Massive amounts of video data need to be processed for relevant semantic information that predominantly focuses on human activities (i.e. single human, human-to-human, and human-to-object interactions). While this problem is encountered in many real-world applications (e.g. video surveillance, large-scale video summarization, and ad placement in video platforms), automated vision solutions have been hindered by several challenges, including the lack of large-scale datasets for learning and the need for real-time processing. I will highlight how deep learning can be used to tackle these challenges.
The goal of this session is to illustrate how the world of ophthalmology is changing thanks to AI and deep learning. We will cover the basic aspects of eye diseases and how you can apply GPU-accelerated machine learning to improve imaging and analysis of the eye. We present solutions to get ready for this transition by covering the latest innovations in the field of deep learning in ophthalmology, reaching for the first time human-level performance in the detection, classification and monitoring of eye disease. Furthermore, we will unravel how advances in GPU technology enable improved imaging quality for portable, low-cost Optical Coherence Tomography (3D) and Fundus (2D) imaging, the leading modalities of reference in eye care.
Big data analytics methods for the large-scale analysis of imaging, genetic, laboratory, and clinical data have great potential to improve our understanding of disease, and to improve disease diagnosis and prognosis. Both classical machine learning (e.g. radiomics, multi-feature classification) and deep learning methods are currently used in these domains. In this talk, I will present the results and challenges for both approaches to make an impact in the context of a number of applications. Specifically, we will discuss early and differential diagnosis and improved prognosis of dementia, and improved neuro tumor characterization and treatment response prediction.
This talk will focus on the use of deep learning techniques for the discovery and quantification of clinically useful information from medical images. The talk will describe how deep learning can be used for the reconstruction of medical images from undersampled data, image super-resolution, image segmentation and image classification. We will also show the clinical utility of applications of deep learning for the interpretation of medical images in applications such as brain tumour segmentation, cardiac image analysis and applications in neonatal and fetal imaging.
Government agencies and commercial companies today show high demand for versatile, stable and highly efficient person identification solutions supporting cross-domain face recognition and person database clusterization in both controlled and uncontrolled scenarios. It is now possible to successfully resolve the cross-domain face recognition challenge using deep learning, and even tasks of quadratic complexity, using GPU-powered inference of CNN-based face recognition algorithms. We'll focus on (I) the concept of the GPU-powered platform for cross-domain face recognition; (II) its essential performance and critical technical characteristics; (III) reaching the required accuracy and performance by using NVIDIA GPUs; (IV) examples of completed and ongoing face recognition projects.
In this talk, we will show how an industrial robot learns to grasp objects in a self-supervised manner. Starting with random grasps with a success rate of just a few percent, the robot improves its grasping success to well over 90% in a few dozen hours. In the short term, this could improve applications like bin picking and warehouse automation, while in the long term a general grasping controller could be built with the presented techniques. Already, the self-learned grasping system can handle new, unseen objects and environments with remarkable accuracy. The algorithm runs with high performance on consumer GPUs, paving the way to embedded implementations on Nvidia Jetson boards for applications.
In this talk, I will present a novel framework for deep learning with 3D data called OctNet which enables 3D CNNs on high-dimensional inputs. I will demonstrate the utility of the OctNet representation on several 3D tasks including classification, orientation estimation and point cloud labeling. In the second part of my talk, I will present an extension of OctNet called OctNetFusion which jointly predicts the space partitioning function with the output representation, resulting in an end-to-end trainable model for volumetric 3D reconstruction at resolutions up to 512 x 512 x 512.
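As a rough illustration of why space partitioning matters at the resolutions mentioned in the OctNet abstract, the following sketch computes the memory a dense voxel grid would need. The single channel and 4-byte values are assumptions chosen for illustration; nothing here reflects how OctNet itself stores data.

```python
def dense_grid_bytes(resolution, channels=1, bytes_per_value=4):
    """Memory needed by a dense cubic voxel grid, in bytes."""
    return resolution ** 3 * channels * bytes_per_value


if __name__ == "__main__":
    for resolution in (64, 128, 256, 512):
        mib = dense_grid_bytes(resolution) / 2 ** 20
        print(f"{resolution}^3 voxels: {mib:.0f} MiB per channel")
```

Memory grows with the cube of the resolution, so each doubling costs eight times as much, which is the motivation for representations that spend storage only where the 3D data has detail.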
Benefit from our experiences applying Nvidia solutions including Jetson to implement high performance deep learning and vision systems, from training on high powered workstations to deployment on embedded systems. Low power, high performance, AI friendly processing allows us to provide the performance to enable us to exploit approaches and algorithms developed for space in terrestrial spin-off applications. We will use examples and demonstrations from our Mars rover development systems to show how we were easily able to leverage GPUs to advance our R&D work on autonomous science into practical terrestrial applications for the automated inspection of the built environment.
In this session, you shall be introduced to a new framework for scientific computing, mainly aimed at deep learning workloads. The framework consists of an ndarray library that natively supports GPU execution, an automatic differentiation engine that is flexible and fast, and an optimization package for gradient-based optimization methods. We shall discuss practical workflows and our features on top of Python multiprocessing for efficient parallel data loaders, and finally we shall briefly look at our upcoming just-in-time Tensor compiler to fuse computations and execute them more efficiently.
In Deep Learning, Inference is where neural networks deliver insights. What started with images is quickly expanding to include speech, NLP and video. As data sets get bigger, networks get deeper and more complex, and latency requirements get tighter, GPUs are the ideal platform to accelerate these workloads, both for high-batch and low-latency use cases. In this talk, you'll learn how inference gets done on GPUs, and get the latest on TensorRT 3.0, the latest version of NVIDIA's inference engine.
Starship Technologies is developing the future of delivery: self-driving robots that will bring goods to people all around the world. We have entered commercial pilots with hundreds of robots driving along the sidewalks across the United States and four countries in Europe. Soon there will be thousands and millions of robots around the world. The talk gives an overview of the story, business, and technology of Starship Technologies.
We will look at 3 main areas of the automotive product life cycle, each with an expert panelist: design, engineering and marketing. For design, we will see how short iteration cycles and rich experiences can enable the next level of design decisions. For engineering, we will look at how a game engine can fast-track on-vehicle system development. And finally, marketing will show how game engines can drive consumer engagement. Panelists are: Árpád Takács (AImotive, AI Researcher and Outreach Scientist), Kian Saemian (Mackevision, Senior Business Development Manager) and Daniel Motus (BMW, Head of Cubing, Virtual Reality Interior).
Augmented Reality has been around for a while now, with plenty of major investments made to develop the technology. Yet years have passed and we still don't see AR in our daily business lives despite such obvious benefits. At best we are only seeing sandbox projects for consumer marketing and attention-grabbing stunts. Is AR tech not good enough for industrial applications, or is it simply too expensive? More importantly, what do AR solution providers collectively need to do to open up their market for the masses?
Presenting the concept of a collaborative, immersive and user-centric virtual engineering space, we discuss the challenges in designing the engineering workspace of the future, leveraging VR/AR solutions provided by the developing VR/AR market through on-premise and public cloud infrastructure. We'll present the challenges of delivering immersive content to arbitrary devices, user-centric design, modalities and pipeline implementation for collaborative virtual spaces to support both local and remote users. An introduction of the technical foundations will cover solution architectures extending to specific rendering and content streaming techniques, virtualization and the VR Datacentre all combined to enable high definition, interactive content in shared, rich immersive experiences.
WebVR enabled streaming
Live demo set-up at GTC Europe
In this session, NVIDIA GRID Community Advisors Ruben Spruijt and Benny Tritsch present their latest findings on benchmarking user experience performance in GRID-accelerated environments hosted on-premises and in the cloud. Get in-depth information on the latest versions of Citrix XenApp/XenDesktop, VMware Horizon and Microsoft RDS when accelerated by NVIDIA GPUs. What is the performance impact caused by remoting protocol settings, latency and common WAN scenarios? Hundreds of recorded screen videos and telemetry data sets, combined with a unique visualisation tool, allow Ruben and Benny to analyse and compare the performance of selected GRID-accelerated remote desktops and VDI scenarios, live on stage.
In this session, we will describe the successful proof of concept in which a radiology desktop computer was replaced by a thin client and a VDI desktop, hosted in the hospital's data center. We will also show you what the challenges were and what the solution has brought the hospital in terms of advantages in comparison with the old situation. The case is very interesting for anyone wanting to learn more about the possibilities of virtualised graphics.
During this session, you will learn how AWL Techniek, a globally operating specialist in designing and building state-of-the-art automated welding robots, rapidly increased its market share in the world's best fully automated welding robots. These state-of-the-art robots are used in the automotive industry all over the world. This rapid growth was made possible by the use of NVIDIA GRID technology combined with Citrix XenDesktop virtualization techniques. With these solutions AWL Techniek is able to shorten its time-to-market, both with its robots and with new branch locations all over the globe.
Come join us and learn how to build data-centric GPU clusters for artificial intelligence. We will briefly present the state-of-the-art techniques for distributed Machine Learning, and the special requirements they impose on the GPU cluster. Additionally, we will present an overview of interconnect technologies used to scale and accelerate distributed Machine Learning. During the session we will cover RDMA, NVIDIA's GPUDirect RDMA and GPUDirect Async, as well as in-network computing, and how the use of those technologies enables a new level of scalability and performance in large-scale deployments in artificial intelligence and high performance computing.
The transformation towards value-based healthcare needs inventive ways to lower cost and improve outcomes. Artificial Intelligence is key to realizing value-based care. Turning medical images into biomarkers helps to increase the effectiveness of care through quantitative imaging.
Learn why EVERY remote user should have GPU resources available to them. We'll discuss the advantages end users experience once their virtual desktops/sessions have GPU capabilities. Recent data from the NVIDIA GRID Performance Engineering team shows the significant impact that GPUs like the Tesla M10 have on knowledge workers. The data includes real user testing and scientific data like end-user latency, remoted frames, bandwidth, and CPU utilization, which all play a significant role in the overall user experience.
In this session we will discuss with our partners INS GmbH and Planet GmbH the advantages of IBM POWER systems in combination with NVIDIA technology as an ideal choice for a performant, reliable and highly scalable AI infrastructure. By combining the platform for deep learning with IBM POWER AI software, enterprises can rapidly deploy a fully optimized and supported platform for machine learning with blazing performance. The PowerAI platform includes the most popular machine learning frameworks and their dependencies, and it is built for easy and rapid deployment.
This talk gives an overview of Deep Learning and the overall field of AI, with a focus on automotive applications beyond autonomous driving. Deep Learning solutions, as the driving force of AI, enable the transition of Volkswagen towards a data-driven company - this talk gives insights into how this transition has already been implemented.
This session discusses tree-structured neural networks for planning and reasoning and the computational challenges that arise for this promising class of models. Despite the tremendous success of deep learning in recent years, we lack neural networks that facilitate explicit planning and reasoning. Automated planning and reasoning encompasses tree structures over possible future states and provides elegant ways for combining neural and symbolic computation, as well as model-free and model-based reinforcement learning. While these models are promising extensions to more established architectures, tree structures pose unique challenges to GPU computation. I will talk about our efforts on addressing these challenges for reasoning in knowledge bases and model-based deep reinforcement learning.
Using the Nigerian experience as a case study, we explore the adoption of virtual reality in a frontier market. In 2016, Imisi 3D, a virtual reality creation lab, was set up in Lagos with the dual purpose of growing a community of content creators for the extended reality technologies and driving engagement and adoption of these technologies. In a country known for consumption when it comes to technology, this was a dedicated effort to change the narrative to creation while positioning these technologies as tools for creating solutions. Imisi 3D went on to host Africa's first VR hackathon and started local AR/VR meetups and an online community. Ultimately we will reflect on lessons learned from the journey so far, and consider what the future holds for the continent and the rest of the world.
Modern computing hardware and NVIDIA Jetson TX1 / TX2 performance create new possibilities for drones and enable autonomous AI systems, where image processing can be done on-board during flight or near the camera. We'll present how the PIXEVIA system covers vision processing and AI tasks for drones, e.g., image stabilization, position estimation, object detection, tracking, and classification using deep neural networks, and self-evolvement after deployment. We'll describe how we use the software frameworks Caffe/TensorFlow with cuDNN, VisionWorks, and NVIDIA CUDA to achieve real-time vision processing and object recognition. Real-world use cases with drone manufacturers Aerialtronics and Squadrons Systems, and with smart city applications in Vilnius and Tallinn, will be presented during this talk.
We will show a complete system where a mobile robot learns to locate and retrieve objects using reinforcement learning and the DIANNE deep learning framework. DIANNE is used to train models on high-end GPU systems in the cloud with simulated data. The trained network is then transferred to the robot equipped with a Jetson TX1 embedded GPU. The Jetson TX1 allows the robot to process real-time information from rich sensors mounted on the robot or deployed in the environment.
PwC Drone Powered Solutions is the first global center of excellence focusing on the fusion of drone technology with other technologies, including but not limited to machine learning, photogrammetry and image data processing. Adam Wiśniewski will show examples of how DPS worked with clients from various industries on testing applications and deploying technologies enabled by drones in their operations. We have developed end-to-end drone-powered solutions for capital projects monitoring, infrastructure maintenance, mining operations supervision, environmental protection, insurance claims assessment and many others.
Developing highly automated driving (HAD) systems requires various different data processing stages. Using a software framework for HAD systems such as EB robinos can provide the basic software modules to streamline this process. It provides building blocks to develop different HAD applications ranging from valet parking to highway driving. Before mass production, the development of HAD systems needs to change from rapid prototyping to an embedded platform. In this presentation we show how we ported our HAD software framework EB robinos to NVIDIA DRIVE PX.
High-performance embedded platforms, like NVIDIA Parker and the upcoming Xavier architecture, have the potential to revolutionize traditional safety-critical domains, where innovative automotive and avionic applications are being proposed to safely replace human activities. However, these domains require that sound guarantees be given not only on the functional correctness but also on the timing delays of the critical activities, according to well-defined safety standards (e.g., ISO 26262). In this talk, we will explain how predictability can be achieved on inherently unpredictable multi-core systems, analyzing the main timing bottlenecks of these challenging platforms, and presenting a holistic framework that aims at overcoming them to achieve predictable performance.
Iko is developing a skin imaging system for clinical use that's faster, cheaper and simpler to use in the field, offering vastly improved accuracy over current imaging methods. Antmicro is developing the Jetson TX-based handheld embedded device to be used in the system, incorporating real-time depth image processing, spatial tracking and 3D visualisation. This session will present the limitations of current skin imaging methods for clinical use, including detecting and tracking potential skin cancer, along with a detailed presentation of the system design and development process. This session is ideal for those interested in the role GPU technology can play in medical imaging applications, and offers a high-level view into the design and development process of such a system.
We present a face recognition system that can recognize multiple persons in parallel, in real time, running on a single Jetson TX2. Due to rapid progress in deep learning, the accuracy of face recognition has recently surpassed human level. GPUs have become the major platform to train and run deep learning models. The speed of NVIDIA GPUs on deep learning tasks is increasing rapidly due to hardware and software optimizations. We present a system that combines the most accurate face detection and recognition models with the fastest software stack. Combined with a 4K camera, the system can recognize over 10 persons in parallel in crowd situations, even from a 10 meter range. The system can be deployed to low-power embedded environments such as drones.
A talk about the joys and hurdles of developing multi-sensory multi-player VR experiences. You-VR develops software and content for narrative and B2B multi-sensory multi-user VR experiences. Within a series of test set-ups, You-VR attempts to reach a perfect equilibrium between personal agency, UX convenience and affordability. The aim is to combine multi-user VR (4 people) with full-body avatars, including hand- and eye-tracking, wirelessly connected to an array of high-end GPUs in virtual machines, all laser tracked on a large-scale playfield.
Development for Virtual Reality is a hunt for each millisecond of latency; it's the one metric that needs to be perfect for an immersive and comfortable experience. Additionally, due to the way VR content is rendered, it takes up to 7x the amount of GPU throughput needed when compared to traditional gaming. In Dominic Eskofier's talk, you'll learn how to maximize both framerate and visual quality of your Virtual Reality app by using sophisticated rendering techniques built into popular game engines, NVIDIA's SDKs and modern GPUs.
Do you need to compute on larger problems, or faster, than a single GPU allows? Then come to this session and learn how to scale your application to multiple GPUs. In this session, you will learn how to use the different available multi-GPU programming models and what their individual advantages are. All programming models will be introduced using the same example, applying a domain decomposition strategy.
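As a rough illustration of the domain decomposition strategy mentioned in the abstract above (not the session's actual example, which is not given here), the following Python sketch splits a 1D Jacobi relaxation into chunks with one halo cell per side. All function names are hypothetical, each chunk merely stands in for the sub-domain a single GPU would own, and the halo exchange is done with plain list copies rather than any real multi-GPU API. It assumes there are at least as many interior points as chunks.

```python
def jacobi_step(u):
    """One Jacobi sweep on a 1D list; the two end values are held fixed."""
    inner = [0.5 * (u[i - 1] + u[i + 1]) for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]


def split_with_halos(u, parts):
    """Split the interior of u into `parts` chunks, each padded with one halo cell per side."""
    n = len(u) - 2  # number of interior points
    bounds = [1 + (n * p) // parts for p in range(parts + 1)]
    return [u[lo - 1:hi + 1] for lo, hi in zip(bounds, bounds[1:])]


def exchange_halos(chunks):
    """Copy each chunk's outermost interior value into its neighbour's halo cell."""
    for left, right in zip(chunks, chunks[1:]):
        left[-1] = right[1]   # right neighbour's first interior point
        right[0] = left[-2]   # left neighbour's last interior point


def solve_decomposed(u, parts, steps):
    """Run `steps` Jacobi sweeps on the decomposed domain and stitch the result."""
    chunks = split_with_halos(u, parts)
    for _ in range(steps):
        # In a multi-GPU setting, each chunk would be updated on its own device.
        chunks = [jacobi_step(c) for c in chunks]
        exchange_halos(chunks)
    out = [chunks[0][0]]
    for c in chunks:
        out.extend(c[1:-1])  # interior points only, halos dropped
    out.append(chunks[-1][-1])
    return out


def solve_single(u, steps):
    """Reference solution on the undivided domain."""
    for _ in range(steps):
        u = jacobi_step(u)
    return u
```

Calling `solve_decomposed(u, 3, 50)` and `solve_single(u, 50)` on the same starting list gives identical results, because every interior point sees the same neighbour values in both versions; the only extra work in the decomposed version is the halo exchange after each sweep.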
Murex has been an early adopter of GPUs for pricing and risk management of complex financial options. GPU adoption has boosted the performance of its software while reducing its usage cost. Each new generation of GPU has also shown the importance of reshaping the architecture of the software that uses its GPU-accelerated analytics. Minsky, featuring far better GPU memory bandwidth and GPU-CPU interconnect, raises the bar even further. Murex will show how it has handled this new challenge for its business.
HPE Deep Learning solutions empower innovation at any scale, building on our purpose-built HPC systems and technologies solutions, applications and support services. Deep learning demands massive amounts of computational power, which usually involves heterogeneous computation resources, e.g., GPUs and InfiniBand as installed on HPE Apollo. NovuMind's NovuForce system, leveraging state-of-the-art technologies, makes the deployment and configuration procedure fast and smooth. The NovuForce deep learning software within the Docker image has been optimized for the latest technologies such as NVIDIA Pascal GPUs and InfiniBand GPUDirect RDMA. This flexibility of the software, combined with the broad range of GPU servers in the HPE portfolio, makes this one of the most efficient and scalable solutions.
Discover how we designed and optimized a highly scalable dense solver to solve Maxwell's equations on our GPU-powered supercomputer. After describing our industrial application and its heavy computation requirements, we detail how we modernized it with programmability concerns in mind. We show how we solved the challenge of tightly combining tasks with MPI, and illustrate how this scaled up to 50000 CPU cores, reaching 1.38 Petaflops. We then focus on the integration of GPUs in this model, along with a few implementation tricks to ensure truly asynchronous programming. Finally, after briefly detailing how we added hierarchical compression techniques into our distributed solver on CPUs, we describe how we plan to overcome the challenges that have so far prevented porting it to GPUs.
We leverage NVIDIA GPUs for connected components labeling and image classification applied to Digital Rock Physics (DRP), to help characterize reservoir rocks and study their pore distributions. We show in this talk how NVIDIA GPUs helped us satisfy strict real-time restrictions dictated by the imaging hardware used to scan the rock samples. We present a detailed description of the workflow from a DRP perspective, our algorithm and optimization techniques, and performance results on the latest NVIDIA GPU generations.
In order to prepare the scientific communities, GENCI and its partners have set up a technology watch group and lead collaborations with vendors, relying on HPC experts and early adoption of HPC solutions. The two main objectives are to provide guidance and to prepare the scientific communities for the challenges of exascale architectures. The talk will present the OpenPOWER platform bought by GENCI and provided to the scientific community. Then, it will present the first results obtained on the platform for a set of about 15 applications using all the solutions provided to the users (CUDA, OpenACC, OpenMP, ...). Finally, one specific application will be presented in detail, covering its porting effort and the techniques used for GPUs with both OpenACC and OpenMP.
Wireless-VR is widely regarded as the key solution for maximum immersion. But why? Is it only the obvious reason of doing away with the heavy and inflexible cable? There is more behind it. Learn how the development of tracking technology goes hand in hand with the increasing demand for Wireless-VR hardware solutions, what hardware is on the market now, what is coming, and how wireless solutions, whether standalone devices or add-ons, can create higher value for your VR application. Also learn how large-scale location-based VR and hardware manufacturers are expanding the boundaries of the VR industry, both for entertainment and B2B.
With over 5000 GPU-accelerated nodes, Piz Daint has been Europe's leading supercomputing system since 2013, and is currently one of the highest-performing and most energy-efficient supercomputers on the planet. It has been designed to optimize throughput of multiple applications, covering all aspects of the workflow, including data analysis and visualisation. We will discuss ongoing efforts to further integrate these extreme-scale compute and data services with infrastructure services of the cloud. As a Tier-0 system of PRACE, Piz Daint is accessible to all scientists in Europe and worldwide. It provides a baseline for future development of exascale computing. We will present a strategy for developing exascale computing technologies in domains such as weather and climate or materials science.
The presentation will give an overview of the new NVIDIA Volta GPU architecture and the latest CUDA 9 release. The NVIDIA Volta architecture powers the world's most advanced data center GPU for AI, HPC, and graphics. Volta features a new Streaming Multiprocessor (SM) architecture and includes enhanced features like NVLINK2 and the Multi-Process Service (MPS) that deliver major improvements in performance, energy efficiency, and ease of programmability. New features like Independent Thread Scheduling and the Tensor Cores enable Volta to simultaneously deliver the fastest and most accessible performance. CUDA is NVIDIA's parallel computing platform and programming model. You'll learn about new programming model enhancements and performance improvements in the latest CUDA 9 release.
This talk will give an introduction on the use of VR to enhance the understanding of Scientific Visualisation. Different application scenarios from hydro-meteorology, zoology, genetics and geophysics will be introduced in this talk and the benefits of VR technology in their specific application context will be explained. An outlook showing industrial use cases working with Mixed Reality technology will be given.
In contrast to demands for less regulation in the US, European financial institutions face new MiFID II and GDPR regulations which fundamentally affect how records are stored, retrieved and destroyed. 50% of all corporate data will have a voice component in the next 5 years, which implies that companies not only need to know where data is being held, but also what is being said in it, and who is saying it. Part of this talk will showcase the solution produced by Telefonica/O2 and Intelligent Voice to capture, index and analyse mobile phone calls, and introduce them as part of a compliance and monitoring workflow for MiFID II. We will also show how machine learning can be applied to analysing real-time voice conversations to help spot fraud to an accuracy level on a par with humans.
GEHC introduced the Vivid E95 premium cardiovascular ultrasound scanner in June 2015, based on the groundbreaking cSound system architecture. The Vivid E95 uses two Quadro GPUs for real-time image reconstruction, image processing and visualization. The session will first give a quick introduction to the architecture and the clinical benefits. It will then cover new GPU-based features that were recently introduced to further improve the performance and usability of the Vivid E95. Finally, the session will cover future plans for making the scanner more intelligent with the use of deep learning algorithms, and will share initial results of using TensorRT for real-time cardiac view detection.
Leveraging NatSec technology to make real-time video streaming from vehicles possible, zero-latency, secure and affordable; and applying the latest generation of FaceRec analytics to ensure only authorised people are behind the wheel.
SeeQuestor uses Deep Learning and Affordable Supercomputers to provide Radically Faster Video Intelligence to Police and Law Enforcement Agencies who need to search 100s or 1,000s of hours of CCTV or other video data as part of a criminal investigation or a search for a missing person. Developed with input from the Met Police and the British Transport Police, SeeQuestor is now in use by law enforcement agencies around the world. This session will focus on the technology used (Deep Learning and Affordable Super Computers, powered by GPUs), the academic pedigree (two leading computer vision research groups from the UK), and illustrate the capabilities of the SeeQuestor platform with examples drawn from real use cases.
VR systems are fundamental tools for Audi's internal product development process. Although VR has been used for product evaluation for many years, physical models haven't been completely replaced. Therefore, the deficits of current VR systems have to be resolved. We introduce a new large-scale multi-user VR system, which could improve confidence in internal evaluations performed in VR. Furthermore, Audi is using VR in retail, in formats called Walking VR and Sitting VR, to gain upselling potential. See our approach to bringing the same highly complex and fully configurable 3D content into the cloud and streaming it directly to a customer using a game engine. See how this is measured and how it offers significant upselling potential, plus some examples of stunning real-time visual quality.
Graphics acceleration is no longer something exclusively for engineers and 3D designers. With the release of Windows Server 2016, Windows 10 and Office 2016, and a growing demand for a perfect multimedia experience when browsing, the everyday user's demand for graphics is growing as well. In this session you will hear how Holstebro Municipality, a local government located in the western part of Jutland in Denmark, implemented NVIDIA's GRID solution in their Citrix environment to provide all of their users with the best graphical experience. As a side benefit, they were able to virtualize the workload of their technical staff as well, leading to an even greater gain.
Many companies and brands face problems implementing their concepts in XR projects. From 360 storytelling to 3D audio, interactive experiences and virtual worlds, the game and film industries already have many experts in all these areas. This talk will be about how NOYS VR is tackling these challenges by adapting established standards and combining them into new art and creation processes for this new medium. Although this example is based on the music and gaming entertainment industry, it shows how different industries can benefit from each other and monetize through innovative content while putting the customers into new and social realities.
Video and image resolutions keep growing. It is UHD/4K today, and 8K has been announced. UHD is already a challenge for existing video and image compression algorithms, which quickly overload the CPU. New GPU-based compression algorithms help overcome this by addressing the two main bottlenecks: CPU performance and the PCIe bus limitations. Cinegy's DANIEL2 GPU video codec is a highly scalable CUDA-based video encoder and decoder (codec) for professional video editing, post-production and broadcast. It is also used in large-scale imaging (GIS, medical) and VR. Using a GPU-based video codec can increase performance twenty-fold compared to traditional CPU-based approaches while at the same time freeing the CPU for other tasks. 8K video editing on NVIDIA-based notebooks is real.
cuDIMOT (CUDA Diffusion Modelling Toolbox) is a toolbox for designing and fitting nonlinear models (not only diffusion models) on NVIDIA GPUs. It offers a friendly interface for implementing new models and it automatically generates parallel CUDA code. Various model-fitting approaches are available, including Grid Search, nonlinear Levenberg-Marquardt optimisation and Bayesian inference using MCMC. We present how cuDIMOT has been developed and is being used in the context of diffusion MRI for studying brain tissue microstructure. The toolbox achieves accelerations of two orders of magnitude using a single NVIDIA K80 device compared to the commonly used CPU tools. Large projects such as the UK Biobank, an epidemiological study scanning 100,000 subjects, will benefit tremendously from this toolbox.
We discuss two deep neural networks that aid radiologists with the localisation of anomalies in biomedical volumes. Our first application concerns the detection of lung nodules on CT chest scans. Secondly, we designed a network that localises the neural foramina on MRI lumbar spine scans. Both applications require a tremendous amount of computational power as the input data is in 3D. In our talk, we explain and compare these two networks and their performance. Moreover, we discuss some of the practical problems that arise when designing neural networks for medical image analysis.
Deep learning has become the most powerful driver in medical image analysis. In this talk, I provide an overview of these recent developments, with a focus on results of recent competitions in radiology, pathology, and ophthalmology. I show how detection of lung cancer with CT can be improved with deep learning, as shown in the LUNA16 challenge and the Kaggle Data Science Bowl of 2017. The CAMELYON16 and CAMELYON17 challenges have shown that deep networks outperform pathologists at the detection of lymph node metastases of breast cancer. Finally, in ophthalmology, early detection of diabetic retinopathy with convolutional networks has shown excellent results. These developments will have a major impact on healthcare.
In this presentation, we will discuss modeling electronic health record (EHR) data with deep learning and Deeplearning4j. We describe how to train a long short-term memory recurrent neural network (LSTM RNN) to predict in-hospital mortality among patients hospitalized in the intensive care unit (ICU). Of particular note, our results show that even for a dataset of moderate size, the LSTM is competitive with alternative approaches, including decision trees and multilayer perceptrons, that use hand-engineered features. We will also show how to parallelize model training on a Spark cluster. Finally, we will highlight potential extensions of this work and other use cases for EHR data and deep learning. All code and data are publicly available so that attendees may reproduce our work.
Across the Mediterranean basins, the Messinian salinity crisis resulted in the deposition of an up to 2 km thick multi-layered evaporitic succession consisting of alternating layers of halite and clastics. Such geological objects obscure seismic imaging and may even be overpressurized, posing potential drilling hazards which are often hard to predict. We demonstrate the TPDOT&TWSM approach, developed at IPGG SB RAS, by evaluating the interference wavefield fragment in the shadow zone for a real geological case from the Levant Basin, offshore Israel. Using GPUs accelerated the TWSM algorithm, which is based on multiple large matrix-vector operations, by a factor of hundreds or more.
Machine learning and deep learning applications are revolutionizing how we as consumers interact with our compute devices by imbuing them with speech recognition, machine vision, and other perceptual capabilities. We are now seeing new advancements in AI which move from simple pattern recognition and sensory data processing to much deeper semantic processing. These new advancements in essence bridge the gap between machine learning techniques, including commoditized deep learning on SIMD GPUs, and the next generation of specialized distributed-memory MIMD hardware for large-scale graph analysis for symbolic artificial intelligence. In this session we will cover the new capabilities and use cases Cisco is targeting with this new breakthrough technology.
Get the latest information on how financial markets are using advanced in-database analytics for real-time risk aggregation. Advanced in-database analytics allows the bank to run custom XVA algorithms at scale with the GPU's massive parallelization. This approach allows banks to move counterparty risk analysis from a batch/overnight system to a streaming/real-time system for flexible real-time monitoring by traders, auditors, and management. Real-world examples and insights will be provided, including how a multinational bank is using Kinetica as a real-time risk modeling engine running on public cloud-based Microsoft Azure GPU instances. The bank can now handle time-sensitive, compute-intensive risk computations to project years into the future across hundreds of variables.
This session will cover how to use NVIDIA DRIVE PX to build a self-driving vehicle, including insights into data acquisition, data annotation, neural network training, and in-vehicle inference. In addition, it will focus on how DRIVE PX delivers the performance, energy efficiency and safety required to be the brain of production autonomous vehicles.
Massive amounts of labeled and unlabeled datasets are needed both for training autonomous vehicles to navigate complex and unexpected driving scenarios, and to evaluate that training. As a substitute for hours of recorded data, Pro-SiVIC creates synthetic data to simulate the output from multiple sensor systems for outdoor scenarios that combine vehicles, obstacles, pedestrians, weather, and road conditions. We will demonstrate how powerful and efficient parallel computing with NVIDIA Drive PX2 can be used with Pro-SiVIC synthetic data to process that data in real time. We will compare the performance of a trained lane detection algorithm, running on Drive PX2, against a 3D Pro-SiVIC scene with simulated raw camera data, and a real video recorded from a car in similar conditions.
Recent advances in earth observation are opening up a new exciting area for exploration of satellite image data. In this session you will learn how to analyse this new data source with deep neural networks. Focusing on Emergency Response, you will learn (1) how to apply deep neural networks for Semantic Segmentation on satellite imagery. Additionally, we present recent advances of the Multimedia Satellite Task at MediaEval 2017 and show (2) how to extract and fuse content of natural disasters from Satellite Imagery and Social Media Streams. It is assumed that registrants are already familiar with fundamentals of deep neural networks.
Medical images uniquely represent the anatomical and functional progress of diseases in 3D space and time. Radiomics denotes the emerging endeavor of systematic extraction, mining and leveraging of this rich information towards personalized medicine. We aim to comprehensively summarize imaging information from multiple time-points and modalities in condensed, quantitative signatures and link them with clinical and biological parameters (e.g. genomics or proteomics). We develop our methods for various clinical applications, with a particular emphasis on prostate cancer, breast cancer and brain tumors. The talk will introduce these developments with a particular focus on the machine learning aspects and big data applications where large-scale heterogeneous data sources are analyzed.
Artificial Intelligence: the next big thing, disruptive, vital, innovative, dangerous, saving grace, ubiquitous, unavoidable, engine of future growth, a privacy nightmare? It can be everything, but can it be nothing? Competitive advantage requires it. An insecure, turbulent world demands it. What is your strategy on AI? At the forefront of innovation, a follower, wait and see? This session will attempt to peel back the layers of complexity around AI and introduce a conceptual framework for establishing an AI strategy for your business, one that starts where it needs to and scales naturally from the development phase through to full implementation.
The combination of Nutanix AHV and NVIDIA brings out the best in your graphical applications, VDI environments or 3D applications. Come join the session and learn how the Nutanix Enterprise Cloud Platform drives performance, efficiency and agility for your virtualized applications.
Gwen will explain HP's vision for the commercial VR space and how HP is approaching it today. The objective of HP's commercial VR strategy is to deliver the best and most immersive VR and compute experience and to offer end-to-end solutions. For commercial customers, all this will translate into optimized investment and lower costs. She will also take a deep dive into the Z VR Backpack, the first wearable and untethered VR PC.
VR technology has been around for 20 years, so the professional use of VR is nothing new; however, with advanced graphics technologies from NVIDIA and the latest Lenovo workstations, it is now possible to deliver real-world return on investment without the multi-million dollar price tag of just a few years ago. Use VR at every stage of your business and utilize game-changing implementations of AR and VR to take workflows within manufacturing to new heights, streamlining and speeding up production processes and saving time, money and resources. We will present our global VR strategy, some typical industry references and VR demos using the latest high-performance ThinkStation & ThinkPad workstations, powered by NVIDIA VR Ready graphics to deliver the best VR experiences.
Today GPUs are used in many different industries to solve different problems, from AI and deep learning to mixed reality. In this session you will learn about the industry use cases and how the partnership of Microsoft and NVIDIA is enabling your digital transformation. We will define what Mixed Reality is and what you can learn from game developers creating high-quality visual content for this platform with the help of Azure and NVIDIA GPUs. And we will share with you how you can engage with industry experts from Microsoft and NVIDIA to help you transform your business.
The EU's new privacy regulation requires all companies to understand what personal information they hold about EU citizens and to have consent from each person to keep holding such information. In this session we learn how to apply deep learning to content (i.e. documents and records) to enable efficient GDPR discovery. Elinar has built a repeatable AI-based solution that uses NVIDIA GPUs for per-customer learning and inferencing workloads. This session will cover common challenges in building a discovery pipeline and give guidance on how to scale the solution for high volumes.
Bring your ideas to life with NVIDIA Holodeck, the world's first intelligent, photorealistic, and collaborative virtual reality platform. With Holodeck, designers will be able to visualize large, highly detailed models and explore them in photo-real fidelity in real time. Design teams can collaborate on these complex models remotely to discover new ideas, streamline reviews, and minimize costly physical prototyping. Holodeck even promises to tap into AI to accelerate design workflows and complex simulations. Come hear the talk and then experience Holodeck demos in the VR Village!
AI systems are all the rage; autonomous cars and cooperative robots seem to be around the corner. However, in the open field of public life we need to design these artifacts in a way that people can not only interact, but cooperate with them. In this talk, the concept of Informed Trust will be discussed as a model to enable a joyful and safe cooperation between man and machine.
We are already surrounded by intelligence-based user experiences such as home AI digital assistants, smartphones that provide suggested actions and content, shopping bots that propose items to buy based on shopping patterns, etc. Interactions with machines around us are quickly becoming the norm. In-vehicle user experience needs intelligence not only to delight the user with a truly personalized experience and to simplify repetitive actions but also to minimize cognitive load and to decrease driver distraction. The latest Mercedes-Benz head unit was designed and built with this demand in its DNA. Driver behavior and interactions are analyzed in real time to predict what the driver will do next, using machine learning algorithms developed in-house.
Path planning is one of the key functional blocks for autonomous vehicles, which constantly update their route in real time. In this talk we present the main ideas behind our energy-efficient, parallel, near-optimal path planner. Approximate path computation has proven a promising approach to reduce total execution time, at the cost of a slight loss in accuracy. Due to the fusion of environmental information with the kinematics of the vehicle, the safety of the mission is always guaranteed despite the non-optimality of the path. Furthermore, we show how we can ensure efficient use of embedded GPU resources (NVIDIA Tegra X1) through program transformations. Lastly, we introduce the predictable execution model (PREM) and its potential when applied to our planner.
Re-Inventing the Scientific Method: How Artificial Intelligence is Revolutionising Drug Discovery. Despite the huge growth of knowledge and information, the process of scientific discovery has not changed for 50 years. In drug discovery, the current system is not working: health systems and services around the world are failing, and developing medicines is still a very lengthy, risky and expensive process, with the cost of developing a new drug conservatively estimated at $1bn and less than 1 in 10 drugs entering the clinic making it to market.
LEVERTON develops and applies deep learning technology to extract, structure and manage data from corporate documents in more than 20 languages. Learn how we leverage deep learning and NLP to solve problems of Optical Character Recognition (OCR), Document Classification, and Information Extraction to turn unstructured documents into structured data. Through the application of AI technology, data quality can be improved, processes accelerated, and efficiencies increased significantly. Find out how global organizations in real estate, finance, accounting and law save time and money - powered by deep learning technology.
Roborace is the world's first driverless electric racing series, providing an extreme motorsport and entertainment platform for the future of road-relevant technologies. This session will explore how the platform will help teams to develop level 5 autonomous software that will one day make it to our roads by pushing it to the limits in extreme yet safe environments around the world.
Neural machine translation (NMT) is often performed using sequence-to-sequence modeling, where the input is a variable-length tensor representation of a sentence in the source language, and the output is another variable-length tensor representation in the target language. The Sockeye project is a sequence-to-sequence framework for Neural Machine Translation based on Apache MXNet (Incubating). It implements the well-known encoder-decoder architecture with attention. The talk covers LSTM networks, NMT fundamentals, an overview of how to use Sockeye for implementing translation tasks, and areas of active research for those who are interested in further study of the subject.
As GRID and Quadro vDWS environments grow over time, the potential for mixed versions requires thoughtful planning and preparation. This session will explore customer use cases where various versions of GRID and Quadro vDWS are needed and present potential solutions for building and supporting these environments. Also covered will be the resources available to build and support your own mixed environment. Presenters will be offering an extended Q&A time to allow for discussion and to capture your questions and concerns.
In this session, attendees will learn more about the benefits that a GPU adds to VDI workloads based on Windows 10 and Office 2016. It is a follow-up to my session from last year on Windows 7 and Office 2013 workloads and their gain from GPUs.
Deep learning is today spearheading a revival in the field of artificial intelligence, covering a diverse range of industries such as car manufacturing, e-commerce, social media, computer software and hardware makers (including HPE), renewable energy and search engines. Deep learning is penetrating many industry verticals and changing the way they all do business. Yet any business looking to adopt deep learning faces inevitable questions for which there are no obvious or immediate answers. Will they need different configurations for the different problems they face?
In the work we present here, we dynamically generate code to perform Monte Carlo simulation in the context of a large existing code base for computing prices of financial derivatives where the computation is specified at run time by the user. We JIT compile the generated code and execute it on a GPU. We observed speedups of up to about 2x with our method.
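To make the idea of run-time code generation for Monte Carlo pricing concrete, here is a minimal CPU-only Python sketch. It is not the authors' method: the abstract describes JIT compilation for a GPU inside an existing code base, whereas this example merely generates a payoff function from a user-supplied expression string, compiles it with Python's built-in `compile`/`exec`, and evaluates it over simulated terminal prices. The names `make_payoff` and `mc_price`, the geometric Brownian motion model, and all parameters are assumptions chosen for illustration.

```python
import math
import random


def make_payoff(expr):
    """Generate and compile a payoff function from an expression in `s` (terminal price).

    Assumption: `expr` comes from a trusted user; exec on untrusted input is unsafe.
    """
    source = f"def payoff(s):\n    return {expr}\n"
    namespace = {"math": math}
    exec(compile(source, "<generated payoff>", "exec"), namespace)
    return namespace["payoff"]


def mc_price(payoff, s0, r, sigma, t, n_paths, seed=0):
    """Discounted Monte Carlo average of payoff(S_T), with S_T drawn under
    geometric Brownian motion (an assumed model, not taken from the abstract)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += payoff(s_t)
    return math.exp(-r * t) * total / n_paths


# The computation is specified at run time as a string, then compiled once and reused.
call_payoff = make_payoff("max(s - 90.0, 0.0)")
estimate = mc_price(call_payoff, s0=100.0, r=0.01, sigma=0.2, t=1.0, n_paths=10_000)
```

The design point this sketch shares with the abstract is only the separation of concerns: the payoff is specified by the user at run time, turned into executable code once, and then called inside the simulation loop. Any speedup figures depend entirely on the real JIT and GPU back end, which this sketch does not model.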
Virtual testing is the key to the development of ADAS and HAD systems. Research projects at the national (PEGASUS) and European (Enable-S3) level have been set up explicitly to define methods and quality criteria for the testing of HAD functions, and they identify the virtual domain as one of their top priorities. As vehicles depend increasingly on sensors like LIDAR, RADAR and SONAR, an accurate representation of these sensors for test and validation purposes is mandatory. Sensor data will flow into deep learning neural networks on NVIDIA DriveWorks, or it will be used in software-in-the-loop (SiL) or hardware-in-the-loop (HiL) test setups using NVIDIA PX2.
Dan Harper of CityscapeVR will explore the huge potential of VR technologies to completely revolutionise business processes in the AEC (Architecture, Engineering, Construction) sector rather than just making small incremental gains. The talk will also explore some case studies showing strong client adoption. VR is here and it is creating a substantial buzz, as well as a lot of questions. Are others using it? Is it something I should be using and if so, how? Perhaps most importantly: Why should I be using it? There are many challenges: Technology is moving fast, there are lots of choices of hardware, software etc. Strong opinions are rife across the age brackets. The real question is: How can Virtual Reality and real-time technologies generate real value within a design business?
Game graphics are maturing: near-cinema quality, on sophisticated APIs, game engines, and GPUs. Consumer virtual reality is the Wild West: exciting new opportunities and wide open research challenges. In this talk, Dr. McGuire will identify the most critical of these challenges and describe how NVIDIA Research is tackling them. The talk will focus on reducing latency, increasing frame rate and field of view, and matching rendering to both display optics and the human visual system.