GTC On-Demand

Matchbox: Automatic Batching for Dynamic Deep Learning
Matchbox is an open source PyTorch-based tool that lets users implement their deep learning models as imperative code that applies to individual data samples, then efficiently train and validate them on batched data using GPUs. By automatically keeping track of batch-level masking and padding and rewriting data-dependent control flow, Matchbox simplifies model code, eliminates a class of implementation bugs, and allows programmers to work directly at a more natural level of abstraction.
 
Keywords: AI and DL Research, Deep Learning and AI Frameworks, GTC Silicon Valley 2018 - ID S8977
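
To ground the description above, here is a minimal sketch of the per-sample coding style the abstract describes, written in plain PyTorch. The class, names, and shapes are illustrative assumptions, not Matchbox's actual API; the comments mark the batching work the abstract says Matchbox automates.

    import torch
    import torch.nn as nn

    # Per-sample model code: written against one unbatched sequence, so it
    # can use ordinary Python control flow that depends on the data
    # (here, the sequence length).
    class PerSampleRNN(nn.Module):
        def __init__(self, vocab_size, hidden_size):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden_size)
            self.cell = nn.RNNCell(hidden_size, hidden_size)

        def forward(self, tokens):            # tokens: 1-D LongTensor, one sample
            h = torch.zeros(1, self.cell.hidden_size)
            for t in range(tokens.size(0)):   # data-dependent loop bound
                h = self.cell(self.embed(tokens[t]).unsqueeze(0), h)
            return h

    # Batching this by hand would mean padding every sequence to a common
    # length and threading a mask through each timestep so padded positions
    # do not update h. Per the abstract, Matchbox performs that masking and
    # padding bookkeeping and rewrites the data-dependent loop, so the
    # per-sample code above is all the user writes.
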
Quasi-Recurrent Neural Networks - A Highly Optimized RNN Architecture for the GPU
We introduce quasi-recurrent neural networks (QRNNs), an approach to neural sequence modeling that matches or exceeds the predictive accuracy of LSTMs while running up to 16 times faster at train and test time than the highly optimized NVIDIA cuDNN LSTM implementation. This is possible because the QRNN architecture is tailored for high throughput on an NVIDIA GPU: it combines convolutional layers, which apply in parallel across timesteps, with a minimalist recurrent pooling function written in CUDA, which applies in parallel across channels. We'll discuss the design choices behind the QRNN in detail, including how to investigate GPU efficiency using the NVIDIA Visual Profiler, and present our experiments on language modeling, sentiment classification, and character-level neural machine translation, which show the advantages and viability of QRNNs as a basic building block for a variety of sequence tasks.
 
Keywords: Deep Learning and AI, Tools and Libraries, Performance Optimization, GTC Silicon Valley 2017 - ID S7265
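
To make the architecture concrete, here is a minimal PyTorch sketch of one QRNN layer using the fo-pooling variant from the published QRNN paper (c_t = f_t * c_{t-1} + (1 - f_t) * z_t and h_t = o_t * c_t). Names and defaults are illustrative, and the sequential Python loop stands in for the fused CUDA pooling kernel the session describes.

    import torch
    import torch.nn as nn

    class QRNNLayer(nn.Module):
        def __init__(self, input_size, hidden_size, kernel_size=2):
            super().__init__()
            # One convolution produces the candidate z, forget gate f, and
            # output gate o for every timestep in parallel.
            self.conv = nn.Conv1d(input_size, 3 * hidden_size, kernel_size,
                                  padding=kernel_size - 1)
            self.hidden_size = hidden_size

        def forward(self, x):                    # x: (batch, time, input_size)
            g = self.conv(x.transpose(1, 2))     # (batch, 3*hidden, time + pad)
            g = g[..., :x.size(1)]               # trim right pad -> causal conv
            z, f, o = g.chunk(3, dim=1)
            z, f, o = torch.tanh(z), torch.sigmoid(f), torch.sigmoid(o)
            # Recurrent pooling: elementwise across channels, so each step
            # is cheap even though this loop is sequential in time.
            c = x.new_zeros(x.size(0), self.hidden_size)
            hs = []
            for t in range(x.size(1)):
                c = f[:, :, t] * c + (1 - f[:, :, t]) * z[:, :, t]
                hs.append(o[:, :, t] * c)
            return torch.stack(hs, dim=1)        # (batch, time, hidden_size)

Because the convolution computes all gates for all timesteps at once, only the cheap elementwise pooling remains sequential, which is where the claimed speedup over the step-by-step LSTM recurrence comes from.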