GTC On-Demand

8-Bit Inference with TensorRT
Szymon Migacz (NVIDIA)
We'll describe a method for converting FP32 models to 8-bit integer (INT8) models for improved efficiency. Traditionally, convolutional neural networks are trained using 32-bit floating-point arithmetic (FP32), and by default, inference on these models employs FP32 as well. Our conversion method doesn't require retraining or fine-tuning of the original FP32 network. A number of standard networks (AlexNet, VGG, GoogLeNet, ResNet) have been converted from FP32 to INT8 and have achieved comparable Top-1 and Top-5 inference accuracy. The methods are implemented in TensorRT and can be executed on GPUs that support the new INT8 inference instructions.
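The abstract doesn't spell out the conversion algorithm, so as an illustration only, here is a minimal NumPy sketch of the building block any such scheme relies on: symmetric linear quantization of an FP32 tensor onto the INT8 range with a per-tensor scale factor. The max-abs threshold choice below is a placeholder assumption, not the calibration procedure TensorRT uses.

import numpy as np

def quantize_int8(x, threshold):
    # Symmetric linear quantization: clip to [-threshold, threshold],
    # then map linearly onto the signed INT8 range [-127, 127].
    scale = threshold / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate FP32 values from the INT8 codes.
    return q.astype(np.float32) * scale

# Example on a synthetic activation tensor. The max-abs threshold is a
# naive stand-in (assumption); a real calibrator picks the threshold from
# representative data to balance clipping against rounding error.
x = np.random.randn(1000).astype(np.float32)
threshold = float(np.abs(x).max())
q, scale = quantize_int8(x, threshold)
x_hat = dequantize(q, scale)
print("max abs reconstruction error:", float(np.abs(x - x_hat).max()))

Lowering the threshold clips more outliers but gives finer resolution to the bulk of the distribution; choosing that trade-off from calibration data is what lets the conversion preserve accuracy without retraining.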
 
Keywords:
Deep Learning and AI, Tools and Libraries, Self-Driving Cars, Federal, GTC 2017 - ID S7310