As multi-GPU deep learning performance improves, the performance of the storage system hosting a dataset becomes critical to keeping those GPUs fully utilized. We survey the different methods for providing training data to a TensorFlow application on a GPU, and benchmark data throughput for a variety of popular neural network architectures. We examine performance and potential bottlenecks for local storage technologies (SCSI SSD and NVMe), high-performance network-attached file systems, TensorFlow's native connectors (HDFS and S3), and FUSE-mounted object storage.
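The storage comparison above rests on measuring sequential read throughput per backend. A minimal sketch of such a measurement in plain Python (a hypothetical illustration, not the benchmark harness used in the talk; the `measure_read_throughput` helper and the 64 MiB scratch-file size are assumptions):

```python
import os
import tempfile
import time

def measure_read_throughput(path, block_size=1 << 20):
    """Read a file sequentially in block_size chunks; return MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return total_bytes / (1 << 20) / elapsed

# Write a 64 MiB scratch file, then time a sequential read of it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(64 << 20))
    scratch = f.name

mb_per_s = measure_read_throughput(scratch)
os.unlink(scratch)
print(f"sequential read: {mb_per_s:.1f} MB/s")
```

Note that a read immediately after a write is typically served from the page cache, so a real storage benchmark would drop caches or use files much larger than RAM before drawing conclusions.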
In the era of cloud computing, new approaches to the underlying technologies will be required to fully realize the benefits of AI and deep learning. The combination of IBM Bluemix with NVIDIA GPUs provides the infrastructure foundation for customers to scale their deep learning and AI workloads. IBM Bluemix and its cloud hosting partner Rescale will discuss NVIDIA's new P100 GPUs, including benchmark results and how they can enable next-generation AI and deep learning applications.