Scaling Deep Learning

Deep Learning is a relatively new area of Machine Learning research, introduced with the objective of moving Machine Learning closer to one of its original goals: Artificial Intelligence. The video presentation below is from the 2016 Stanford HPC Conference, where Bryan Catanzaro from Baidu presents: “Scaling Deep Learning.”

“During the past few years, Deep Learning has made incredible progress towards solving many previously difficult Artificial Intelligence tasks,” said Bryan Catanzaro. “Although the techniques behind deep learning have been studied for decades, they rely on large data sets and large computational resources, and so have only recently become practical for many problems. Training deep neural networks is very computationally intensive: training one of our models takes tens of exaflops of work, and so HPC techniques are key to creating these models. As in other fields, progress in artificial intelligence is iterative, building on previous ideas. This means that the turnaround time in training one of our models is a key bottleneck to progress in AI: the quicker we can realize an idea as a trainable model, train it on a large data set, and test it, the quicker we find ways of improving our models. Accordingly, we care a great deal about scaling our model training, and in particular, we need to strongly scale the training process. In this talk, I will discuss the key insights that make deep learning work for many problems, describe the training problem, and detail our use of standard HPC techniques to allow us to rapidly iterate on our models. I will explain how HPC ideas are becoming increasingly central to progress in AI. I will also show several examples of how deep learning is helping us solve difficult AI problems.”
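The HPC-style scaling Catanzaro describes is commonly realized as synchronous data-parallel SGD: each worker computes gradients on its own shard of a mini-batch, the gradients are averaged across workers (an allreduce in an MPI or NCCL setting), and every replica applies the same update. The sketch below is only an illustration of that general pattern, not Baidu's implementation; the toy linear model, the `shard_gradient` helper, the worker count, and the learning rate are all invented for the example.

```python
# Illustrative sketch of synchronous data-parallel SGD (not Baidu's code).
# Workers are simulated in a single process; in a real cluster each worker
# would run on its own GPU/node and the averaging step would be an allreduce.
import numpy as np

rng = np.random.default_rng(0)

n_workers, n_features, shard_size = 4, 8, 32
w = np.zeros(n_features)               # shared model parameters
true_w = rng.normal(size=n_features)   # ground truth for the toy problem

def shard_gradient(w):
    """Mean-squared-error gradient on one worker's local mini-batch shard."""
    X = rng.normal(size=(shard_size, n_features))
    y = X @ true_w
    residual = X @ w - y
    return X.T @ residual / shard_size

lr = 0.1
for step in range(100):
    # Each worker computes a gradient on its local shard (in parallel in practice).
    grads = [shard_gradient(w) for _ in range(n_workers)]
    # "Allreduce": average the gradients so every replica applies the same update.
    g = np.mean(grads, axis=0)
    w -= lr * g

print("parameter error:", np.linalg.norm(w - true_w))
```

The point of the pattern is that the effective batch size grows with the number of workers while each step still produces a single, consistent model, which is what makes strong scaling of training time possible when the communication (the allreduce) is fast enough.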


Bryan Catanzaro is a research scientist at Baidu’s Silicon Valley AI Lab, where he leads a team of researchers focusing on scaling deep neural network training and deployment. Before joining Baidu, Bryan led machine learning efforts at NVIDIA, including the creation of cuDNN. Bryan received his PhD from the University of California at Berkeley, where he wrote the first Support Vector Machine training library to run on graphics processors, and created Copperhead, a Python-based DSL for parallel programming.

Here are the slides used in the presentation:

