The new MIT Press title “Deep Learning Revolution,” by Professor Terrence J. Sejnowski, offers a useful historical perspective coupled with a contemporary look at the technologies behind the fast-moving field of deep learning. This is not a technical book about deep learning principles or practices in the same class as my favorite, “Deep Learning” by Goodfellow, Bengio, and Courville, which provides a critical survey of the field’s state of the art. Deep Learning Revolution, in contrast, gives a perspective on how the field has progressed since its inception and leads the reader along each turn of a very winding path. From another angle, the book reads a lot like a memoir, giving an overview of the advancement of deep learning over the course of the author’s career. Sejnowski has worked with biologically inspired neural networks since the 1980s; he now holds the Francis Crick Chair at the Salk Institute for Biological Studies and is a Distinguished Professor at UC San Diego.
The Neural Information Processing Systems (NIPS) conferences are a thread throughout the narrative of the book, as the conferences have had a significant influence on Sejnowski and the field as a whole.
“Deep Learning Revolution has two intertwined themes: how human intelligence evolved and how AI is evolving,” writes Sejnowski. “The big difference between the two kinds of intelligence is that it took human intelligence many millions of years to evolve, but AI is evolving on a trajectory measured in decades … Deep Learning Revolution tells that story and explores the origins and consequences of deep learning from my perspective both as a pioneer in developing learning algorithms for neural networks in the 1980s and as the president of the NIPS Foundation, which has overseen discoveries in machine learning and deep learning over the last 30 years.”
Part I starts off with a motivation for deep learning along with background material needed to understand its genesis. Part II drills down into the learning algorithms used in a number of different classes of neural network architectures. Part III wraps up by exploring the impact deep learning is having on our lives, as well as the impact it may have in the future.
For those unfamiliar with deep learning, the first chapter does a good job of laying out use cases, some well-known, others less so, including game playing (Go, poker), healthcare, algorithmic trading, autonomous vehicles, natural language translation, speech recognition, legal services, and eLearning.
The next several chapters go back to the 1960s and the early dawn of AI and neural networks, starting at the MIT AI Lab, where readers look at the perceptron, early cognitive neuroscience, and connections to the visual cortex.
Making many references to his collaboration with deep learning luminary Geoffrey Hinton, Sejnowski then delves into algorithms such as independent component analysis (ICA), Hopfield nets, Boltzmann machines (named after Ludwig Boltzmann, the 19th-century physicist who was a founder of statistical mechanics), backpropagation, convolutional neural networks (CNNs), generative adversarial networks (GANs), and reinforcement learning (with tie-ins to neuroscience).
The final chapters look to the future of deep learning and the rising application of cognitive computing. We learn about social robots and facial expression analysis. Chapter 14, “Hello, Mr. Chips,” offers a timely review of the new chip architectures the computer industry is developing to run learning algorithms, whether deep, reinforcement, or otherwise, thousands of times faster and more efficiently. Several more chapters on neuroscience conclude the book.
The book appropriately pays credit to the industry’s leading researchers, including Geoffrey Hinton, Yann LeCun, and Yoshua Bengio, and identifies their main contributions. I think it’s useful to know the stars of any field you dive into. Along the trajectory of the field of deep learning, today’s experts were criticized for their methodologies in the past, only to be vindicated decades later:
“Whereas, in convex optimization problems, there are no local minima and convergence is guaranteed to the global minimum, in nonconvex optimization problems, this is not the case,” writes Sejnowski. “We were told by optimization experts that, because learning in networks with hidden units was a nonconvex optimization problem, we were wasting our time – our networks would get trapped in local minima. Empirical evidence suggested that they were wrong.”
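To make the quoted point concrete, here is a minimal sketch (my own illustration, not from the book, using arbitrary toy functions): plain gradient descent on a convex function reaches the same minimum from any starting point, while on a nonconvex function the minimum it settles into depends on where it starts.

```python
# Illustrative sketch (not from the book): gradient descent on a convex vs. a
# nonconvex one-dimensional function. The specific functions are toy examples
# chosen only to show the local-minimum behavior discussed in the quote.

def gradient_descent(grad, x0, lr=0.01, steps=2000):
    """Plain gradient descent: repeatedly step against the gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Convex: f(x) = x^2 has a single global minimum at x = 0.
convex_grad = lambda x: 2 * x

# Nonconvex: g(x) = x^4 - 3x^2 + x has a shallow local minimum near x ~ +1.13
# and a deeper global minimum near x ~ -1.30.
nonconvex_grad = lambda x: 4 * x**3 - 6 * x + 1

for start in (-2.0, 2.0):
    print(f"convex,    start {start:+.1f} -> {gradient_descent(convex_grad, start):+.3f}")
    print(f"nonconvex, start {start:+.1f} -> {gradient_descent(nonconvex_grad, start):+.3f}")
```

Both starting points land at 0 for the convex function, but for the nonconvex one the run started at +2.0 gets stuck in the shallower local minimum, which is exactly the failure mode the optimization experts warned about; the book’s point is that, empirically, large neural networks turned out not to suffer from this as badly as expected.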
I particularly enjoyed the section in Chapter 8 “Limitations of Neural Networks” that addressed the concerns of today’s hot topic of explainable AI: “Although they may give the right answer to a problem, currently, there is no way to explain how neural networks arrive at that answer.”
I also greatly appreciated the “Notes” section at the end of the book, which contains a nice bibliography of seminal papers, books, articles, and conference proceedings for the field of deep learning, dating back to its beginnings.
If you’re serious about deep learning, as a researcher, practitioner, or student, you should definitely consider reading this book. Having many different perspectives on a rapidly evolving field like deep learning can only help you keep pace. I took the book along on my recent trip to the H2O.ai conference in San Francisco, dipping into it between sessions and later in the Hilton hotel lobby after the conference was done for the day, and a number of people stopped to ask me about it. It has a pretty catchy title, and the physical quality of the book is top-rate.
Contributed by Daniel D. Gutierrez, Managing Editor and Resident Data Scientist for insideAI News. In addition to being a tech journalist, Daniel is also a consulting data scientist, author, and educator, and sits on a number of advisory boards for various start-up companies.