Book Review: Deep Learning by Goodfellow, Bengio, and Courville

I don’t usually get excited about a new book in a field I’ve been deeply involved with for quite a long time, but a timely and useful new resource just came out that I had eagerly anticipated. “Deep Learning,” by three experts in the field (Ian Goodfellow, Yoshua Bengio, and Aaron Courville), is destined to be considered the AI bible moving forward. Ian Goodfellow is a Research Scientist at OpenAI, Yoshua Bengio is a Professor of Computer Science at the Université de Montréal, and Aaron Courville is an Assistant Professor of Computer Science at the Université de Montréal. Published by MIT Press, the text should be mandatory reading for all data scientists and machine learning practitioners looking to get a proper foothold in this rapidly growing area of next-gen technology.

I had been anticipating the release of this book since reading an announcement of its impending publication while at the NVIDIA GPU Technology Conference earlier this year. I learned that a draft version of the book was available online in HTML-only form at www.deeplearningbook.org (no PDF is available, due to the contract with MIT Press). Since then, I’ve pretty much devoured a number of the book’s sections as an important resource for fundamentals and current research directions.

“Deep learning has taken the world of technology by storm since the beginning of the decade,” said Yann LeCun, Director of AI Research at Facebook and Silver Professor of Computer Science, Data Science, and Neuroscience at New York University. “There was a need for a textbook for students, practitioners, and instructors that includes basic concepts, practical aspects, and advanced research topics. This is the first comprehensive textbook on the subject, written by some of the most innovative and prolific researchers in the field. This will be a reference for years to come.”

I really like how the book is organized around the following three main sections:

  • Section I: Applied Math and Machine Learning Basics
  • Section II: Deep Networks: Modern Practices
  • Section III: Deep Learning Research

Based on these sections, the book has something for most people. Section I is very welcome, providing the mathematical background needed to fully understand the fundamentals of deep learning. You can probably approximate an understanding without the math, but if you truly wish to get the most out of this technology, a grounding in linear algebra (including SVD and PCA), probability theory, and numerical computation, such as gradient-based optimization, will set you on a successful path.
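To give a flavor of the math the section covers, here is a minimal sketch, using NumPy with made-up data values, of two of those fundamentals: PCA computed via the SVD, and a few steps of plain gradient descent on a simple quadratic. This is only an illustration, not material from the book itself.

```python
import numpy as np

# A tiny made-up data matrix: 4 samples, 2 features (illustrative values only)
X = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [3.0, 1.0],
              [1.0, 2.0]])

# PCA via SVD: center the data, then decompose
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]            # first principal direction (top right-singular vector)
projected = Xc @ pc1   # 1-D projection of each sample onto that direction

# Gradient-based optimization: minimize f(w) = (w - 3)^2 with gradient descent
w, lr = 0.0, 0.1
for _ in range(100):
    grad = 2 * (w - 3)  # f'(w)
    w -= lr * grad      # w converges toward the minimizer, w = 3
```

Both ideas scale up directly: PCA on real datasets uses the same centered-SVD recipe, and the gradient-descent loop is the skeleton of the optimizers used to train deep networks.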

Section II is the “meat” of the book, discussing basic topics like hidden units, back-propagation, and regularization for deep learning. There are also chapters on critical methodologies like training deep models, convolutional networks, and recurrent and recursive nets (RNNs). Chapter 11 has a useful discussion of performance metrics, including “precision” and “recall,” PR curves, and the F-score. Chapter 12 wraps up the discussion by offering some clarity on deep learning deployment techniques, including a discussion of GPUs and use cases like computer vision, speech recognition, and NLP.
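For readers new to the metrics that Chapter 11 discusses, here is a minimal sketch (with made-up label vectors) of how precision, recall, and the F1 score relate to true/false positives and negatives:

```python
# Toy binary predictions vs. ground truth (values made up for illustration)
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)  # of the predicted positives, how many are correct
recall = tp / (tp + fn)     # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
```

A PR curve simply plots precision against recall as the classifier's decision threshold sweeps from strict to lenient.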

As an academic myself, I was very pleased to see the inclusion of Section III. I’m always on the lookout for new techniques coming out of academia, and I routinely scan the arXiv.org pre-print server for machine learning and artificial intelligence. Keeping an eye on the research end of the field lets me see what may be coming through the pipeline as research is converted into production solutions. That’s why I appreciate this section of the book: it’s important to anyone who wants to understand the breadth of perspectives that have been brought to deep learning and to push the field forward toward true AI.

Another, maybe unusual, thing I liked about this book is its rather exhaustive Bibliography. At 55 pages, this is a great resource for anyone wanting to know how the field has evolved over time. It includes other books, refereed journal articles, pre-print articles, etc. I think there is educational value in going back to read seminal papers in just about any field, and if you’re serious about deep learning, you might give this a try.

My only problem with the book is that at 700+ pages it represents yet-another time sink for me. Now that I have the book in my hands, I’ll find it hard to resist spending untold hours within its pages. I already have a prime place for it on my desk, right next to my copy of the “machine learning bible,” The Elements of Statistical Learning, and Introduction to Linear Algebra, among a few others. My desk is reserved for only the very best resources!

Contributed by Daniel D. Gutierrez, Managing Editor of insideAI News. In addition to being a tech journalist, Daniel is also a practicing data scientist, author, and educator, and sits on a number of advisory boards for various start-up companies.
