In this special guest feature, Scott Clark, Co-founder and CEO of SigOpt, discusses why measurement should be the first step of any deep learning strategy. Before SigOpt, Scott led academic research for Yelp’s Ad Targeting team, which included innovative outreach projects like the Yelp Dataset Challenge and open sourcing MOE. Scott holds a PhD in Applied Mathematics and an MS in Computer Science from Cornell University, as well as BS degrees in Mathematics, Physics and Computational Physics from Oregon State University. Scott was chosen as one of Forbes’ 30 under 30 in 2016.
In machine learning and deep learning, identifying the right metric before experiments begin is essential. Machine learning models are built to achieve whatever programmers identify as the target, but if that target doesn't map back to a business goal, the model will fail to provide real value. Pointed at the right metrics, these models can deliver enormous business value; optimized toward the wrong goals, they can cause equally serious losses.
Take driverless cars, for example. Deep learning models play an essential role in analyzing data and directing the car on where to steer. Yet if those models aren't pointed at the right metrics, the results can be detrimental. Uber's engineers could identify getting from point A to point B as fast as possible as the target metric, and the machine will find the fastest route. However, that outcome could mean the car causes accidents along the way or drives in a way that makes the passengers sick.
On the flip side, trying to optimize too many metrics at once can be cumbersome and make it difficult to determine real value. Identifying a specific, well-defined metric can help organizations make consistent decisions and give teams a single target to work toward. Companies looking to deploy deep learning should keep the following best practices in mind to determine their key objectives:
1. Connect the objective to a business goal
The best objective is one that, when maximized, will have a direct impact on overall business goals. For example, Microsoft recently wanted to use machine learning to increase Bing's ad revenue. The company ran an experiment that increased ad relevance and decreased search accuracy: it literally made searches less effective for users. Ultimately, this increased the frequency of searches, and people clicked through to more ads because ad targeting had improved relative to the degraded search results. While this did succeed in increasing the objective metric, the short-term revenue bump was offset by users abandoning the search engine. Businesses should identify their business goals, with the right prioritization, and tune their machine learning models to drive toward those goals.
Another potential pitfall is tuning toward academic metrics instead of business value. A fraud detection model could look great through the lens of AUC ROC or log loss (two mathematical ways to assess the quality of a classifier), but still misclassify a large number of high-value transactions. A method that is 99% accurate may not be favorable if it misses the 1% of most costly fraud. These academic metrics are extremely useful for developing new methods and proving different aspects of algorithms, but they can be misaligned with actual business value. Blindly optimizing toward a goal without knowing its impact can leave you with a system that ends up costing the business dearly.
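To make this concrete, here is a minimal sketch of that gap, using scikit-learn and made-up transaction data (the article doesn't prescribe any tooling or dataset): the model scores well on the academic metrics while missing nearly all of the fraud measured in dollars.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical transactions: labels (1 = fraud) and dollar amounts.
y_true  = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
amounts = np.array([20, 35, 15, 50, 40, 25, 30, 45, 10_000, 12])
y_score = np.array([0.1, 0.2, 0.15, 0.3, 0.25, 0.2, 0.1, 0.3, 0.28, 0.9])

# Deployed decision rule: flag anything scored above 0.5.
y_pred = (y_score >= 0.5).astype(int)

# Academic metrics look respectable...
print("AUC ROC: ", roc_auc_score(y_true, y_score))  # 0.875
print("Accuracy:", (y_pred == y_true).mean())       # 0.9

# ...but a business-aligned metric counts the dollars the model misses.
missed = amounts[(y_true == 1) & (y_pred == 0)].sum()
total  = amounts[y_true == 1].sum()
print(f"Fraud dollars missed: ${missed} of ${total}")
```

On this toy data the classifier posts an AUC of 0.875 and 90% accuracy, yet it misses $10,000 of the $10,012 in fraudulent transactions. A dollar-weighted metric surfaces that failure immediately.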
2. Avoid overfitting
Machine learning models also need to be trained and tuned to accurately hit their targets. However, if a model is over-optimized for its training data, it won't perform well when new data is presented, rendering it ineffective in practice. To ensure models generalize to new, unseen data, researchers and data scientists should apply techniques like cross validation and regularization. These techniques keep the model from effectively "memorizing" the training data without discovering the underlying patterns in the problem, and they guard against overfitting to the training set.
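As a minimal illustration of both techniques (a scikit-learn sketch on synthetic data; the article doesn't specify an implementation), cross validation scores the model only on data it never saw during training, and L2 regularization penalizes the extreme weights that often signal memorization:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for real training data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# L2 regularization (strength controlled by C) discourages the
# model from fitting noise in the training set.
model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)

# 5-fold cross validation: every point is scored by a model
# that never saw that point during training.
scores = cross_val_score(model, X, y, cv=5)
print("Held-out accuracy per fold:", scores.round(3))
print("Mean held-out accuracy:", scores.mean().round(3))
```

If the cross-validated score sits far below the training score, the model is memorizing rather than learning, and stronger regularization or more data is warranted.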
Another aspect of this is robustness: the model should be resilient to slight changes in the data or the environment in which it operates. A model could work extremely well offline, but simplifying or static assumptions made offline often produce a model that quickly fails when faced with the real world. Consider deep learning systems built for adversarial environments like algorithmic trading or advertising. Not only will real-world data differ from what you trained on (so you still need to avoid overfitting), but the assumptions made about the environment will also be constantly changing (the models you are trading or bidding against). A model built in isolation, without taking these aspects into account, can look good on paper but quickly cause problems once deployed. An example is the trillion-dollar flash crash of 2010, where cascading algorithmic decisions led to a massive market shift in less than an hour.
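One cheap sanity check along these lines (a hypothetical sketch, again on synthetic data rather than anything from the article) is to perturb the inputs slightly and measure how often the model's predictions flip; a high flip rate suggests the model will be brittle once the real world drifts away from the training distribution:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train on synthetic data standing in for the offline environment.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Nudge every input slightly and count how many predictions flip.
rng = np.random.default_rng(0)
X_noisy = X + rng.normal(scale=0.1, size=X.shape)
flip_rate = (model.predict(X) != model.predict(X_noisy)).mean()
print(f"Predictions flipped by small noise: {flip_rate:.1%}")
```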
3. Always be iterating
As new data, frameworks, and applications become available, there is an opportunity to keep updating the targets you are shooting for. Older techniques may have forced you to define success with purely academic benchmarks, or to optimize toward only a single objective. Additionally, as business needs shift, you should make sure your models stay aligned with those objectives. Revenue may be the most important metric for a model at one point, but user retention or long-term value might become more important as the business evolves. In deep learning, accuracy may be the most important metric when standing up a proof of concept, but complexity, training time, and inference time may factor in when it is time to go to production.
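A simple way to keep those production trade-offs visible (a hypothetical sketch comparing two stand-in models; none of these names come from the article) is to report several metrics side by side rather than a single score:

```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

candidates = [
    ("small/fast model", LogisticRegression(max_iter=1000)),
    ("big/slow model", RandomForestClassifier(n_estimators=200, random_state=0)),
]

# Report accuracy alongside inference latency, so the choice of
# production model reflects more than a single academic number.
for name, model in candidates:
    model.fit(X_tr, y_tr)
    start = time.perf_counter()
    preds = model.predict(X_te)
    ms_per_row = (time.perf_counter() - start) / len(X_te) * 1e3
    acc = (preds == y_te).mean()
    print(f"{name}: accuracy={acc:.3f}, inference={ms_per_row:.4f} ms/row")
```

A proof of concept might pick whichever row has the highest accuracy; a production deployment may accept a slightly less accurate model in exchange for a fraction of the inference cost.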
By making sure you are always pointing at the right target and optimizing toward it, you can ensure you're getting the most value from your deep learning investments, while also keeping the teams developing them aligned. Since there are many steps between a specific model and explicit business value, outlining a clear success metric is essential to knowing whether or not the model is making progress. Optimizing deep learning measurement with a defined metric is not only valuable, but can also help organizations determine whether they need to reevaluate business metrics at an even larger scale.