In this special guest feature, Alexander Khaytin, COO for Yandex Data Factory, explains how businesses can introduce “data democracy” and systematic testing, and how agility can be introduced into even the most inflexible of organizations, overcoming the barriers that prevent machine learning adoption and benefit. As Chief Operating Officer at YDF, Alexander oversees projects from concept to completion, as well as contributing to YDF’s partnership, sales and technology strategies. Prior to joining Yandex in 2014, Alexander spent over a decade providing consulting and strategic analysis services for businesses in the telecom, construction, energy, retail and finance industries. As a Partner at a system integrator, Korus Consulting, from 2011 to 2014, he ran projects for some of Russia’s leading brands, including state-owned hi-tech corporation Rostech, Moscow City Telephone Network, mobile broadband services provider and smartphone manufacturer Yota, Bank Saint Petersburg and Russia’s largest e-payment system Yandex.Money.
Machine learning has come to play an important role for businesses looking to transform the way they work. But those venturing into machine learning projects face several challenges on the road to digital transformation.
Most companies today own data assets but, as is well reported, few succeed in extracting real business value from them. Often the scale of a machine learning project is too daunting for companies still at the start of their data-driven journey. This poor track record stems from three primary obstacles: data inaccessibility and security concerns, inability or reticence to test and experiment, and rigid business processes undermining innovation. Only once these three obstacles have been overcome will a business achieve the success promised by machine learning experts.
1. Data inaccessibility and security concerns
While most companies undertaking machine learning projects own and store vast quantities of data, this data is not always ready to use. With data often siloed in separate storage and processing systems, aggregating it can be time-consuming and difficult. Additionally, when extracting data, companies must take data security into consideration, since almost all data is “poisoned” by personal or otherwise sensitive information. In such cases, the data may need to be obfuscated or encrypted before it is fed into a machine learning model. These difficulties in accessing data present a great challenge for companies wanting to use their data assets systematically, but a simple solution exists.
Companies must store personal and highly sensitive data separately from “other” data – this will minimize security risks and reduce the need for data protection. Less sensitive data should then be anonymized and trusted teams given free access to it. By creating this environment of “Digital Democracy” – a term coined by Walmart – companies will be able to use the data as and when they need it, and share it with both internal and external teams. This allows for the fast development and testing of Proof of Concept projects, and easy collaboration with external providers.
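As a rough illustration of this separation, the sketch below splits a customer record into a personal part, kept in restricted storage, and a shareable part that carries only a pseudonymous token. The field names, the sample record and the key are assumptions made for the example and do not come from any particular system. A keyed hash is only one possible approach, and it pseudonymizes rather than fully anonymizes the data.

```python
import hashlib
import hmac

# Hypothetical field names: which fields count as personal is an assumption.
PERSONAL_FIELDS = {"name", "email", "phone"}


def pseudonymize_id(customer_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    Without the key, the token cannot easily be linked back to the customer.
    """
    return hmac.new(secret_key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()


def split_record(record: dict, secret_key: bytes) -> tuple[dict, dict]:
    """Return (personal_part, shareable_part), linked only by the token."""
    token = pseudonymize_id(str(record["customer_id"]), secret_key)

    # Personal part: stays in the restricted store, keeps the real identifier.
    personal = {k: v for k, v in record.items() if k in PERSONAL_FIELDS}
    personal["customer_id"] = record["customer_id"]
    personal["token"] = token

    # Shareable part: no personal fields and no real identifier.
    shareable = {
        k: v
        for k, v in record.items()
        if k not in PERSONAL_FIELDS and k != "customer_id"
    }
    shareable["token"] = token
    return personal, shareable


# Made-up sample record, for illustration only.
record = {
    "customer_id": 1017,
    "name": "A. Example",
    "email": "a@example.com",
    "phone": "000",
    "basket_total": 42.5,
    "store": "north",
}
personal, shareable = split_record(record, secret_key=b"demo-key-not-for-production")
print(sorted(shareable))
```

Only the shareable part would be opened up to trusted internal and external teams, while the key and the personal part remain with whoever is responsible for the sensitive store.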
2. The inability or reticence to test and experiment
Companies that are looking to harness business opportunities through machine learning will quickly find out that testing and experimentation are an essential part of machine learning technology. Since there is often no way to explain how the algorithm arrives at its output, you have to test it against real data to prove its effectiveness. In certain cases a historical test on past data is possible. However, when it comes to prescriptive analytics, the business impact can only truly be assessed by actually applying a machine learning model in the real business process. For most companies, often at the start of their digital transformation, the prospect of launching large-scale machine learning projects which haven’t already demonstrated their value in previous trials can be daunting. As a result, adoption is slow, as enterprises prove unwilling to venture into the unknown.
To overcome this barrier and ensure the smooth adoption of machine learning technologies, companies should foster an experimental culture and provide the infrastructure to support it. Isolated sandboxes should be created to allow different teams to operate and test differing approaches in order to prevent “Digital Democracy” from turning into “Digital Anarchy”. At the same time, any data transfer to the operational IT systems should be restricted and controlled.
Such infrastructure will allow for the creation of a real testing environment, whereby different tools and approaches can be tested and compared against each other to ensure the best one goes into production. Take “Next Best Offer” marketing as an example, in which a recommender system suggests the item most likely to be bought by a specific customer. Only a field test can prove how effective this recommender system is, by comparing the actual purchases made by customers who received its recommendations with those made by a control group.
When a company allows open access to depersonalized data, different teams can develop competing models. The company then splits its audience into the required number of representative groups, and exposes these “streams” of the customer base to the competing solutions. Once the results are evaluated, the best approach can be adopted across the board, with a small section left to operate on a different mechanism to allow for continuous comparison in case the preferred approach starts to fail.
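The field test described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the group names, the hash-based assignment and the purchase-rate metric are all hypothetical choices made for the example, and a real evaluation would also need a check for statistical significance.

```python
import hashlib
from collections import defaultdict

# Hypothetical stream names: one control group and two competing models.
GROUPS = ["control", "model_a", "model_b"]


def assign_group(customer_token: str, groups=GROUPS) -> str:
    """Deterministically map a (depersonalized) customer token to a group.

    Hashing the token gives a stable, roughly uniform split, so the same
    customer always lands in the same stream.
    """
    digest = hashlib.sha256(customer_token.encode("utf-8")).hexdigest()
    return groups[int(digest, 16) % len(groups)]


def purchase_rates(outcomes):
    """Compute the purchase rate per group.

    `outcomes` is an iterable of (group, purchased) pairs, where
    `purchased` is True if the customer bought the offered item.
    """
    shown = defaultdict(int)
    bought = defaultdict(int)
    for group, purchased in outcomes:
        shown[group] += 1
        bought[group] += int(purchased)
    return {group: bought[group] / shown[group] for group in shown}


def uplift_vs_control(rates, control="control"):
    """Difference between each group's purchase rate and the control's."""
    base = rates[control]
    return {group: rate - base for group, rate in rates.items() if group != control}


# Assign made-up customer tokens to streams and count the split.
tokens = [f"cust-{i}" for i in range(1000)]
sizes = defaultdict(int)
for token in tokens:
    sizes[assign_group(token)] += 1
print(dict(sizes))
```

Once real purchase outcomes are collected for each stream, `uplift_vs_control(purchase_rates(outcomes))` gives a first comparison of the competing models against the control group. Keeping a small stream on the losing mechanism, as described above, only requires leaving its name in `GROUPS`.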
3. Rigid business processes undermining innovation
If companies wish to succeed in the data science field, they must first abandon rigid business processes in favor of flexibility. Just like any other disruptive technology, machine learning requires change to mindsets, skill-sets and infrastructure. It cannot come to exist, flourish and succeed without breaking the habitual rules and processes – and to overcome this challenge, there is only one answer: agility.
There are no beaten paths with machine learning yet: the technology is new, success is not guaranteed and experimentation is crucial. By keeping business processes agile and flexible, companies will spend less time, effort and money on unsuccessful projects. Failing fast, learning fast, and allowing for continuous comparisons and quick test projects will enable companies to build on their experience towards a robust machine learning strategy.
While most companies today realize the potential that lies in the data they collect and store, they are often left encumbered by their big data, unsure how to utilize it and extract business value. The separation of “personal” data, eased access, sandbox experimentation and agile processes will help enterprises towards a successful machine learning journey and make this innovation manageable.