IT Operations Analytics

In this special guest feature, Vishnu Nanduri of IBM makes sense of the emerging field of IT Operations Analytics: making optimal decisions using data to drive business value. Vishnu is the Practice Leader for Data Science and Engineering in IT Operations Analytics (ITOA) at IBM, where he leads a team of data scientists and data engineers. He holds a Ph.D. in Industrial Engineering from the University of South Florida, and has been an active researcher and academician in the domain of analytics and machine learning for over a decade. Follow him on Twitter @drvnanduri. The opinions in this article are entirely those of the author and not of his employer.

Analytics, as defined by the Institute for Operations Research and the Management Sciences (INFORMS), is a complete business problem-solving and decision-making process that uses a broad set of analytical methodologies to enable the creation of business value. To paraphrase, it is ‘making optimal decisions using data to drive business value.’ In an era when no newspaper column, blog, or LinkedIn post seems complete without a mention of analytics or big data, it is often difficult for people outside the field to cut through the clutter and learn the true meaning of analytics. For example, my mother, an educated layperson, asked me a few days ago what I did for a living. She knew I did something worthwhile, or at least that is what she told me. What I told her is what I have documented in this article. In this series of posts, like other industry experts, I hope to de-clutter this space and add value to this emerging field.

IT operations analytics (ITOA) is an emerging field in the crowded space of analytics, which already includes customer analytics, healthcare analytics, fraud analytics, social media analytics, and mobile analytics, to name just a few. ITOA is best explained with an illustrative example. On a Monday morning, John intends to log in to his favorite stock trading website (which shall not be named) to acquire stock based on some good news he read about a company over the weekend. John tries to log in, and unfortunately the browser displays an error message that he cannot comprehend (assuming he is a regular guy like one of us). He tries a few more times and realizes that the stock trading website is down. John then calls the customer service center and registers a complaint about the login error. The customer service representative in turn logs the complaint and sends it to the back-end team. This isn’t unusual, so what’s the business value here?

So what’s the big deal?

John is disgruntled: he could not acquire the stock at the price he wanted (and it rose to unprecedented heights over the day). He logs in to Facebook to vent and spread his displeasure throughout his network, tweets to his followers about the erring website, and in the process influences a few hundred potential customers to stay away from this company. The stock trading company, for its part, lost a good chunk of money in brokerage fees, lost John, a valued customer up until the incident, and lost the several others whom John convinced to stay away. Now let’s extrapolate this scenario to a few thousand Johns and Janes across the country who tried to log in on that Monday morning. We all know that this scenario is unfortunately not that uncommon. Moreover, it is a prevalent headache for many CIOs and CTOs.

Enter ITOA. ITOA aims to prevent such incidents using some smart search techniques and some good old mathematics and statistics. And if those incidents do happen, ITOA can help diagnose the problem very quickly and help prevent future incidents. In summary, ITOA is the science of making optimal decisions using data from IT infrastructure to drive business value for both the company and its clients.

Now let’s get slightly more technical. Typical analytics often deals with structured data. The data may include profits and losses, website views, items purchased, items recommended, customer demographics, etc., which I typically refer to as front-end data. ITOA, on the other hand, deals with “back-end” data, such as log data, application data, server data, and other not-so-pretty machine-generated data. By studying the logs, ITOA derives insights, searches for patterns, and generates early warnings of potential incidents (e.g., a stock trading application failure). A log is simply a record of an event that happened. A typical enterprise, like the fictitious stock trading website we have been talking about, may generate many gigabytes, and sometimes up to a few terabytes, of logs each day. These logs are usually incomprehensible to the layperson and sometimes even to IT experts. ITOA analyzes such logs from various IT elements, such as servers, middleware, and applications, to help prevent failures. ITOA is a perfect use case of the “Big Data” problem and perhaps the epitome of the proverbial “needle in the haystack” scenario (e.g., searching for the anomalous log messages in terabytes of machine-generated data). In this series of posts, I will go over a few techniques we can use to tackle this enormous challenge. Stay tuned…
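As a rough illustration only, and not necessarily one of the techniques covered later in this series, the “needle in the haystack” idea can be sketched in a few lines of Python. The sketch counts ERROR lines per minute and flags minutes whose count sits far above the typical level, using the median and the median absolute deviation. The log format (“timestamp LEVEL message”), the function names, the threshold, and the sample data are all assumptions made for this example.

```python
from collections import Counter
from statistics import median


def errors_per_minute(lines):
    """Count ERROR lines per minute.

    Assumes a hypothetical format: 'YYYY-MM-DDTHH:MM:SS LEVEL message'.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=2)
        if len(parts) < 2:
            continue  # skip lines that do not match the assumed format
        timestamp, level = parts[0], parts[1]
        minute = timestamp[:16]  # keep 'YYYY-MM-DDTHH:MM'
        # Add 0 for non-error lines so quiet minutes still appear in the counts.
        counts[minute] += 1 if level == "ERROR" else 0
    return counts


def flag_anomalies(counts, threshold=3.0):
    """Return minutes whose error count is far above the median.

    Distance is measured in units of the median absolute deviation (MAD);
    the threshold of 3.0 is an arbitrary choice for this sketch.
    """
    values = list(counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = mad if mad > 0 else 1.0  # avoid dividing by zero on flat data
    return sorted(m for m, c in counts.items() if (c - med) / scale > threshold)


# Made-up sample: one error per minute, except a burst of 20 at 09:03.
sample_log = []
for minute in range(6):
    n_errors = 20 if minute == 3 else 1
    for second in range(n_errors):
        sample_log.append(
            f"2024-01-08T09:{minute:02d}:{second:02d} ERROR login request failed"
        )
    sample_log.append(f"2024-01-08T09:{minute:02d}:59 INFO heartbeat ok")

anomalies = flag_anomalies(errors_per_minute(sample_log))
print(anomalies)
```

On this made-up sample, the only minute flagged is 09:03, where the burst of login errors occurs. Real logs at the gigabyte or terabyte scale would need streaming or distributed processing and a more careful baseline, but the principle of comparing current behavior against a typical level is the same.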

 
