In this special guest feature, Paul Scully, a Vice President at Grok, believes that sometimes it’s easier to look at what NOT to do in order to find an AIOps solution that will work for your company. With 20 years of deep expertise in helping IT organizations improve the reliability and efficiency of their infrastructure, Grok is intently focused on building the industry’s most innovative platform to bring the best of Machine Learning to IT Operations Management.
As data grows, so, too, does the AIOps market. Forrester reports that 68 percent of companies surveyed plan to invest in AIOps-enabled monitoring solutions over the next 12 months, and Gartner estimates the size of the AIOps platform market at between $300 million and $500 million per year. This raises the question: if you are going to spend millions on AIOps platforms and integrate them into your critical systems, how do you know what to look for?
Sometimes it’s easier to look at what NOT to do in order to find a solution that will work for your company. Read on to learn more about what to avoid when it comes to finding an AIOps platform that will benefit your company.
AVOID: Significant Retooling of Your Current Platforms
If you are looking for significant short-term benefits from an AIOps platform for IT Operations Management (ITOM), you should be wary of solutions that require replacing large portions of your current systems. Organizations that take a “throw the baby out with the bathwater” approach to implementing AIOps find themselves bogged down because these projects focus on replacing much of the existing toolset. In reality, this approach only increases the complexity, cost, and time required to deploy machine learning in IT Operations.
Most ITOM systems have evolved over many years, with significant effort already invested to ingest and format data and then integrate that data with other systems. Similarly, the work queues have evolved to incorporate a deep knowledge of the event handling and incident management processes. Replacing these functions only complicates the adoption of AIOps. You should consider AIOps platforms that can easily integrate into your existing monitoring infrastructure, adding an intelligence layer to the existing footprint. This allows for much faster deployment and keeps the effort focused on what really matters: results.
AVOID: Locking into a Single ITOM Reference Architecture
There are many AIOps platforms on the market that are extensions of existing product portfolios. These solutions typically integrate well only with tools inside their own portfolio, and their vendors tend to discourage integrating outside the ecosystem if that means replacing one of their existing solutions. This makes it difficult to replace these systems or augment them with best-of-breed point solutions.
When evaluating an AIOps solution, customers should consider solutions that are not beholden to a single vendor’s ecosystem. A solution that is truly agnostic provides much more flexibility and a lower total cost of ownership over time. Think twice about AIOps platforms that:
- Are embedded in an existing event management or ITSM system
- Do not have a stand-alone framework and make it hard to integrate with other vendors’ solutions
- Require a rip-and-replace of an existing IT system
AVOID: Approaches That Require Frequent Re-Training
Different AIOps platforms have different requirements, and different requirements mean your teams have to be trained in different ways. Understanding the objectives of the AIOps platform up front is important, since those objectives define which data the platform focuses on and how the operations team will work with it.
For instance, AIOps platforms that are focused on Service Assurance need to be real-time, are required to scale and must respond in seconds. This type of solution is deployed in an environment where resources are already stretched thin, meaning teams do not have the skill set to conduct constant care and feeding of the platform (nor do they have the time to frequently retrain the algorithms). Make sure you’re looking for an offering that does not require constant manual retraining and that can easily integrate different data feeds.
AVOID: Offerings with a Singular Focus
Many AIOps offerings actually focus on only a single area of artificial intelligence and ingest a single data type. For example, countless offerings focus on applying machine learning to log data, while others focus on time series data and still others on events. A complete AIOps solution for Service Assurance must be able to ingest logs, events, and performance metrics: all of them, not just one. Also, remember that this ingestion needs to work against real-time, streaming data, not only historical data.
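To make the multi-data-type point concrete, here is a minimal Python sketch of what ingesting logs, events, and metrics through one streaming path could look like. The `Observation` schema and the raw field names (`type`, `ts`, `host`, and so on) are assumptions made up for illustration; they do not come from any particular product.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator


@dataclass(frozen=True)
class Observation:
    """Common record shape shared by all three data types (hypothetical schema)."""
    timestamp: float          # seconds since the epoch
    source: str               # host or service that emitted the record
    kind: str                 # "log", "event", or "metric"
    payload: Dict[str, Any]   # type-specific fields


def normalize(raw: Dict[str, Any]) -> Observation:
    """Map one raw record onto the common shape, keyed on its 'type' field."""
    kind = raw["type"]
    if kind == "log":
        payload = {"message": raw["message"]}
    elif kind == "event":
        payload = {"severity": raw["severity"], "summary": raw["summary"]}
    elif kind == "metric":
        payload = {"name": raw["name"], "value": float(raw["value"])}
    else:
        raise ValueError(f"unsupported record type: {kind!r}")
    return Observation(float(raw["ts"]), raw["host"], kind, payload)


def stream(records: Iterable[Dict[str, Any]]) -> Iterator[Observation]:
    """Normalize records lazily, one at a time, as they arrive."""
    for raw in records:
        yield normalize(raw)


if __name__ == "__main__":
    # Illustrative feed mixing all three data types.
    feed = [
        {"type": "log", "ts": 1, "host": "web-1", "message": "timeout"},
        {"type": "event", "ts": 2, "host": "db-1", "severity": "major",
         "summary": "replica lag"},
        {"type": "metric", "ts": 3, "host": "web-1", "name": "cpu", "value": "0.93"},
    ]
    for obs in stream(feed):
        print(obs.kind, obs.source, obs.payload)
```

Because `stream` is a generator, the same code path can serve a live feed or a historical replay; downstream models see one record shape regardless of where the data originated.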
AVOID: Marketing Messages as Cover for Lack of AI
The term AI has a very broad definition, whereas the term Machine Learning is more focused, and Deep Learning even more so. However, these terms have somehow come to be used interchangeably. They are not interchangeable. Unfortunately, some vendors have capitalized on the AI boom by adding “AI” to their marketing messages or by adding a very small amount of AI functionality to their existing offering so they can claim their solution is an AIOps platform. This is misleading at best and deceptive at worst.
One way you can spot deceptive marketing messages is if the offering requires a lot of manual rules. True Machine Learning solutions should not require a long list of rules to be built and maintained in order to implement the solution. Furthermore, pay attention to the types of machine learning algorithms that are deployed in the solution. If there is only one type of algorithm, such as limited anomaly detection, then chances are the vendor has added a minimal amount of AI capability in an attempt to put marketing ahead of technology.
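The difference between a hand-maintained rule and a learned baseline can be shown with a small Python sketch. Instead of a fixed threshold such as "alert when latency exceeds 200 ms", the hypothetical `RollingAnomalyDetector` below derives its notion of normal from recent data, so nobody has to retune it when the workload shifts. The class name, window size, and threshold are illustrative assumptions, and a rolling z-score is deliberately simple: on its own it is exactly the kind of single, limited algorithm that should not be mistaken for a full AIOps platform.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags values that deviate strongly from a rolling baseline (illustrative)."""

    def __init__(self, window: int = 30, threshold: float = 3.0,
                 min_samples: int = 5) -> None:
        self.values = deque(maxlen=window)  # most recent observations only
        self.threshold = threshold          # allowed deviation, in standard deviations
        self.min_samples = min_samples      # warm-up period before flagging anything

    def observe(self, value: float) -> bool:
        """Return True if value is anomalous relative to the current window."""
        is_anomaly = False
        if len(self.values) >= self.min_samples:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            if sigma > 0:
                is_anomaly = abs(value - mu) / sigma > self.threshold
            else:
                # A perfectly flat baseline: any change at all is a deviation.
                is_anomaly = value != mu
        # The new value joins the baseline either way, so the model keeps adapting.
        self.values.append(value)
        return is_anomaly


if __name__ == "__main__":
    detector = RollingAnomalyDetector()
    for latency_ms in [10, 11, 9, 10, 11, 10, 9, 10, 50]:
        if detector.observe(latency_ms):
            print(f"anomaly: {latency_ms} ms")
```

Note that no absolute limit appears anywhere in the code; the only tunables describe how sensitive the detector is, not what "normal" looks like for a given metric.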
AVOID: Platforms that Don’t Adequately Scale
Scalability is important, especially for AI systems that have strict time constraints. AIOps systems that run primarily against historical data for the purpose of analytics tend to have fewer constraints on response times from the machine learning models. However, if the system is focused on real-time data, such as in a Service Assurance environment, response time becomes very important.
As the business grows, so does the data within the organization. When new customers are brought on, they come with new data and potentially new equipment. As new services are rolled out, new data is generated, and all of this new data must be captured in the AI platform. Sizing the AI platform only for the data set that exists at the outset can quickly result in the system running out of resources, causing response time degradation or, worse, system failure.
Deploying AI within a microservices architecture allows components to scale on demand more easily. In addition, it allows components to be decentralized and scaled individually at the component layer rather than across all components at once.
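A back-of-the-envelope Python sketch shows why component-level scaling matters as data grows. The component names, load figures, and the 20 percent headroom below are all hypothetical; the point is only that each component gets the replica count its own load demands, rather than the whole platform being scaled to match its busiest part.

```python
import math
from typing import Dict


def replicas_needed(load_per_sec: float, capacity_per_replica: float,
                    headroom: float = 0.2) -> int:
    """Replicas required to serve a load while keeping spare headroom.

    headroom is the fraction of each replica's capacity held in reserve.
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable = capacity_per_replica * (1 - headroom)
    # Always keep at least one replica running, even with no load.
    return max(1, math.ceil(load_per_sec / usable))


def plan(loads: Dict[str, float], capacities: Dict[str, float],
         headroom: float = 0.2) -> Dict[str, int]:
    """Size each component independently from its own load and capacity."""
    return {
        name: replicas_needed(loads[name], capacities[name], headroom)
        for name in loads
    }


if __name__ == "__main__":
    # Hypothetical records-per-second figures for three components.
    capacities = {"ingest": 500.0, "scoring": 200.0, "alerting": 1000.0}
    today = {"ingest": 900.0, "scoring": 300.0, "alerting": 50.0}
    after_growth = {name: load * 3 for name, load in today.items()}

    print("today:       ", plan(today, capacities))
    print("after growth:", plan(after_growth, capacities))
```

In this made-up scenario a threefold increase in data raises the ingest and scoring tiers substantially while alerting stays at a single replica, which is the kind of uneven growth a monolithic deployment cannot absorb efficiently.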
Knowing what to avoid when implementing AIOps is just as important as knowing what to look for. At the end of the day, you want a robust platform that operates with various types of data, that does not require significant retooling of your architecture or continual retraining of the algorithms, and that can scale as your data increases. Stay focused on the objectives you want to accomplish with an AIOps platform and insist on real technology, not marketing messages or limited add-ons. These real solutions exist and, once implemented, can make a considerable contribution to your Operations team.