Unlocking AI’s Potential: How to Build High-quality Data Foundations

“Garbage in, garbage out.” In the rapidly growing field of artificial intelligence (AI), this adage has never been more pertinent. As organizations explore AI to drive innovation, support business processes, and improve decision-making, the nature of the AI’s underlying technology and the quality of the data feeding the algorithm dictate its effectiveness and reliability. This article examines the critical relationship between data quality and AI performance, highlighting why exceptional AI cannot exist without excellent data and providing insights into how businesses should prioritize and handle data for optimal AI implementation.

AI is forcing many companies to evolve and rethink how they govern and analyze data. A global Gartner survey of 479 top executives in data and analytics roles reveals that 61% of organizations are reassessing their data and analytics (D&A) frameworks due to disruptive AI technologies, and 38% of these leaders anticipate a complete overhaul of their D&A architectures within the next 12 to 18 months to stay relevant and effective in the evolving landscape.

Ensuring good data quality is paramount during any AI adoption journey and when building products underpinned by AI technologies, especially when generating actionable insights from the data. Good data is accurate, complete, and well-structured; comes from a reliable source; and is regularly updated to remain relevant. In fast-changing environments, the absence of this quality or consistency can lead to poor outputs and, in turn, compromised decisions.

The quality of the data during initial model training determines the model’s ability to detect patterns and generate relevant, explainable recommendations. By carefully selecting and standardizing data sources, organizations can enhance AI use cases. For example, when AI is applied to managing the performance of IT infrastructure or improving an employee’s digital experience, feeding the model with specific data – such as CPU usage, uptime, network traffic, and latency – enables accurate predictions about whether technologies are operating in a degraded state or user experience is being impacted. In this case, the AI analyzes data in the background, and preemptive fixes are applied without negatively impacting the end user, leading to better relationships with work technology and a more productive day.

This example of predictive maintenance uses Machine Learning (ML), a type of AI that builds models that learn from data and make predictions, giving technical support teams early insight into emerging issues. This predictive approach enables proactive issue resolution, minimizes downtime, and enhances operational efficiency.
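
To make the idea concrete, here is a minimal, purely illustrative sketch of this kind of telemetry-based anomaly detection, written in Python with scikit-learn over synthetic metric values; it is not any vendor’s actual model, and the thresholds and metrics are assumptions for the example:

```python
# Minimal sketch: flagging degraded endpoints from basic telemetry.
# Hypothetical data and parameters; not any vendor's actual implementation.
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Simulated "healthy" telemetry: CPU %, latency (ms), network traffic (MB/s).
telemetry = pd.DataFrame({
    "cpu_usage": rng.normal(40, 10, 1000).clip(0, 100),
    "latency_ms": rng.normal(30, 8, 1000).clip(1, None),
    "network_mbps": rng.normal(5, 2, 1000).clip(0, None),
})

# Train an unsupervised model on historical behaviour, then score new samples;
# a -1 prediction suggests the endpoint is operating in a degraded state.
model = IsolationForest(contamination=0.02, random_state=0)
model.fit(telemetry)

new_sample = pd.DataFrame([{"cpu_usage": 97.0, "latency_ms": 180.0, "network_mbps": 0.2}])
if model.predict(new_sample)[0] == -1:
    print("Endpoint likely degraded - queue a preemptive fix for review")
```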

Unfortunately, not all organizations have access to reliable data to build accurate, responsible AI models. Poor data quality affects 31% of firms, according to a recent ESG whitepaper on IT-related AI model training, highlighting the critical need for robust data verification processes. To address this challenge and build trust in data and AI implementations, organizations must prioritize regular data updates.

High-quality data should be error-free, obtained from reliable sources, and validated for accuracy. While incomplete data and/or inconsistent input methods can lead to misleading recommendations, the impact of poor data can also be felt in further AI implementation challenges such as high operational costs (30%) and difficulties in measuring ROI or business impact (28%).
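
As a rough illustration of what validation can mean in practice, the sketch below runs a few basic completeness, duplication, and freshness checks over a small, made-up telemetry table; the column names and the seven-day freshness threshold are assumptions for the example, not a standard:

```python
# Illustrative data-quality checks over a hypothetical telemetry table.
import pandas as pd

def quality_report(df: pd.DataFrame, max_age_days: int = 7) -> dict:
    """Return simple completeness, duplication, and freshness metrics."""
    now = pd.Timestamp.now(tz="UTC")
    return {
        "rows": len(df),
        "missing_ratio": float(df.isna().mean().mean()),   # completeness
        "duplicate_rows": int(df.duplicated().sum()),       # consistency
        "stale_rows": int(((now - df["collected_at"]) > pd.Timedelta(days=max_age_days)).sum()),  # freshness
    }

records = pd.DataFrame({
    "endpoint": ["a", "b", "b", "c"],
    "cpu_usage": [35.0, None, 88.0, 42.0],
    "collected_at": pd.to_datetime(
        ["2024-06-01", "2024-06-01", "2024-06-01", "2024-01-01"], utc=True
    ),
})
print(quality_report(records))
```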

Concerningly, AI processes any data it is given but cannot discern quality. Here, sophisticated data structuring practices and rigorous human oversight (also called “human in the loop”) can plug the gap and ensure that only the highest quality data is used and acted upon. Such oversight becomes even more critical in the context of proactive IT management. While ML, supported by extensive data collection, can boost anomaly detection and predictive capabilities in, for example, a tech support situation, it is human input that ensures actionable and relevant insights.
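
One simple way to picture “human in the loop” is a triage step between model output and automated action: only high-confidence findings trigger an automatic fix, while everything else is routed to a person. The sketch below is illustrative only; the confidence threshold and finding fields are invented for the example:

```python
# Minimal human-in-the-loop triage sketch (illustrative only).
from dataclasses import dataclass

@dataclass
class Finding:
    endpoint: str
    issue: str
    confidence: float  # model's confidence that the anomaly is real, 0-1

AUTO_FIX_THRESHOLD = 0.95  # hypothetical policy threshold

def triage(findings):
    """Split findings into automatic fixes and items needing human review."""
    auto_fix, needs_review = [], []
    for f in findings:
        (auto_fix if f.confidence >= AUTO_FIX_THRESHOLD else needs_review).append(f)
    return auto_fix, needs_review

findings = [
    Finding("laptop-0042", "disk nearly full", 0.99),
    Finding("laptop-0107", "unusual network traffic", 0.71),
]
auto, review = triage(findings)
print(f"{len(auto)} fixes applied automatically, {len(review)} sent to IT for review")
```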

Most enterprise IT vendors are introducing some level of AI into their solutions, but the quality and range of data used can differ significantly. Great AI does not just come from collecting data from multiple endpoints more frequently but also from how that data is structured.

An AI specifically designed for IT operations demonstrates this effectively. For example, one such product might analyze and categorize performance data collected from more than 10,000 endpoints using more than 1,000 sensors every 15 seconds. At this scale, ML can efficiently detect anomalies and proactively predict future outages or IT issues, while simultaneously enhancing employee productivity and satisfaction.

By pairing this vast dataset with ML – specifically a large language model – IT teams can also efficiently handle large-scale queries using natural language. Examples include analyzing average Microsoft Outlook usage or identifying employees who are not using expensive software licenses that were rolled out organization-wide regardless of whether each employee actually needed them. In effect, the AI becomes a trusty copilot for technology teams, from C-level leaders and IT support agents to systems engineers.
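
Under the hood, a natural-language question such as “which employees hold an expensive license they never use?” ultimately resolves to a fairly ordinary aggregation over usage data. The sketch below shows that kind of query directly in pandas; the column names and the 90-day threshold are assumptions for illustration, not a real product schema:

```python
# Illustrative aggregation behind an "unused licenses" question.
import pandas as pd

usage = pd.DataFrame({
    "employee": ["ana", "ben", "cho", "dee"],
    "app": ["cad_suite"] * 4,
    "licensed": [True, True, True, True],
    "days_since_last_use": [2, 145, 98, 400],
})

# Licensed employees who have not opened the application in 90+ days.
unused = usage[usage["licensed"] & (usage["days_since_last_use"] > 90)]
print(unused[["employee", "app", "days_since_last_use"]])
```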

Buyers need to prioritize AI-driven software that not only collects data from diverse sources but also integrates it consistently, ensuring robust data handling and structural integrity. Depth, breadth, history, and quality of the data all matter during vendor selection.

As AI continues to evolve, a foundation of high-quality data remains crucial for its success. Organizations that effectively collect and manage their data empower AI to enhance decision-making, improve operational efficiency, and drive innovation. Conversely, neglecting data quality can severely compromise the integrity of AI initiatives. Moving forward, organizations must diligently collect and structure vast amounts of data to unleash the full potential of their AI implementations.

About the Author

Chris Round is Senior Product Manager at Lakeside Software, the only AI-driven digital experience (DEX) management platform. With a strong technical background in the end-user computing space from previous roles at BAE Systems Applied Intelligence and Sony Mobile Communications, plus a natural ability to manage business relationships, he is responsible for understanding customer problems and matching them with solutions.
