Study Finds Data Quality is Still the Largest Obstacle for Successful AI and Greater Human Expertise Needed Across ML Ops Lifecycle

iMerit, a leading artificial intelligence (AI) data solutions company, released its 2023 State of ML Ops report, which includes a study outlining the impact of data on wide-scale, commercial-ready AI projects. The study surveyed AI, ML, and data practitioners across industries and found a growing need for better data quality, along with human expertise and oversight, in delivering successful AI. This is especially true as powerful new generative AI tools and continuous improvements in automation are rolled out at an increasingly rapid pace.

The world of AI has changed dramatically over the past year. It has moved out of the lab and into a phase where deploying large-scale commercial projects is a reality. The study shows that true experts in the loop are needed not only at the data phase but at every phase of the ML Ops lifecycle. The world’s most experienced AI practitioners understand that companies turning to human experts in the loop achieve greater efficiency, better automation, and stronger operational performance, which leads to better commercial outcomes for AI in the future.

“Quality data is the lifeblood of AI and it will never have sufficient data quality without human expertise and input at every stage,” said Radha Basu, founder and CEO, iMerit. “With the acceleration of AI through large language models and other generative AI tools, the need for quality data is growing. Data must be more reliable and scalable for AI projects to be successful. Large language models and generative AI will become the foundation on which many thin applications will be built. Human expertise and oversight are a critical part of this foundation.”

The report highlights survey findings in four key areas: 

  • Data Quality is the Most Important Factor for Successful Commercial AI Projects – Three in five AI/ML practitioners consider higher-quality data more important than larger volumes of data for achieving successful AI. Practitioners also found that accurate and precise data labeling is crucial to realizing ROI.
  • Human Expertise is Central to the AI Equation – 96% of survey respondents indicated that human expertise is a key component of their AI efforts. 86% of respondents said that human labeling is essential and that they are using expert-in-the-loop training at scale within existing projects. Automated data labeling is growing in popularity, but there is still a need for human oversight: the report finds that, on average, 42% of automated data labeling requires human intervention or correction.
  • Data Annotation Requirements are Increasing in Complexity, which Increases the Need for Human Expertise and Intervention – A large majority of respondents (86%) indicated that subjectivity and inconsistency are the primary challenges for data annotation in any ML model. Another 82% reported that scaling would not be possible without investing in both automated annotation technology and human data-labeling expertise, and 65% stated that a dedicated workforce with domain expertise is required to produce AI-ready data.
  • The Key to Commercial AI is Solving Edge Cases with Human Expertise – Edge cases consume a large share of practitioners’ time: the report finds that 37% of AI/ML practitioners’ time is spent identifying and resolving edge cases, and 96% of survey respondents stated that human expertise is required to solve them.
