Juan Sequeda, Principal Scientist at data.world, recently published a research paper, “A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model’s Accuracy for Question Answering on Enterprise SQL Databases.”
He and his co-authors benchmarked LLM accuracy in answering questions over real business data from an insurance company and found that responses to basic queries were accurate only 22% of the time. For intermediate and expert-level queries, accuracy dropped to 0%.
Looking closer, the results are even more shocking. The LLM answered “What are all the premiums that have been paid by policyholders?” incorrectly 74.5% of the time. An even simpler question, “How many policies do we have?”, drew an incorrect response 62.6% of the time.
As enterprises invest in LLMs, these findings paint a bleak picture. Poor accuracy could render LLMs unusable for these tasks, or worse, lead to erroneous analytics, forecasts, and strategic business decisions. There is a lot of data to explore here.