The Shift to NLP-powered BI Will Unlock Its True Potential for Business Users

With thousands of digital-first companies amassing terabytes of data, there’s an urgent need to empower marketing, sales, and revops teams to derive the insights they need to conceive, launch, and measure the crucial initiatives that drive revenue. They need access to that data in a way that doesn’t require a degree in computer science. In response, numerous technology vendors focused on data analytics and business intelligence have emerged to provide operational teams with tools they need to make sense of and derive meaningful insights from that data.

Although the first mention of business intelligence dates back to 1865, it became widely adopted only in the 1970s, after Edgar Codd invented the relational database and Chamberlin and Boyce developed SQL. Later, for performance reasons, data had to be stored in OLAP cubes. BI was mostly used for business-critical data such as finance, and slicing and dicing those cubes required a lot of technical expertise.

In the 90s and early 2000s, companies such as Qlik and Tableau introduced new visual interfaces for creating reports and charts. Data started to be used more in other departments, mostly for strategic numbers, and so Balanced Scorecards and KPIs were introduced as important management tools. And, for every new group of metrics, a new dashboard was introduced—over and over again.

Dashboard creep

Despite their rise in popularity, dashboards remain technically challenging and time-consuming to create. It’s a task reserved for experts who understand the data model and are responsible for translating specific business team requests into simplified dashboards or simplified table views. Because dashboards are typically created for a specific purpose, they offer a relatively static, passive view into data and are constrained to a particular need that arose at a specific point in time.

All this effort is tolerable when data analytics are needed for overall strategic purposes. After all, creating a new dashboard may involve an important discussion about which metrics the company should focus on in a given month, quarter, or year. But as data becomes ubiquitous and its use extends beyond strategy into daily operations, the speed and ease with which users can interact with data and find answers become even more critical.

And it is here that dashboards struggle to meet the needs of modern business users, especially as businesses strive to be agile and respond to a highly dynamic business environment. Dashboards are limited, so business teams need to request new charts way too often to address a new series of constantly changing questions.

As a result, data teams need to create more and more dashboards. To save time as support ticket response times balloon, they start investing in ever more sophisticated dashboards, with filters and functionality for a wider array of cases. Eventually, the enterprise amasses thousands of reports and dashboards, many of which are soon outdated, and all that time and effort becomes a colossal waste of resources. Is there no better way?

The information-complexity trade-off

Before we can discuss a solution, we need to get to the root of the problem: the nature of information itself. Unlike traditional natural resources, information grows exponentially, virtually without an end, limited only by storage (and even that is cheap). And while data is clearly a powerful, valuable resource, the question is: how do you use it? How do mere mortals interface with and derive value from something as vast and ineffable as data? How do you empower (rather than overwhelm) users with a fast and easy way to access the information that they need without distracting them with all the surrounding noise?

This is an endless struggle, and one that designers, engineers, and product managers face daily. They must either limit the amount of information (and power) that they give users, or risk overwhelming people with more information than they can absorb. Think about how few news articles appear on the front page of a newspaper as opposed to the mass of classified ads, or how few buttons a Tesla has as opposed to an Airbus A380 cockpit. That’s the information-complexity trade-off: an interface to information can’t be both simple and powerful at the same time. When it comes to business intelligence, if you want to provide powerful, flexible access to the entire database, you will end up with an absurd number of drop-downs, filters, charts, and text fields, which is anything but simple.

But is this always true? Or can we empower users to access information without overwhelming them?

Ask me a question — any way you like

Questions are the most natural way for us to request information. Questions are powerful: they have been one of the most important drivers of progress in human history. Yet they are simple. Everyone knows how to ask questions; it is how we evolved to think about the world and learn new things.

Questions, and more broadly natural language, are the only paradigm that overcomes the information-complexity trade-off. That’s why search engines came to dominate the internet, providing the most powerful way to browse the seemingly infinite World Wide Web, and that’s why the next generation of business intelligence technologies will be based on questions.

People should be able to ask their database a question the same way they would normally talk to their data team.

Unfortunately, that is no easy task. The system needs to be able to deal with what makes natural language so natural—its vagueness, ambiguity, its implicit meaning—and still understand our intent.

And understanding the true intent behind a question is a hard nut to crack. Sometimes, it is impossible. If someone asks for the percentage of Brazilian customers with more than two orders this year, does she mean “among all the Brazilian customers, what share had more than two orders?” Or “of those that had more than two orders, what share are Brazilian?” Or it could even be “of all our customers, what percentage are Brazilian and have more than two orders?”
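To make the ambiguity concrete, here is a minimal Python sketch that computes all three readings of that question. The customer records, field names, and the `share` helper are invented purely for illustration and do not come from any real system; the point is only that one English sentence maps to three different calculations with three different answers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str
    country: str
    orders_this_year: int


# Hypothetical sample data, made up for this illustration.
customers = [
    Customer("Ana", "Brazil", 3),
    Customer("Bruno", "Brazil", 1),
    Customer("Carla", "Brazil", 5),
    Customer("Dieter", "Switzerland", 4),
    Customer("Emma", "Switzerland", 0),
    Customer("Farid", "Germany", 3),
    Customer("Greta", "Germany", 7),
    Customer("Hugo", "Germany", 2),
]


def share(part, whole):
    """Return len(part) as a percentage of len(whole); 0.0 if whole is empty."""
    return 100.0 * len(part) / len(whole) if whole else 0.0


brazilian = [c for c in customers if c.country == "Brazil"]
frequent = [c for c in customers if c.orders_this_year > 2]
brazilian_and_frequent = [c for c in brazilian if c.orders_this_year > 2]

# Same numerator each time; only the denominator (the "among whom?") changes.
interpretations = {
    "among Brazilian customers": share(brazilian_and_frequent, brazilian),
    "among customers with more than two orders": share(brazilian_and_frequent, frequent),
    "among all customers": share(brazilian_and_frequent, customers),
}

for reading, percentage in interpretations.items():
    print(f"{reading}: {percentage:.1f}%")
```

On this toy data the three readings give roughly 66.7%, 40.0%, and 25.0%, so a system that silently picks one of them can be off by a wide margin from what the user meant.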

From a product perspective, it’s tempting to solve this problem by requiring the user to learn a specific syntax. For example: “PERCENTAGE OF (brazilian customers with more than 2 orders) AMONG (all customers).” That’s much easier for the system to parse, so if you just provide a syntax guide and some user training videos, the mission is accomplished, right?

Well, no—not if you want to simplify things for users and allow anyone to access data. If you’re not careful, you soon end up requiring users to structure their questions in what effectively starts to resemble a “structured query language” that users must master to get the results they want.

And then there’s another problem we need to keep in mind. Employees have plenty of other business priorities and tasks besides data analytics. Time is everything and if they can’t figure out how to get the answers they want fast, they won’t be happy with the results. Pretty soon, that means no users.

Throughout our development process, by interacting with users, learning from trial and error, and experimenting with competing solutions, we came to the conclusion that there are four main principles that a natural language BI tool must follow to succeed in the long term:

  1. It must be designed to understand the users’ way of asking, while helping them formulate their questions in a more precise way without compromising the natural aspect of language.
  2. It must not assume there is always one way to understand the question, but instead allow the user to interactively get to the true meaning behind the question.
  3. It needs to explain to the user what it understood in a way that is precise but understandable.
  4. It must guide the user through the data, giving context and helping the user become more knowledgeable about the data available and what questions can (or cannot) be answered.

Any other approach to the future of business intelligence will result in a solution that lacks simplicity, power, or both. One approach is to build a solution natively around a natural language interface (not as a non-optimized add-on to some existing data product), so that anyone can get real-time access to data and valuable insights from it in seconds, with no coding and no frustrating dashboard requests to the data team.

What changes when you shift to NLP?

The biggest challenges for the adoption of question-based BI are still organizational. The common perception is that most employees are not data literate and, since they are not able to analyze and interpret data correctly, there must be someone who performs the task of summarizing the most important points. Given the limitations of traditional tools, companies have structured themselves with analysts that do the work of an English-to-SQL translator—effectively, rewriting requests from business teams into database queries. The processes revolve around preparing dashboards and serving the business with the answers they need.

This is not only inefficient but also underestimates the skills of people on the front lines, who are able to come up with critical questions that help the company leap forward.

With the adoption of analytics based on conversational interfaces, companies can restructure themselves into a more efficient and productive organization. Instead of having data teams work as English-to-SQL translators, it becomes possible to manage them in the same way as product teams, building self-service analytics products that scale across the organization. Instead of answering data requests on a one-off basis, they can put their effort into modelling the semantics of the data at a higher abstraction layer. This way, everything they do can be reused, and they create value for the company on a long-term basis.
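As a rough illustration of what modelling the semantics of the data at a higher abstraction layer could look like, the Python sketch below defines a few business terms once and reuses them to generate queries. All names here (`Metric`, `Dimension`, `build_query`, and the `orders` table with its columns) are hypothetical and chosen for this example; this is not a description of how any particular product works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A business measure, defined once by the data team."""
    name: str
    sql_expression: str
    table: str


@dataclass(frozen=True)
class Dimension:
    """A way of grouping a metric, mapped to a physical column."""
    name: str
    column: str


# Hypothetical definitions; the table and column names are assumptions.
METRICS = {
    "revenue": Metric("revenue", "SUM(amount)", "orders"),
    "order_count": Metric("order_count", "COUNT(*)", "orders"),
}
DIMENSIONS = {
    "country": Dimension("country", "customer_country"),
    "month": Dimension("month", "order_month"),
}


def build_query(metric_name: str, dimension_name: str) -> str:
    """Combine one reusable metric and one dimension into a SQL string."""
    metric = METRICS[metric_name]
    dimension = DIMENSIONS[dimension_name]
    return (
        f"SELECT {dimension.column} AS {dimension.name}, "
        f"{metric.sql_expression} AS {metric.name} "
        f"FROM {metric.table} "
        f"GROUP BY {dimension.column}"
    )


print(build_query("revenue", "country"))
print(build_query("order_count", "month"))
```

In this sketch, two metrics and two dimensions already cover four distinct questions without anyone writing a new query by hand, which is the kind of reuse that one-off dashboard requests do not provide.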

The increased simplicity of such interfaces also enables a shift in the skills that companies require. Instead of looking for people who understand SQL and are technically inclined, they can focus on the skills that really matter for an analyst, such as asking great questions and drawing correct conclusions from data. The pool of professionals who fulfill these requirements is much larger (including people from various fields in the humanities) and contributes to a much more productive and diverse company.

Finally, as AI and NLP technology evolves, natural-language interfaces to data will become far more powerful, and there will be fewer use cases where traditional interfaces remain superior. One reason is the information-complexity trade-off mentioned above. Another is that, instead of being static interfaces, these systems will be able to learn from the questions we ask and from our data, and to understand the underlying business domain increasingly well. They will suggest which questions we should be asking and be able to answer more high-level questions, presenting us with a summary of the most important information.

With time, the questions we ask of these systems will evolve to cover the whole spectrum of descriptive (‘what happened’), diagnostic (‘why it happened’), predictive (‘what will happen next’) and prescriptive (‘what should I do about it’) insights. Now that’s a really useful BI system.

About the Author

Marcos Monteiro is CEO and co-founder of Veezoo. Born and raised in Rio de Janeiro, Marcos moved to Zurich to study mathematics and statistics at ETH Zurich. He specialized in computational statistics and artificial intelligence, researching how the brain encodes information in the primary visual cortex (V1), and graduated with distinction. Together with João Pedro Monteiro and Till Haug, he founded Veezoo AG with the ambition to make business-critical data easily accessible.
