In this special guest feature, Matt David, Editor of the Data School at Chartio, discusses how companies can make data accessible and useful by overcoming a number of important challenges, which opens the door for business users to think and decide like analysts. Matt believes data has become a part of more and more non-data jobs, and he is passionate about making data concepts more easily understood. Chartio is a cloud-based business intelligence and analytics solution that enables everyone to analyze the data from their business applications.
Data is a severely underutilized asset.
In most companies, data sits in a database that’s unclean and rarely queried or analyzed. The data is hard to leverage for the following reasons:
- The Database/SQL
- Bad data
- Aggregations
Oftentimes only a few engineers or data analysts are confident enough to query data to answer business questions, hindering opportunity and efficiency across the organization. Business users are excluded from being able to ask their own questions, instead having to submit tickets to get insights.
In order for organizations to be informed at all levels, they have to shift how they approach data management and they have to get data right. “We see across the board that ease of use and approachability are a critical and frequently overlooked aspect of successful data tooling. Using a B.I. product is challenging and building one that’s easy to use is even harder. The ability to test assumptions, get regular user feedback and quickly understand what is and isn’t working with a new product is critical to its success,” said Janelle Estes, Chief Insights Officer at UserTesting.
Companies can make data accessible and useful by overcoming the following challenges, which opens the door for business users to think and decide like analysts.
The Database / SQL
First, make a read-only replica of your production database so that when you open it up to more users and their queries, you do not impact the performance of your application. Second, SQL is a relatively simple query language. By holding regular or semi-regular internal trainings on how to query your database, you will bring many more people into the fold. People are much more motivated to try to learn if they are doing it with their peers, instead of alone at home.
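As a rough illustration of why read-only access is safe to hand out, here is a minimal Python sketch. It is not a replica setup: it uses a throwaway SQLite file as a stand-in, and the `orders` table and its values are made up for the example. It only shows that a read-only connection can answer questions but cannot change the data.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical stand-in for a replica: a throwaway SQLite file with made-up data.
db_path = os.path.join(tempfile.mkdtemp(), "replica_demo.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
writer.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(10.0,), (25.5,), (4.5,)],
)
writer.commit()
writer.close()

# Open the same file in read-only mode using SQLite's URI syntax (mode=ro).
read_only_uri = Path(db_path).as_uri() + "?mode=ro"
reader = sqlite3.connect(read_only_uri, uri=True)

# Reading works as usual.
total = reader.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Writing is rejected by the database itself.
try:
    reader.execute("DELETE FROM orders")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

reader.close()
print(total, write_blocked)
```

In a real deployment the same idea would be applied to a replica of the production database, using whatever read-only mechanism that database provides.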
Another strategy to overcome the intimidation of writing SQL is to adopt a flexible BI tool that is built for both business users and data experts. Traditional enterprise solutions require sizable teams of data analysts and scientists to set up, so look for a platform that lets any type of user query and visualize data easily. These are the types of BI tools that build data confidence more quickly, rather than having to export tables into Excel before being able to manipulate them.
Bad Data
Companies have a lot of bad data. So even if you train people on how to use SQL, there is still a significant gap between querying a clean table versus a messy one. A few examples of this are:
- Nulls
- Duplicates
- Data types
- Manually entered fields
These will produce unexpected results or potential errors even if the query is syntactically correct. There are three approaches to overcome this. First, you can provide more training on how to detect these issues in the data when querying. Second, you can try and document all the places these issues exist in the data. Or third, you can clean the data prior to providing it to everyone else.
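To make the four problem types concrete, here is a small Python sketch of the kinds of checks a training session might cover. The rows, field names and values are all hypothetical; the point is only that each issue can be detected with a simple check before anyone trusts a result.

```python
from collections import Counter

# Hypothetical rows, as a query might return them; every value is made up.
rows = [
    {"id": 1, "email": "ann@example.com", "amount": "19.99", "state": "CA"},
    {"id": 2, "email": None, "amount": "5.00", "state": "ca"},
    {"id": 3, "email": "bob@example.com", "amount": "n/a", "state": "Calif."},
    {"id": 3, "email": "bob@example.com", "amount": "n/a", "state": "Calif."},
]


def is_number(text):
    """Return True if the value can be read as a number."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


# Nulls: rows where a field we rely on is missing.
null_emails = [r["id"] for r in rows if r["email"] is None]

# Duplicates: ids that appear more than once.
id_counts = Counter(r["id"] for r in rows)
duplicate_ids = [i for i, n in id_counts.items() if n > 1]

# Data types: amounts stored as text that is not actually numeric.
bad_amounts = [r["id"] for r in rows if not is_number(r["amount"])]

# Manually entered fields: the same state spelled several ways.
state_spellings = sorted({r["state"] for r in rows})

print(null_emails, duplicate_ids, bad_amounts, state_spellings)
```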
Go the cleaning route. Education and documentation are both important, but you have to get the sequence right, and clean data takes priority. By cleaning first, you eliminate the repetitive checks you’d otherwise have to run before each query to ensure your metrics are accurate. This is the best way forward. Once your data is dummy-proof and clean, education and documentation become more powerful and evergreen, as they are based on accurate information.
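A cleaning step can be sketched in a few lines of Python. This is only one possible approach under stated assumptions: the rows are made up, the `STATE_ALIASES` mapping is hypothetical, and keeping the first row for each id is a choice you would need to confirm against your own data.

```python
# Hypothetical raw rows; every value is made up for illustration.
raw_rows = [
    {"id": 1, "amount": "19.99", "state": "CA"},
    {"id": 2, "amount": " 5.00 ", "state": "ca"},
    {"id": 3, "amount": "n/a", "state": "Calif."},
    {"id": 3, "amount": "n/a", "state": "Calif."},
]

# Hypothetical mapping from hand-typed spellings to one canonical value.
STATE_ALIASES = {"ca": "CA", "calif.": "CA"}


def clean(rows):
    """Return de-duplicated rows with numeric amounts and canonical states."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen_ids:
            continue  # keep only the first row for each id
        seen_ids.add(row["id"])

        # Convert text amounts to numbers; unreadable values become None.
        try:
            amount = float(row["amount"])
        except ValueError:
            amount = None

        # Map known spellings to the canonical value; leave unknown ones as-is.
        key = row["state"].strip().lower()
        state = STATE_ALIASES.get(key, row["state"])

        cleaned.append({"id": row["id"], "amount": amount, "state": state})
    return cleaned


clean_rows = clean(raw_rows)
for row in clean_rows:
    print(row)
```

Once a step like this runs before the data is shared, nobody downstream has to repeat these checks in each query.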
Aggregations
The last step to making data accessible is to make sure people understand the data they are querying. A 2018 quote from Jonathan Rosenberg, former SVP of products at Google, still rings true: “Data is the sword of the twenty-first century, those who wield it well, the samurai.” In most cases, data will be aggregated in some way because of the sheer amount being analyzed. You may have a million rows of transactions, but what you want is the sum of the amounts or the average. At first this seems easy enough, but without digging deeper people can come to false conclusions.
Aggregations mean compressing all of that data into a single value and that value can be quite misleading. For instance, averages can hide the fact that the values are not normally distributed.
Suppose the values cluster in two groups, one low and one high. Do we really think the average, which would fall in the middle, accurately represents the underlying data?
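A quick Python sketch shows how this can happen. The amounts below are made up: most are small and a few are large, so the average lands in a range where no actual value sits.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical order amounts: six small orders and four large ones.
amounts = [10, 12, 11, 9, 10, 11, 95, 100, 98, 102]

avg = mean(amounts)
mid = median(amounts)

# A crude histogram: count values in buckets that are 25 wide.
buckets = Counter((a // 25) * 25 for a in amounts)

# How many real values sit within 20 of the average?
near_avg = [a for a in amounts if abs(a - avg) <= 20]

print("average:", avg)
print("median:", mid)
for start in sorted(buckets):
    print(f"{start}-{start + 24}: {'#' * buckets[start]}")
print("values near the average:", near_avg)
```

Here the average falls in a bucket that holds no orders at all, which is exactly the kind of detail a single number hides.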
To overcome this, companies need to educate universally, which means training business users the way you would train an analyst. Teach everyone to look at the distribution behind every aggregation. It is also important to look at aggregations over time. Being an informed company goes beyond just accessing data; it’s about understanding its narrative and being able to define the story the data is telling you.
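Looking at an aggregation over time can be sketched the same way. In this hypothetical Python example the sales figures are made up; the overall average looks steady, while the monthly averages show a clear decline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (month, amount) sales records; every value is made up.
sales = [
    ("2024-01", 100), ("2024-01", 120),
    ("2024-02", 90), ("2024-02", 70),
    ("2024-03", 40), ("2024-03", 60),
]

# One number for the whole period.
overall_avg = mean(amount for _, amount in sales)

# The same aggregation, computed separately for each month.
by_month = defaultdict(list)
for month, amount in sales:
    by_month[month].append(amount)
monthly_avg = {month: mean(values) for month, values in sorted(by_month.items())}

print("overall average:", overall_avg)
for month, value in monthly_avg.items():
    print(month, value)
```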
And it is important to segment the aggregation.
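Segmenting works like this in a minimal Python sketch. The plan names and amounts are hypothetical; the overall average describes neither group, while the per-segment averages describe both.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (plan, amount) orders; plan names and values are made up.
orders = [
    ("free", 5), ("free", 7), ("free", 6),
    ("enterprise", 200), ("enterprise", 220),
]

# One number across all customers.
overall_avg = mean(amount for _, amount in orders)

# The same aggregation, split by segment.
by_plan = defaultdict(list)
for plan, amount in orders:
    by_plan[plan].append(amount)
segment_avg = {plan: mean(values) for plan, values in by_plan.items()}

print("overall average:", overall_avg)
for plan, value in segment_avg.items():
    print(plan, value)
```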
After looking into an aggregation in these ways you can be much more confident in using the single number to communicate an insight.
Conclusion
Making data accessible to everyone in your company enables all business users to get the answers to their questions without burdening data or IT teams. Setting up a data program requires some technical implementation and education, but the payoff is worth it. You’ll become a company capable of making informed decisions at every level, across every business unit. Get your data into a read-only database, preferably attach a BI tool, and conduct trainings on how to use SQL and how to avoid analysis mistakes.
Your employees want to ask business questions to do great work. Help make it easy for them to do so.