In this special guest feature, Armon Petrossian, CEO & co-founder at Coalesce, discusses the bottlenecks that keep enterprises from completing the data transformations needed to support the continued push from on-premises to cloud platforms. Armon created Coalesce, the data transformation tool built for scale. Previously, Armon was part of the founding team at WhereScape, a leading provider of data automation software (acquired by Idera Software). At WhereScape, Armon served as national sales manager for almost a decade.
Data transformation, the process of preparing data so that it can be consumed properly, has been largely neglected until recently.
This is partly because legacy vendors in the space have focused primarily on moving data from point A to point B (i.e., data integration) rather than on transformations. It makes sense that data integration has taken center stage until now: The ability to access data was number one on businesses’ priority lists. But now that the industry has matured and getting access to data is fairly straightforward, the question becomes: What’s next?
The answer is data transformations. Every business wants to get the most from its data, and it has become clear that data transformations are the biggest bottleneck to becoming data driven.
Let’s take a look at how enterprises have been coping with the data transformation bottleneck, and what possibilities the future might hold.
Companies are struggling to keep up
Currently, companies transform data in two main ways. The first is through GUI (graphical user interface)-based tools that were built primarily for data integration and are not focused on transforming data properly. These widget-based transformations compromise flexibility and do not provide documentation or lineage. The second is through newer, code-centric data transformation tools that offer flexibility but struggle to support enterprise-grade workflows without relying on teams of engineers.
Neither of these strategies can meet the demands of today’s enterprises. While these options may help small companies and startups get off the ground, they can’t scale or meet the long-term data needs of organizations in spaces like insurance, finance, manufacturing, and healthcare. Large enterprises have the most complex needs for transforming data, and it is a bottleneck that has gone largely unsolved.
The process of getting value from raw data is painfully time-consuming and requires a disproportionate amount of resources compared with other parts of the analytics stack. This is where the bottleneck lies.
If an organization wants to take on additional projects, or accelerate an existing data project, this process will demand even more resources. Within the current paradigm, the only way for companies to fast-track projects is by bringing on more data engineers, architects, and/or analysts. This is costly at best, and may even be impossible due to the critical talent shortage affecting data teams everywhere.
Companies are also employing the aforementioned code-centric data transformation tools to accelerate their initiatives, but the benefits of these tools are often offset by the skill sets required to use them. Any way you look at it, transforming data is still tedious and difficult to scale.
The role of automation
Every enterprise is looking for more talent to add to its data team. Simultaneously, expectations from the business are only growing when it comes to data. There is mounting pressure on IT teams to figure out how to do things more accurately and efficiently without requiring more talent.
Automating data transformations is a vital step in making this possible. Businesses need to be thinking about how to automate these processes in the same way that they’ve automated the rest of the analytics stack.
Automation has already reshaped many facets of data: Snowflake’s automated performance tuning revolutionized database administration; Fivetran’s data integration tools made it incredibly easy to move data; BI tools like Tableau empowered anyone to build a dashboard. Enterprises should apply this same thinking to data transformations: Where can we use automation to make transforming data as easy as possible while also supporting enterprise scale?
Efficiency for everyone
In addition to automating data transformations, enterprises need tools that are flexible and allow data teams to work in the way that’s most effective for them.
For example, many companies would historically transform data via GUI-based tools. Although these legacy tools featured what was seemingly an intuitive interface, we’re now seeing that these GUIs lacked flexibility.
Feeling burned by GUIs in the past, some data teams opted to steer clear of them by using more code-centric tools, thinking this was the only solution. But organizations that have taken this programmatic approach have run into massive issues with time constraints, resource shortages, and governance.
Thankfully, the new era of tools won’t make companies choose between GUI and SQL. GUIs will provide enterprises with the power of automation, scalability, and an intuitive interface that’s easy to interact with, while the ability to code will lend flexibility and the option to customize, edit, and create a standard that scales.
Overcoming the bottleneck of data transformations will require automation and flexible, modern tools. Companies need to be more data driven than ever to keep up in today’s world, and data insights are essential for empowering organizations to make decisions and innovate faster. Between the growing pressure on data teams and the current talent shortage, now is the time for enterprises to adopt solutions that will ease these burdens and propel the business forward.