Beyond CRUD – Why Data-Driven Insight is Taking its Next Leap Forward

In this special guest feature, Darin Briskman, Director of Technology, Imply, discusses how databases built on relational CRUD (create, read, update, delete) evolved from data warehousing into the modern era, and why architectures now need to go beyond CRUD. Darin helps developers create modern analytics applications. He began his career at NASA in the 1980s (ask him about rockets!), and has been working with large and interesting data sets ever since. Most recently, he’s had various technical and leadership roles at Couchbase, Amazon Web Services, and Snowflake. When he’s not writing code, Darin likes to juggle, blow glass (usually not while juggling), and work to help children on the autism spectrum learn to use their special abilities to work better with the neuronormative.

Making better decisions using data-driven insight is now front and center in the race for growth and success. Whether the objective is tracking customer behavior, improving R&D or achieving competitive advantage, organizations all across the world are racing to harness the value of data, investing in people, techniques, and technology. 

The value received from data-driven insight has been steadily increasing. According to Accenture, only 3 of the 10 most valuable enterprises were actively taking a data-driven approach in 2008, but now 7 out of 10 do. Across organizations large and small, the annual growth rate achieved by those who are ‘data driven’ sits at over 30% – over 10 times the average business growth rate in the US.

Getting to this point has been an interesting journey. Fifty years ago, IBM published the foundational model for Relational Database Management Systems (RDBMS), with the familiar tables, rows and columns format still widely used to this day. Further innovation, such as the development of Structured Query Language (SQL), made it much easier to manage CRUD (Create, Read, Update, and Delete) data. This made it more practical to build and maintain large data sets, driving the growth of databases and computing.
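For readers who have not worked with CRUD directly, the four operations can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module and an in-memory database; the `customers` table, its columns, and the function name `crud_demo` are hypothetical and do not come from any system discussed here.

```python
import sqlite3


def crud_demo():
    """Run one Create, Read, Update and Delete cycle on a toy table."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )

    # Create: insert new rows.
    conn.execute("INSERT INTO customers VALUES (?, ?, ?)", (1, "Ada", "London"))
    conn.execute("INSERT INTO customers VALUES (?, ?, ?)", (2, "Grace", "New York"))

    # Read: query the current state of the table.
    query = "SELECT id, name, city FROM customers ORDER BY id"
    after_create = conn.execute(query).fetchall()

    # Update: overwrite a value in place; the old value is gone.
    conn.execute("UPDATE customers SET city = ? WHERE id = ?", ("Paris", 1))

    # Delete: remove a row entirely.
    conn.execute("DELETE FROM customers WHERE id = ?", (2,))

    after_changes = conn.execute(query).fetchall()
    conn.close()
    return after_create, after_changes


if __name__ == "__main__":
    before, after = crud_demo()
    print(before)
    print(after)
```

Note that the table only ever holds the latest state of each row: once a record is updated or deleted, its earlier values are no longer available to query.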

Fast forward to the Internet era and organizations across the world capitalized on ubiquitous connectivity to bring about exponential growth in the collection, storage and management of data. This also revealed the capacity and cost shortcomings of the CRUD processing technologies available at the time.

More recently still, the arrival of cloud computing has broadened the availability of affordable technology infrastructure, bringing with it another step forward in the development and delivery of analytics. In contrast to legacy on-premises strategies, where infrastructure and applications were expensive to implement and scale, the cloud has allowed IT teams to add or remove both compute and storage on demand. The result? Analytics is now both much more scalable and much less expensive, as new offerings from upstart vendors service the increasingly demanding requirements of data-centric organizations the world over.

Cloud computing also enables huge growth in the availability and power of affordable applications that level the playing field so everyone can support huge numbers of users. But this wasn’t enough to enable interactive conversations with high volume data streams from the web, the Internet of things, and other sources.

The answer is to analyze the data stream instead of converting everything to relational CRUD. For this to happen, organizations need a modern database. One such database is Apache Druid®, an open source project that has been adopted across a wide range of use cases by teams looking to analyze streams or a combination of stream data and historical data. Its advantages over traditional approaches, from faster time to production to better productivity and performance, have made Druid a leader for modern analytics applications.
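To make the contrast with CRUD concrete, here is a minimal sketch of aggregating events as they arrive rather than updating rows in place. It is plain Python and is not Apache Druid's API or implementation; the class `StreamRollup`, its methods, and the event fields (timestamp, page, value) are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class StreamRollup:
    """Append-only rollup: events are summarized per time bucket as they arrive."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds
        self.counts = defaultdict(int)    # (bucket_start, page) -> event count
        self.totals = defaultdict(float)  # (bucket_start, page) -> sum of values

    def ingest(self, timestamp, page, value):
        """Fold one event into its bucket; nothing is updated or deleted later."""
        bucket_start = timestamp - (timestamp % self.bucket_seconds)
        key = (bucket_start, page)
        self.counts[key] += 1
        self.totals[key] += value

    def query(self, start, end, page):
        """Return (count, total) for a page over buckets with start <= bucket < end."""
        count, total = 0, 0.0
        for (bucket_start, bucket_page), n in self.counts.items():
            if bucket_page == page and start <= bucket_start < end:
                count += n
                total += self.totals[(bucket_start, bucket_page)]
        return count, total


if __name__ == "__main__":
    rollup = StreamRollup(bucket_seconds=60)
    # Hypothetical events: (seconds since start, page, value)
    for event in [(0, "home", 1.0), (30, "home", 2.0), (61, "home", 4.0), (10, "cart", 8.0)]:
        rollup.ingest(*event)
    print(rollup.query(0, 60, "home"))
    print(rollup.query(0, 120, "home"))
```

Because the data is summarized at ingestion time, a query touches a handful of pre-aggregated buckets instead of scanning and reconciling every individual record, which is the general idea behind answering questions on a stream quickly.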

Looking ahead

So where are we today and what further developments in data-driven insights can we expect to emerge in the near term? Many organizations are seeing a growing need for solutions that can deliver sub-second response times for questions across billions of data points (and for both historic and streaming data). With perhaps hundreds of people asking questions at the same time, concurrent performance is also crucial to deliver the kind of capabilities organizations need.

These capabilities must be delivered via affordable solutions where cost translates into better decision-making. Even though storage and computing still add an important cost base to any data insight solution, they are not as significant as developer time. Technology advances have rapidly reduced the costs of infrastructure and software, but humans aren’t any smarter than we were fifty years ago, so the percentage of IT costs that pays developers and other humans is continually increasing. Solutions that accelerate developer productivity and help leaders make better decisions more quickly deliver value that matters.

In recent decades, data insight has advanced almost beyond recognition. From relational databases’ pioneering impact to modern data warehousing, there is no let-up in the need for innovation. The CRUD approach, which has served as the foundation for data analytics up to this point, is no longer enough on its own, which is why current approaches to stream data need to evolve to meet the needs of organizations in the years ahead.

Granted, there remain relevant uses of analytics with relational CRUD. Many organizations still require quarterly and annual reporting, for instance, a requirement that is well suited to CRUD. But increasingly, teams need to conduct meaningful interactive conversations with data, and attempting this with a CRUD data pipeline simply costs too much and takes too long.

The solution is a new class of real-time analytic databases that mix CRUD and streams for high concurrency and sub-second response times across billions of data points. Organizations implementing these capabilities are already embracing the new era of data-driven insight, powered by technology built from the ground up to meet the growing need for faster, better decisions.
