In order to give our valued readers a pulse on important new trends leading into 2016, we here at insideAI News reached out to all our friends across the big data industry to get their insights and predictions for what may be coming. We were very encouraged to hear such exciting perspectives. Even if only half actually come true, Big Data in the next year is destined to be quite an exciting ride. Enjoy!
Daniel D. Gutierrez – Managing Editor
Analytics
In 2016, more businesses will see that customer success is a data job. Companies that are not capitalizing on data analytics will start to go out of business, and the enterprise will realize that the key to staying in business and growing the business is data refinement and predictive analytics. The combination of social data, mobile apps, CRM records and purchase histories via advanced analytics platforms allows marketers a glimpse into the future by bringing hidden patterns and valuable insights on current and future buying behaviors to light. In 2016, businesses will increasingly leverage these insights to make sure that their current and future products and services meet customer needs and expectations. — Ashish Thusoo, CEO and co-founder of Qubole
As enterprise knowledge bases grow, the implementation of a consistent and effective metadata strategy is becoming a critical aspect of knowledge creation, sharing, distribution and enterprise search. Semantic technology is an important cornerstone of any metadata strategy because of its essential role in various phases of the metadata process. Not only does semantics support creating pragmatic taxonomies based on an analysis of the available content, it also identifies relevant dynamic tags for each piece of content and ensures strong support in the implementation phase through automatic classification. — from 2016: Top 10 Semantic Technology Trends that will Help Speed Time to Business Value, an industry report by Expert System
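To make the tagging-and-classification step concrete, here is a minimal sketch of taxonomy-driven auto-tagging, assuming a simple keyword-overlap score. The taxonomy, sample document and scoring rule are invented for illustration and are not Expert System's actual technology, which relies on far richer semantic analysis.

```python
# Hypothetical taxonomy: each node is a set of indicator keywords.
TAXONOMY = {
    "finance": {"invoice", "payment", "revenue", "budget"},
    "hr": {"hiring", "onboarding", "payroll", "benefits"},
    "engineering": {"deployment", "latency", "cluster", "refactor"},
}

def auto_tag(text, max_tags=2):
    """Score each taxonomy node by keyword overlap; return the top tags."""
    words = set(text.lower().split())
    scores = {tag: len(words & keywords) for tag, keywords in TAXONOMY.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [tag for tag, score in ranked[:max_tags] if score > 0]

print(auto_tag("Quarterly revenue and budget review for the payroll system"))
# -> ['finance', 'hr']
```

In a production metadata pipeline the same interface would sit behind an entity extractor and a learned classifier rather than raw keyword counts.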
One could already argue that the ROI of advanced analytics is highest when applied to targeted, vertical market use cases. This will continue to be the case in 2016 and beyond, with manufacturing – particularly regulated manufacturing – leading the way. Advanced analytics platforms will be increasingly relied on not only to uncover insights that help optimize processes, but also to verify and validate those insights in accordance with regulatory requirements. — Shawn Rogers, Chief Research Officer, Dell Statistica
Got true Big Data analytics? Likely not. Today, most analytics projects start from the wrong place, end too soon, take too long – and still fall short. The reason: they start with available data sources (and some inferences about them) as the primary constraint. The solution: start with the business questions you want to answer, liberating you from artificial constraints. Moving forward, new best practices and more automated tools (such as enterprise-class metadata cataloging) will enable this approach. — Tamr
Prescriptive Analytics will drive optimization to maximize ROI – The true value of analytics will be realized when ROI is maximized by analytics that tell you what to do. Prescriptive analytics will become more mainstream with tools that recommend actions based on Smart Pattern Discovery. — BeyondCore CEO, Arijit Sengupta
Automated personalization will be a critical business benefit that big data analytics begins to deliver in 2016. Companies will continue to seek competitive advantage by adopting new big data technologies and allowing machines to interpret subjective ‘squishy’ data – including human communication cues such as nonverbal behavior, facial expressions, and tone of voice. Big data analytics makes this possible by assimilating vast amounts of information, including the types of data that were too slow and expensive to collect and analyze in the past, such as communications and case records for knowledge workers. As the machines get better at interpreting a variety of data types (so-called ’unstructured’ data) and collating it with vast quantities of structured data, they can begin to improve and accelerate both employee-owned business processes and customer-facing experiences. Employees’ work can be recorded and compared with what is considered ideal, and employees can then receive personalized decision support so they can execute their tasks faster and more effectively. For example, in the customer service function, big data analytics enables a frictionless experience by providing full issue context and history with every point of contact, increasing customer satisfaction. — Lewis Carr, Senior Director Transformation Solutions Marketing at Hewlett Packard Enterprise
Converged Approaches Become Mainstream. For the last few decades, the accepted best practice has been to keep operational and analytic systems separate, in order to prevent analytic workloads from disrupting operational processing. HTAP (Hybrid Transaction / Analytical Processing) was coined in early 2014 by Gartner to describe a new generation of data platforms that can perform both online transaction processing (OLTP) and online analytical processing (OLAP) without requiring data duplication. In 2016, we will see converged approaches become mainstream as leading companies reap the benefits of combining production workloads with analytics to adjust quickly to changing customer preferences, competitive pressures, and business conditions. This convergence speeds the “data to action” cycle for organizations and removes the time lag between analytics and business impact. — MapR CEO and Cofounder, John Schroeder
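The pattern Schroeder describes is easy to see in miniature. In the sketch below, SQLite stands in for a true HTAP platform (which would serve both workloads at enterprise scale): transactional writes and an analytical aggregate run against the same live table, with no ETL copy in between.

```python
# One store, two workloads: OLTP-style writes and an OLAP-style aggregate.
# SQLite is only a stand-in for a real HTAP platform.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# Transactional side: individual orders land as they happen.
with conn:
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("east", 120.0), ("west", 75.5), ("east", 42.0)],
    )

# Analytical side: aggregate the same live table, no data movement.
for region, total in conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
):
    print(region, total)   # east 162.0 / west 75.5
```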
Modern analytic workloads require modern analytic databases, and IT leaders who’ve been clinging to familiar DBMSs—over half of whom are already feeling the strain—will openly recognize this in 2016. Oracle, Teradata and Netezza are irretrievably broken and it shows (this is exemplified by Oracle missing a succession of earnings estimates over the past year). In 2016, we’ll see that: (i) the “big data” database wars will end with columnar analytic databases winning; (ii) the gates will be thrown open on migration of enterprise BI to Hadoop; (iii) IT leaders will finally give up on Impala and Hive, embracing the 7 Stages of Grief and accepting that they just aren’t working for production workloads. — Actian CTO Mike Hoskins
Simply put, the term hyper-distributed data environments describes the large volume of data within an organization’s environment and the wide array of locations in which that data exists (i.e., distribution). In almost every industry, data is being created in places it never has been before, producing these “hyper-distributed data environments.” But data itself is no longer the number one problem; connecting to that data is. It is becoming increasingly difficult to reach and secure that data, much less draw insight from it and enable a person or process to act on it. For example, data created by a retailer’s in-store video camera can be highly beneficial for learning about customer behavior and buying preferences in real time. Yet a store employee can only act to help influence a purchase if he or she is empowered with insight while customers are in the store. To overcome this challenge, organizations need to add edge analytics to their existing strategy, analyzing data close to its source instead of sending it to a central place for analysis. — Mike Flannagan, Vice President, Data and Analytics, Cisco
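As a sketch of that edge-analytics pattern, the snippet below reduces a burst of raw readings to a compact summary for the central platform, plus local alerts an employee could act on immediately. The per-frame "dwell" scores and the alert threshold are hypothetical.

```python
from statistics import mean

def analyze_at_edge(readings, alert_threshold=0.8):
    """Summarize raw values locally; flag anything needing immediate action."""
    summary = {"count": len(readings), "mean": mean(readings), "max": max(readings)}
    alerts = [r for r in readings if r >= alert_threshold]
    return summary, alerts

raw = [0.12, 0.31, 0.95, 0.22, 0.87]   # hypothetical per-frame dwell scores
summary, alerts = analyze_at_edge(raw)
print(summary)   # small payload forwarded to the central platform
print(alerts)    # acted on in-store, while the customer is still there
```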
Organizations will opt for database technologies that can provide analytics at the same speed as their main business – that can not only process large volumes of transactions at extremely low latencies, but also allow for in-memory analysis and instantaneous decision making. — Leena Joshi, VP of Product Marketing, Redis Labs
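Below is a minimal sketch of that kind of low-latency, in-memory analytics using the redis-py client. It assumes a Redis server on localhost:6379; the key names and metrics are illustrative.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def record_purchase(sku, amount_cents):
    """Update live counters in the same in-memory store serving the app."""
    r.incr(f"purchases:{sku}")                     # transactions per SKU
    r.incrby("revenue:total_cents", amount_cents)  # running revenue total

record_purchase("sku-42", 1999)
record_purchase("sku-42", 2499)

# Instantaneous reads for a dashboard or an in-flight decision:
print(r.get("purchases:sku-42"))     # -> "2"
print(r.get("revenue:total_cents"))  # -> "4498"
```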
Big Data
Big data is no longer a buzzword. It’s mainstream, says Gartner. The definition has changed: “big” is now driven more by data variety and velocity than by volume. When it comes to volume of data, big is relative. In 2016, companies will move away from irrelevant data noise, acknowledge that the variety and speed of data can be daunting, and take a more thoughtful approach to analyzing “useful” data to reach fast, meaningful, holistic insights. Rather than investing time and money in IT infrastructure to manage high volumes of data, the trick will be managing the diversity and speed of streaming data to glean valuable insights and to do something worthwhile with them. — Sharmila Mulligan, CEO/founder of ClearStory Data
The veracity and freshness of data will be a top concern in 2016 as self-service data analysis solutions proliferate. In an October 2015 Strata + Hadoop World poll, only 12 percent of respondents said they completely trust pre-wired data analysis dashboards to be accurate, timely, and up-to-date. More self-service data discovery will require more data governance, including the ability to audit who has access to what data and insights and what they did with them, the quality and timeliness of data blends, and whether dashboards are dynamic vs. static so data can be “verified as fresh.” Storytelling and data collaboration will be the new model for businesses to consume information fast and make decisions. Static dashboards will take a back seat in 2016. — Tim Howes, CTO at ClearStory Data
A collection of 2016 predictions from our friends at Blazent: (1) Chief Data Officers (CDOs) will become the new “It” girl of IT, creating tension in the C-suite; (2) Self-service big data (BDaaS) portals will bridge the gap between data scientists and business analysts, as deep learning insights become more readily available across all areas of the enterprise; (3) Increasing cloud adoption will push major cloud providers to consider regionalized service approaches due to network bandwidth concerns among enterprise customers; (4) Big Data and Machine Learning will leave the “trough of disillusionment” and move to the “slope of enlightenment” on the Gartner Hype Cycle. — Michael Ludwig, head of Blazent’s Office of the CTO
Think you’ve solved your Big Data Variety “data silo” problem by moving everything into a data lake? Think again. Data silos are a people/process/technology silo problem more than anything else. The solution? Smart companies get this, and prevailing enterprise computing infrastructure trends are in their favor. Incremental, “spray and pray” investments in data warehouses and other traditional data analytics infrastructure will start to give way to strategic investments in data systems that are broad in scope (embracing all enterprise silos), provide distributed data infrastructures, use open source software, and support both “native” and declarative languages/SQL access. — Tamr
The Pendulum Swings from Centralized to Distributed. Tech cycles have swung back and forth from centralized to distributed workloads. Big data solutions initially focused on centralized data lakes that reduced data duplication, simplified management and supported a variety of applications including customer 360 analysis. However, in 2016, large organizations will increasingly move to distributed processing for big data to address the challenges of managing multiple devices, multiple data centers, multiple global use cases and changing overseas data security rules (safe harbor). The continued growth of Internet of Things (IoT), cheap IoT sensors, fast networks, and edge processing will further dictate the deployment of distributed processing frameworks. — MapR CEO and Cofounder, John Schroeder
Automation will deliver on the promise of big data: Time and again, we’re seeing big data initiatives fail because of how companies are organizing their data. But in order to capitalize on big data investments, companies need to transform insights into actions. We’re already seeing big data automation being used to streamline and eliminate processes, but in 2016, it will be more widely used to accentuate the unique human ability to take complex problems and deliver creative solutions to them. Google open sourcing its AI engine TensorFlow is a big step in this direction, enabling more companies to apply automation to their big data. — Abdul Razack, SVP & head of platforms, big data and analytics at Infosys
We predict that in the coming year, big data and analytics projects will require unprecedented levels of two things: speed and specificity. Data volumes and sources will continue to explode in 2016, creating a need for “in-the-moment” intelligence that can be gleaned immediately, while still leveraging existing systems and disparate data sources. Up to 80% of companies will adopt new fluid big data infrastructure approaches in the coming year to get a handle on big data. And they will require specific solutions that are tailored to their industry and their particular business challenges to achieve successful business outcomes from big data projects. — Suresh Katta, Founder and CEO of Saama Technologies
Rather than having one large centralized IT team, companies will establish a smaller centralized IT team with individual technical teams to collaborate with each line-of-business group. This allows for a two-tier support structure with universal requests serviced centrally and specialized requests serviced in-situ for speed, goodness-of-fit, consistency and efficiency. Initially, this will complicate the go-to-market approach of big data vendors who need to develop marketing and sales mechanisms to target newly formed LOB-IT groups. — Sean Ma, Director of Product Marketing at Trifacta
Big Data means Big Challenges when it comes to building the right technology stack to leverage this data for real-time insights that deliver real customer value. In 2016, I predict that we’ll see big advances in delivering technology stacks that integrate key components of the distributed data tier. These stacks will take away much of the technology complexity, making big data more consumable in the enterprise and bringing operational simplicity to this challenge. — Adam Wray, CEO, Basho
The idea of storing all aspects of information and deriving business intelligence is revolutionizing the IT economy. However, with the Big Bang of information, data complexity and scale are stifling this revolution. Salvation can only be found in the simplification and unification of data, with one integrated solution for both structured and unstructured data. There have been some steps taken to simplify – like HTAP practices or what Spark is doing with Hadoop – but they’re just band-aids that don’t even hint at an attempt to tackle the true problem of centralized unification. Bringing these two silos of information together will be critical to tapping the insights we expect from Big Data. To achieve this, the way data is stored and presented needs to fundamentally change. Instead of trying to engineer technologies to layer on top of existing database solutions, a new science is needed to bring structured and unstructured data together from the bottom up. That means doing away with traditional data warehousing (which can’t keep up with the pace or variety of Big Data anyway) and storing all data in its native, raw state. Such a science would make information singularity (unification with simplification) a true possibility: collecting any kind of data, regardless of form or model, and getting exactly what you want out of it, or viewing it exactly how you want, all in real time. Only then will all the Big Data insights we keep anticipating be possible. — Thomas Hazel, Founder and CEO of Deep Information Sciences
Big Data has become so ubiquitous and integrated into everything we do — from reading a map to shopping online or in a store — that it is no longer a separate topic of discussion. In 2016, technologists will shift their attention from Big Data to machine learning and providing proactive insights. “Active intelligence” will become the new focus, whereby companies will leverage technology like predictive analytics and machine learning to provide solutions that are actively analyzing data 24/7 and alerting us when significant events happen. — Tim Barker, CEO, DataSift
Visualization will no longer be synonymous with self-service. Despite initial investments in Hadoop, self-service visualization products, and data warehouses and marts, enterprises will continue to struggle to satisfy the analytical requirements of everyday business users and analysts. Organizations will contend with problems ranging from data literacy (knowing how to use the data) and analytical productivity (time to discovering the insight) to data quality and data availability. Additionally, given the massive volume of new data, more than any organization could reasonably accommodate, enterprises will still struggle with the proverbial ‘needle in the haystack’ problem of stewarding such a large amount of data. A variety of tools will come into the self-service mainstream, including data catalogs to help people discover data, data preparation tools to help people manipulate data, and advanced analytical packages to help people leverage the power of data for discovery and prediction.
Let the consolidation acceleration begin (or rather, continue). Data infrastructure players will consolidate further in 2016. As technology valuations decline, cash-rich enterprise software companies like Oracle and Microsoft will start to buy up many of the cash-poor data companies, adding accretive growth streams to the mega vendors. Keep an eye on companies with price-to-sales ratios of less than five as potential targets for this year’s buys.
From data sprawl and CDOs’ move to the C-suite to an increase in market consolidation, 2016 looks to be a banner year in data once again. As we forge on in the ever-changing landscape, one thing rings true: data’s value in every business decision will finally take hold, and the ability to find, transact and interact with data will become as routine as writing an email. So, we cheer to 2015 and welcome the new year with open arms, ready to embrace the data landscape ahead in 2016. — Satyen Sangani, CEO & Co-Founder, Alation
Business Intelligence
In 2016, we will see business intelligence (BI) and analytics reach new heights. As advanced data technologies emerge, businesses will process and store more information than ever before. As a result, they will be looking for a next-generation BI and analytics platform that helps them tap into the power of their data, whether in the cloud or on-premises. This “Networked BI” capability creates an interwoven data fabric that delivers business-user self-service while eliminating analytical silos, resulting in faster and more trusted decision-making. — Brad Peters
BI Adoption WON’T climb above 25%. Currently only 22% of employees actually use BI tools, according to the annual BI Scorecard survey. That’s because despite all the hype around self-service, BI tools still need data experts to bring data to the average business user. The result is that products keep getting more technical and adoption rates don’t improve. — ThoughtSpot CEO Ajeet Singh
Cloud
We will see the convergence of data storage and analytics, resulting in new smarter storage systems appearing in the market. These systems will be optimized for storing, managing and sorting massive petabyte+ data sets. — Michael Tso, CEO and co-founder of Cloudian
In the year ahead, we can expect to see the cloud-based big data ecosystem continue its momentum in the overall market at more than just the “early adopter” margin. Some of the leading enterprises have already begun to split workloads in a bi-modal fashion and run some data workloads in the cloud in the second half of 2015. Many expect this to accelerate strongly through next year as these solutions move further along the adoption cycle. — Ashish Thusoo, CEO and co-founder of Qubole
In 2015, people began embracing the cloud. They realized putting data in the cloud is easy and highly scalable. They also saw that cloud analytics allows them to be agile. In 2016, more people will transition to the cloud thanks, in part, to tools that help them consume web data. Early adopters are already learning from this data, and others are realizing they should. And more companies will use cloud analytics to analyze more data faster. They’ll come to rely on it just like any other critical enterprise system. — Tableau Software
In 2016, the movement of storage from behind the firewall to the cloud will increasingly become a disruptive trend. I predict the big storage catchphrase of 2016 will be “hybrid cloud,” as a large number of companies now mix storage between their own datacenter and the cloud. We see evidence of this in our interactions with clients, and because our sales in this arena are going gangbusters. There is a ‘Battle Royale’ between cloud vendors right now as they compete heavily on price for cloud-based applications such as Spark running on Hadoop. — WANdisco’s CEO David Richards
We expect a new category of value-added data services on top of the public cloud, named “database platform-as-a-service” by Gartner and “big data processing as a service” by Forrester, to make major headway now that vendors address data movement, security and other early barriers through end-to-end automation and controls. These services will drive a shift in focus from individual database technologies, such as Hadoop or MPP SQL, to platforms that offer a number of these workload engines in order to provide the best cost and performance across disparate data types and use cases. The responsibility of choosing and applying the appropriate technology for specific types of analysis will no longer rest with the enterprise, but instead be automated by these services. As such, cloud-computing solutions such as Hadoop-as-a-Service, Spark-as-a-Service and Data Warehouse-as-a-Service will consolidate as Big Data-as-a-Service. — Cazena‘s CEO and founder Prat Moghe
Cloud/on-prem analytics distinction becomes a game changer. Right now, a few cloud-only Hadoop players exist, and other vendors offer rather distinct on-prem and cloud editions of their products. In 2016, as companies recognize the advantage of side-stepping Hadoop hardware requirements, which become outdated every 18 months, cloud adoption will surge. Vendors, particularly distributors, will pivot their offerings in order to keep up with demand. — Stefan Groschupf, CEO of Datameer
Salaries for both data scientists and Hadoop admins will skyrocket in 2016 as growth in Hadoop demand exceeds the growth of the talent pool. In order to bypass the need to hire more data scientists and Hadoop admins from a highly competitive field, organizations will choose fully managed cloud services with built-in operational support. This frees up existing data science teams to focus their talents on analysis instead of spending valuable time wrangling complex Hadoop clusters. — Mike Maciag, COO, Altiscale
When it comes to the cloud, enterprises are in an awkward tween stage — somewhere between the old world and new. As we enter 2016, CIOs will continue to adopt cloud applications and seek better ways to connect on-premises systems and the cloud. Hybrid IT is now the reality for many enterprises, and many are going through a refresh of their platforms, both business and technology. They are looking for scalable ways to connect and move data to the cloud, on-premises and back again as needed. There is a big emphasis on APIs to unlock data and capabilities in a reusable way, with many companies looking to run their APIs in the cloud and in the data center. On-premises APIs offer a seamless way to unlock legacy systems and connect them with cloud applications, which is crucial for businesses that want to make a cloud-first strategy a reality. More businesses will run their APIs in the cloud, providing elasticity to better cope with spikes in demand and make efficient connections, enabling them to adapt and innovate faster than the competition. This summer we surveyed 300 IT decision makers and found that their biggest integration priority for the next year was cloud software and applications. — Ross Mason, founder and VP of product strategy at MuleSoft
With the movement from on-premise email to the cloud already in motion, we’ll see a rise in business applications doing the same. Virtual desktop, collaboration, analytics and back-end office apps (accounting/HR systems, expense management systems, etc.) will mark 2016 as the year that the “cloud-first” mentality comes to business. — Intermedia CTO Jonathan Levine
As people use additional phones, tablets, PCs and other devices, they will spur the commoditization of enterprise storage, leading to price reductions. Basically, the cloud is consumerizing storage. Storage will be like an iPhone: a consumed application with SLAs. You will pay for a service to work as an application, just as you get camera functionality with the iPhone to use with photo- or video-related applications. The iPhone is the new cloud, the place where you merge various services and SLAs into one, single system. The consumerization of cloud storage will see businesses focusing less on managing the infrastructure to contain and hold data and more on building out services that bring value by leveraging the volumes of available stored data. — Druva‘s CEO and Founder, Jaspreet Singh
When I look at the horizon and the future of Cloud (including IaaS, PaaS, and SaaS) for 2016 and beyond, I see several emerging trends. First and foremost, I expect a degree of buyer’s remorse. Everyone is cloudizing their IT operations with such aggressiveness that there will be a moment of reality coming, with subsequent level-setting as companies clear out the clutter and focus on the products and solutions they truly need to be successful. I also see a natural and increasingly fast-paced shift to outside providers to deliver the Cloud services foundation – moving the infrastructure, technologies, and applications out of the enterprise facilities. I think new “hot spots” for IT offshoring and solution delivery will emerge in the world. In recent history it has been India, but going forward be sure to keep your eye on Vietnam. Finally, I ultimately see the greater emergence of a public Cloud experience in the enterprise sector around the globe, where corporate customers will seek to enjoy the buying experiences and operational agility that small to mid-sized organizations have been consuming at an unimaginable pace. The next few years promise to be very exciting for Cloud technology and innovation in all sectors, all industries. — Jim Cole, Senior Vice President, Cloud Solutions Practice, Hitachi Consulting Corporation
Databases in the cloud – no Swiss Army Knife. It’s no secret that newer forms of database technology – such as NoSQL and graph databases – are inviting interest from enterprises (and beginning to climb in popularity and adoption) thanks to their ability to handle social, mobile and next-generation applications. Enterprises are increasingly engaging in the new era of distributed, big data, cloud-native applications such as customer analytics, security analytics, IoT, and digital advertising. At the same time, and in sync with this trend, is the emergence of cloud computing as a disruptor to traditional IT – where database administrators see cloud computing as an inevitable and major shift in their organizations. Case in point: it is no wonder Amazon AWS, the largest cloud provider in the world, has the core of its infrastructure (EC2, EBS, etc.) running on next-generation distributed databases such as Amazon DynamoDB. So what does the intersection of new database technology and cloud mean? It means that the days of database vendors trying to fit multiple technologies into a single product, and hawking one-size-fits-all systems, are coming to an end. Not only are organizations grappling with new concepts like availability zones, but they are also quickly realizing they must have multiple ‘right tools in the right place’ to proactively monitor and manage increasingly complex, distributed data environments – environments that span on-premise and cloud ecosystems. In other words, 2016 will see an increasing need for data management products and solutions (versioning, etc.) that help enterprises manage complexity while simultaneously allowing them to confidently take the leap and fully embrace the next-generation applications needed to advance their businesses. — Tarun Thakur, CEO and co-founder, Datos IO
Data Governance
Many people have considered governance and self-service analytics to be natural enemies. Maybe that’s why those people were surprised to see governance and self-service having a drink together. The war is over, and the cultural gap between business and technology is closing. Organizations have learned that data governance, when done right, can help nurture a culture of analytics and meet the needs of the business. People are more likely to dig into their data when they have centralized, clean, and fast data sources, and when they know that someone (IT) is looking out for security and performance. — Tableau Software
Democratization of governance will drive collaboration – Data Governance is no longer just the domain of IT and compliance teams. Today self-service and collaborative data management dictates that everyone has a shared responsibility for ensuring the quality and security of information across the enterprise. Business users will get involved with the quality and governance of data, as partners with IT, by adding value through social collaboration on data sets through the course of their day-to-day activities. — Manish Sood, CEO and founder of Reltio
In 2016, organizations will work to incorporate a culture of compliance alongside traditional corporate values such as profitability, customer service and employee ethics. As they do so, they will face multiple challenges managing compliance with evolving business demands, multiplying communication channels and the unchecked growth of communication data. Demands from customers and employees will increase the adoption of new modes of communication such as social media and online chat. The resulting increase in communication data will enable organizations to drive innovation and make strategic business decisions. However, as businesses incorporate these new communication channels, senior management and industry regulators will make cybersecurity and data privacy a top priority. Increased scrutiny for supervisory procedures and evolving regulations, such as the new FRCP amendments, will require organizations to adopt new processes and procedures while also demonstrating ongoing vigilance over every communication channel. — Kailash Ambwani, President and CEO of Actiance
Data Science
Data civilians operate more and more like data scientists. While complex statistics may still be limited to data scientists, data-driven decision-making shouldn’t be. In the coming year, simpler big data discovery tools will let business analysts shop for datasets in enterprise Hadoop clusters, reshape them into new mashup combinations, and even analyze them with exploratory machine learning techniques. Extending this kind of exploration to a broader audience will both improve self-service access to big data and provide richer hypotheses and experiments that drive the next level of innovation. — Oracle
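As a small illustration of the kind of exploratory machine learning Oracle envisions business analysts running, here is a sketch using scikit-learn's KMeans on a hypothetical two-feature customer mashup; the data and feature choice are invented.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical mashup: [monthly_spend_dollars, support_tickets] per customer.
customers = np.array([
    [520, 1], [480, 0], [60, 7], [75, 9], [510, 2], [55, 8],
])

# Two segments fall out: high-spend/low-touch vs. low-spend/high-touch.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
for row, label in zip(customers, model.labels_):
    print(row, "-> segment", label)
```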
Shortage of data scientists will drive demand for automated statistical validation tools – Drawing conclusions from trends in graphs without statistically validating the results can lead to erroneous decisions. Distinguishing real insights from false patterns requires an advanced set of statistical skills, but the US alone faces a shortage of ~200,000 people with this analytical expertise (Source: McKinsey & Co.). Solutions that offer automated statistical validation will bridge this talent gap, and we will see increased market demand for these automated solutions. — BeyondCore CEO, Arijit Sengupta
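A toy version of such automated validation appears below: every candidate "pattern" is tested, and a Bonferroni correction guards against the false positives that multiple comparisons invite. The candidate data and the 0.05 base threshold are invented for illustration, not BeyondCore's method.

```python
from scipy import stats

def validated_patterns(candidates, alpha=0.05):
    """candidates: {name: (group_a, group_b)}; keep only significant ones."""
    corrected_alpha = alpha / len(candidates)   # Bonferroni correction
    survivors = []
    for name, (a, b) in candidates.items():
        t_stat, p_value = stats.ttest_ind(a, b)
        if p_value < corrected_alpha:
            survivors.append((name, p_value))
    return survivors

candidates = {
    "promo_lifts_sales": ([105, 110, 98, 112], [90, 88, 95, 91]),
    "weekday_effect":    ([100, 101, 99, 102], [98, 103, 100, 99]),
}
print(validated_patterns(candidates))   # only the real effect survives
```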
To leverage big data, companies will have to overcome an enormous skills gap in the talent market. While Glassdoor reported ‘Data Scientist’ to be one of the best jobs in 2015, we at Absolutdata feel ‘Data Scientist’ to be the job of the century. — CEO of Absolutdata, Anil Kaul
You WON’T hire enough Data Scientists or Data Analysts. The reporting backlog is only going to grow for companies that rely on elite teams of BI analysts. McKinsey predicts that by 2018 there will be a shortage of 1.5M data experts, and a recent TDWI survey found that the average business user is waiting 3-4 days just to get a report updated. In fact, Gartner says that the number of citizen data scientists, aka non-technical folks who want to experiment on their own with data, will grow five times faster than the number of expert data scientists. So expect that backlog to keep growing. — ThoughtSpot CEO Ajeet Singh
2016 will see the Chief Data Officer emerge as the catalyst for organizational effectiveness and competitive success. Driven by the increased need for faster and more accurate decision making, the Chief Data Officer will be responsible for harnessing the full value of all of an organization’s information, driving innovation and cost take-outs, improving competitive advantage, and developing a full plan on how to make this data more widely accessible. Businesses can no longer afford to wait for insights to bubble up to the top. By investing in the Chief Data Officer today, companies are prioritizing the overall plan for using data, launching analytics initiatives, and measuring their effectiveness. — Stephen Baker, CEO at Attivio
As the demand for data scientists increases, we’ll see organizations across industries create diverse big data teams to get the most out of data analytics. Instead of searching for the elusive data science unicorn that can do it all, organizations will bring people together from different backgrounds including programmers, those with technical skills in math and statistics and others with an understanding of business needs in order to achieve the greatest data insight. The ability to work collaboratively and creatively with others who reflect a variety of skillsets will be among the most important assets for data scientists’ success. — Bob Rogers, Chief Data Scientist for Big Data Solutions, Intel
Data science (and its underlying technology, complex analytics) will break out in 2016. How to integrate this technology into DBMSs will emerge as a major issue in this space. — Mike Stonebraker, Tamr Co-Founder
The CDO will move closer to the head of the table. In 2016, Chief Data Officers (CDOs) will shift their attention from risk mitigation and traditional governance to monetizing data through faster, more reliable insight generation and the development of data science products. A key data-based decision or a new recommendation engine that generates revenue could catapult a CDO into a strategic role, beyond the regulatory box-checking that might characterize their initial stint on the job. Reports will be just the beginning. In years past, the conclusion of an analytics project was often marked by the creation of a report, which decision makers could view and consult periodically. But such reports will be insufficient to drive results and beat the competition in 2016 and beyond. As business leaders yearn to know not just what’s happening, but why, ad-hoc analysis techniques will need to be employed faster and by a wider group than just the technical elite. Organizations with software enabling such data democratization and drill-down will excel. — Aaron Kalb, Head of Product & Co-Founder, Alation
Hadoop
Ingestion into Hadoop. Moving data into Hadoop is a challenge that has been mostly ignored to date. In 2016, we’ll see a greater focus on successfully ingesting data from various sources, leading to faster adoption of Hadoop. As interest shifts from early experimentation toward critical use cases, stream processing technologies designed for enterprise deployment (high performance, low latency, scalability and fault tolerance) will gain adoption. — Phu Hoang, DataTorrent CEO and co-founder
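One common shape for such an ingestion pipeline is to publish events to a durable stream from which a Hadoop-bound job consumes them. The sketch below uses the kafka-python client; the broker address, topic name and event schema are assumptions for illustration.

```python
import json
from kafka import KafkaProducer

# Assumes a Kafka broker at localhost:9092 fronting the ingestion pipeline.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {"sensor_id": "s-17", "reading": 0.93, "ts": "2016-01-05T10:22:00Z"}
producer.send("ingest.raw-events", value=event)  # a downstream job lands these in HDFS
producer.flush()
```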
Hadoop for Mission Critical workloads. In 2016, Hadoop will be used to deliver more mission critical workloads — beyond the “web scale” companies. While companies like Yahoo!, Spotify and TrueCar all have built businesses which significantly leverage Hadoop, we will see Hadoop used by more traditional enterprises to extract valuable insights from the vast quantity of data under management and deliver net new mission critical analytic applications which simply weren’t possible without Hadoop. — Scott Gnau, Chief Technology Officer, Hortonworks
In 2016, I predict many companies will use Hadoop for wide-scale, large deployments as they move out of a ‘lab’. Gone are the days when Hadoop deployment was limited to small-scale trial environments. — WANdisco’s CEO David Richards
Hadoop projects mature! Enterprises continue their move from Hadoop Proof of Concepts to Production. In a recent survey of 2,200 Hadoop customers, only 3 percent of respondents anticipate they will be doing less with Hadoop in the next 12 months. 76 percent of those who already use Hadoop plan on doing more within the next 3 months and finally, almost half of the companies that haven’t deployed Hadoop say they will within the next 12 months. The same survey also found Tableau to be the leading BI tool for companies using or planning to use Hadoop, as well as those furthest along in Hadoop maturity. — Dan Kogan, Director of Product Marketing at Tableau
Hadoop will begin to impact lean marketing. As Hadoop becomes more accessible to non-data geeks, marketers will begin to access more data for better decision making. Hadoop’s deeper and wider view of data will enable marketers to capture behaviors leading to decisions and understand the processes underlying customer journeys. — Bruno Aziza, Chief Marketing Officer of AtScale
Organizations hit reset on Hadoop. As Hadoop and related open source technologies move beyond knowledge gathering and the hype abates, enterprises will hit the reset button on (not abandon) their Hadoop deployments to address lessons learned – particularly around governance, data integration, security, and reliability. — Dan Graham, General Manager of Enterprise Systems at Teradata
Hadoop will get thrown for a loop – It’s hard to believe that Hadoop is over 10 years old. While interest remains strong and usage is maturing, there continue to be new options that either complement or provide an alternative to Hadoop for handling Big Data. The rapid ascents of Apache Spark and Apache Drill are examples. We’ll continue to see more options in the New Year. — Manish Sood, CEO and founder of Reltio
Machine Learning
Machine learning will drastically reduce the time organizations spend analyzing and escalating events. Today’s operations centers struggle with an extremely high volume of incoming events that require human analysis, which is unsustainable. While machine learning, data lakes and big data were previously seen by many as exploratory projects without clear purpose, in 2016 we will see organizations focus on using machine learning to significantly reduce the number of events requiring analysis down to the most critical. Enterprises will transition from investigating and exploring big data possibilities to becoming laser-focused on business outcomes. — Snehal Antani, Splunk’s CTO
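A minimal sketch of that triage idea: score a flood of events for anomaly and surface only the most suspicious few to analysts. It uses scikit-learn's IsolationForest; the feature encoding and 1% contamination rate are illustrative assumptions, not Splunk's method.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
# Hypothetical events encoded as [bytes_out, login_failures]:
events = rng.normal(loc=[500, 1], scale=[50, 1], size=(1000, 2))  # routine
events = np.vstack([events, [[5000, 30], [4200, 25]]])            # two outliers

model = IsolationForest(contamination=0.01, random_state=0).fit(events)
scores = model.decision_function(events)   # lower = more anomalous

escalate = np.argsort(scores)[:5]          # hand analysts only the worst few
print("escalate event indices:", escalate) # should include 1000 and 1001
```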
Big data gives AI something to think about. 2016 will be the year where Artificial Intelligence (AI) technologies, such as Machine Learning (ML), Natural Language Processing (NLP) and Property Graphs (PG) are applied to ordinary data processing challenges. While ML, NLP and PG have already been accessible as API libraries in big data, the new shift will include widespread applications of these technologies in IT tools that support applications, real-time analytics and data science. — Oracle
More jobs will be changed by AI than ever before, and the “Data Jedis” will become the most sought-after employees: Machine learning plus human insights will infiltrate new industries, including healthcare and security, and employees will need to adapt to providing a different service or get left behind in 2016. While web developers were the previous heavyweights in the tech universe, “Data Jedis,” aka those who can translate and manage Big Data, will be the ones wielding light sabers in 2016. — Matt Bencke, CEO of Spare5
Artificial intelligence & Machine Learning will come to the fore. Artificial intelligence is a vital part of the fight against fraud, helping to handle the vast quantity of transactions going through online systems. Machine learning technology, supported by data science, helps to intelligently link transactions, giving the clearest possible picture of an organization’s online fraud problem, which can then be interpreted and prevented by empowered fraud managers. — Roberto Valerio, CEO of Risk Ident
Machine Learning will be the new buzzword. Much as Big Data was in the ’90s, in 2016 every company will want to get on the machine learning bandwagon, but without the right people, many won’t have the expertise to do it. Expect to see the development of turnkey databases that allow developers to build predictive models without having a Ph.D. — Monte Zweben, co-founder and CEO of Splice Machine
Machine learning will invisibly transform our lives: 2016 is the year machine learning will make the leap from the workplace to the consumer. We’re already seeing it happen with self-driving cars from Tesla and Amazon Echo’s voice commands. Next year, machine learning will quietly find its way into the household, making the objects around us not just connected, but smarter every day. — Abdul Razack, SVP & head of platforms, big data and analytics at Infosys
Companies around the globe are starting to leverage machine learning techniques and cognitive computing to experiment with their data and experience how their business can benefit from these new approaches – we see this trend continuing in 2016. These approaches are enabling companies to pursue scalable innovation, incorporate inputs from millions of customers or transactions to create personalized offerings, and develop engagement strategies based on individual or household behavior. In the commercial space, these techniques in combination with externally available public data are creating insights for lead management, risk monitoring and offering design. Additional examples include building fault prediction capabilities to reduce asset downtime, intelligent remote monitoring of large-scale industrial and agricultural assets, and leveraging cognitive engines for investment decision making. As we are observing, the machines are not only learning patterns, but increasingly sophisticated models can allow the machines to carry out more advanced and customized activities. These machines are helping to fill the gap of large-scale human-like analytics, and combined with human judgment of the existing workforce, companies are enabled to improve their decision making to more effectively run their business as well as innovate on new business models. — Sharad Sachdev, managing director, Accenture Analytics – Innovation Lead
Machine learning will be crucial to cybersecurity when combatting sophisticated attacks. In 2016, we’ll see a greater push toward using machine intelligence for security monitoring and response to help security analysts not only find anomalies, but focus on the threats that matter. Machine learning techniques serve as the foundation for user and entity behavior analytics (UEBA), a crucial capability for organizations facing the increasingly sophisticated attack landscape. With these tools in place, security teams will become more productive and monitoring and response will be elevated as key elements alongside real-time attack and detection for a meaningful security posture. — Karthik Krishnan, VP of Product Management, Niara
NoSQL
In today’s digital economy, NoSQL databases are a far better fit than traditional relational databases for supporting Web, mobile and IoT applications. However, it’s no secret that there is a skills gap in enterprise IT for building new data management platforms. It is incumbent on database developers to evolve their skills to meet these new platforms, but technology innovators must also remove much of the friction, making the transition from relational databases to NoSQL as easy as possible by extending traditional tools and languages. This will happen in 2016 in both the private sector and in academia, further fueling the growth of enterprise NoSQL deployments. — Ravi Mayuram, SVP of Products and Engineering, Couchbase
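To make the relational-to-NoSQL contrast concrete, here is a brief sketch using the pymongo client: the nested data a web or mobile app works with maps onto a single document instead of several joined rows. MongoDB is only one example of a document store, and the server address and schema are hypothetical.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
profiles = client["appdb"]["profiles"]

# One document holds what several joined relational rows would.
profiles.insert_one({
    "user": "ada",
    "devices": [{"type": "phone", "os": "iOS"}, {"type": "tablet", "os": "Android"}],
    "preferences": {"newsletter": True, "theme": "dark"},
})

print(profiles.find_one({"user": "ada"}, {"_id": 0, "devices.os": 1}))
```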
The database market has suffered from insulation for some time, with long-reigning database leaders such as Oracle often seen as the best bet. However, companies that move slowly to address technology transitions are starting to show cracks and relational databases are ceding ground to distributed databases, specifically the NoSQL players. The ability of NoSQL database solutions to address the growing big data needs of businesses has resulted in massive market conflation – NoSQL brings speed, agility and overall faster time to market for critical applications. With more than 30 companies listed in the 2015 Gartner Magic Quadrant for Operational Databases, the number of projects and vendors vying for top seed has increased at a rapid pace. In coming years, we will see players drop out and the term NoSQL will no longer be specific enough to talk about the types of database technologies serving today’s businesses. The install base loyal to their application vendor will look at the specifics of what they need to move faster and consider the variety of NoSQL technologies, ultimately seeking guidance on where the differences are in these solutions. As leading database vendors (likely those with a strong open source play) survive the market cleanup, these vendors will need to look at how they address the variety of NoSQL use cases and get far more specific than NoSQL vs. relational, a battle that has already been decided. — Patrick McFadin, Chief Evangelist for Apache Cassandra at DataStax
Security
Data and information will continue to be weaponized: Use of data as a weapon will be a major problem in 2016. In the past, data has been taken, destroyed or encrypted, but increasingly we’re seeing breaches during which data is leaked publicly in order to cause significant damage to a business, reputations, or even the government (e.g., Sony, Ashley Madison, etc.). Criminals and hacktivists are now stealing data and threatening to place it on public websites for others to see. In conjunction with this, hackers are building massive databases that include multiple types of data (insurance, health, credit card) to present a “full picture” of an individual. It’s one thing to have your data stolen and another to have it used against you. We’ll continue to see individuals’, corporations’ and public entities’ info used against them as a weapon in 2016. — Dmitri Alperovitch, CTO and Co-founder, Crowdstrike
Prediction will emerge as the new holy grail of security. Up until 2014, the cybersecurity industry considered prevention to be its sole objective. Sophisticated enterprises then began to complement their prevention strategies with detection technologies to get the visibility on their infrastructure they lacked. In 2016, prediction will emerge as the new priority, with machine learning becoming a key tool for organizations that want to anticipate where hackers will strike. — Richard Greene, CEO of Seculert
Spark
Apache Spark lights up Big Data. Apache Spark has moved from being a component of the Hadoop ecosystem to the Big Data platform of choice for a number of enterprises. Spark provides dramatically increased data processing speed compared to Hadoop and is now the largest big data open source project, according to Spark originator and Databricks co-founder Matei Zaharia. We see more and more compelling enterprise use cases around Spark, such as at Goldman Sachs, where Spark has become the “lingua franca” of big data analytics. — Dan Kogan, Director of Product Marketing at Tableau
Spark will kill MapReduce, but save Hadoop. MapReduce is quite esoteric. Its slow, batch nature and high level of complexity can make it unattractive for many enterprises. Spark, because of its speed, is much more natural, mathematical, and convenient for programmers. Spark will reinvigorate Hadoop, and in 2016, nine out of every 10 projects on Hadoop will be Spark-related projects. — Monte Zweben, co-founder and CEO of Splice Machine
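The contrast Zweben draws is easiest to see in code: the canonical word count, which takes a page of boilerplate in classic MapReduce, is a few chained transformations in Spark. Below is a minimal local PySpark sketch; the input lines are invented.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "wordcount-sketch")

lines = sc.parallelize([
    "spark will reinvigorate hadoop",
    "spark is fast and convenient",
])

counts = (lines.flatMap(lambda line: line.split())   # "map" phase
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))     # shuffle + "reduce" phase

print(counts.collect())   # e.g. [('spark', 2), ('hadoop', 1), ...]
sc.stop()
```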
Spark divorces Hadoop. Hadoop is a complex beast. The software equivalent of LEGOs, Hadoop is a collection of open source projects. Considerable assembly is required and multiple outcomes are possible. Like LEGOs, the end result might look cool but it is probably less functional than a purpose-built solution. Spark is different. It provides an efficient, general-purpose framework for parallel execution. This is very useful in today’s world where data analysis often requires the resources of a fleet of machines working together. While Spark is still relatively immature, it has the potential to evolve into the standard framework and API for parallel algorithmic analytics and machine learning. Today, Spark is part of Hadoop distributions and is widely associated with Hadoop. Expect to see that change in 2016 as Spark goes its own way, establishing a separate, vibrant ecosystem. In fact, you can expect to see the major cloud vendors release their own Spark PaaS offerings. Will we see an Elastic Spark? Good chance. — Bob Muglia, CEO Snowflake Computing
In 2016, technologies like Apache Spark, Kafka and SystemML will make a real impact in the enterprise. Open source technologies allow enterprises to innovate faster and move quickly to stay relevant, without needing to build key infrastructure from scratch. Notably, with Apache Spark and Kafka, businesses will be able to tap the 80 percent of unstructured information stored across their databases that remains untouched – unlocking untold potential. — Derek Schoettle, GM, IBM Cloud Data Services
Storage
In 2016 cloud storage will go hybrid – Fact: public cloud storage can be as little as a few pennies per gig per month. Also a fact: network transit, retrieval, and various security and performance add-ons balloon that cost. The net result is that public cloud storage gets expensive, quickly. In fact, companies with more than a petabyte of cloud storage are finding it cheaper to deploy on-premises software-defined storage clusters; the cost of the data center, infrastructure, power, and cooling can be lower than public clouds. Expect to see companies use intelligent cloud gateways and software-defined storage that support hybrid cloud. Hot and warm data will remain in private clouds, while older, colder data gets migrated to public clouds. The catch? These solutions need to be smart enough to do this dynamic migration automatically. — Avinash Lakshman, Hedvig CEO
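The automatic migration policy Lakshman calls for can be as simple as an age-based rule evaluated continuously by the gateway. A minimal sketch, with invented tier names and thresholds:

```python
from datetime import datetime, timedelta

def choose_tier(last_accessed, now=None,
                warm_after=timedelta(days=30), cold_after=timedelta(days=180)):
    """Map an object's last-access time to a storage tier."""
    now = now or datetime.utcnow()
    age = now - last_accessed
    if age < warm_after:
        return "private-hot"     # stays on on-premises flash/disk
    if age < cold_after:
        return "private-warm"
    return "public-cold"         # gateway migrates it to public cloud object storage

print(choose_tier(datetime.utcnow() - timedelta(days=400)))  # -> public-cold
```

Real gateways would weigh access frequency, object size and retrieval cost as well, but the tiering decision has this basic shape.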
The coolest new technology of 2016 will be invisibility. The past five years have seen a massive number of new technologies — both hardware and software — become available to enterprise environments: cloud offerings, hyperconverged appliances, all-flash silos. I expect 2016 will start to see a shift away from technology and form factor (e.g. hyperconverged or AFAs) and towards solutions: technologies that actually make enterprise storage and data protection more valuable, integrated, and easy to use. The winning technologies in this category will be the ones that work so well that you don’t see them. — Andrew Warfield, CTO and co-founder of Coho Data
Prescriptive, expert analysis will enhance data science and operational intelligence for enterprises. Data science has always been about turning data into actions, and operational intelligence will become one of the most important criteria for CIOs when making strategic decisions. But with the complexity of today’s data center — where multiple teams support applications, compute platforms, networking, storage and often converged infrastructure — enterprises can no longer rely solely on individual teams to gather all of the complex operational data and turn it into useful actions. In 2016, I believe vendors who have intimate knowledge of interactions across the entire data storage stack will need to be responsible for providing operational intelligence back to data center teams. There will be no room for guesswork – instead, prescriptive actions based on sound scientific data analysis will become the order of the day. — Rod Bagg, VP of Customer Support at Nimble Storage