The huge volume of Big Data produced by sensors, genomic sequencers, electronic exchanges, and connected devices continues to generate headlines, but it is the variety of data types, not the volume, that poses the bigger challenge to data scientists and is causing them to “leave data on the table.”
According to a new survey by computational database company Paradigm4, nearly three-quarters of data scientists (71 percent) said Big Data had made their analytics more difficult and that data variety, not volume, was to blame. The survey also showed that 36 percent of data scientists say it takes too long to get insights because the data is too big to move to their analytics software. These issues cause data scientists to omit data from analyses and prevent them from maximizing the value of their work.
“The increasing variety of data sources is forcing data scientists into shortcuts that leave data and money on the table,” said Marilyn Matz, CEO of Paradigm4. “The focus on the volume of data hides the real challenge of data analytics today. Only by addressing the challenge of utilizing diverse types of data will we be able to unlock the enormous potential of analytics.”
The Paradigm4 survey, which included responses from 111 data scientists, also found:
- Despite the hype around the Hadoop software platform, fewer than half (48 percent) have used Hadoop or Spark, and of those, 76 percent said it was too slow, took too much effort to program, or had other limitations
- 91 percent said they’re using complex analytics on their Big Data now or plan to within the next two years
- Nearly half of data scientists (49 percent) said they’re finding it more difficult to fit their data into relational database tables
- 39 percent said their job had become more stressful with the growth of Big Data
For more detail and analysis of the survey findings, see the Paradigm4 Data Scientist Survey report.
Methodology
The Paradigm4 Data Scientist Survey was fielded by the independent research firm Innovation Enterprise from March 27 to April 22, 2014. The responses were generated from a survey of 111 people who self-identified as data scientists based in the United States.