Survey Indicates Apache Spark Gaining Developer Adoption

Typesafe, provider of a leading Reactive platform and the company behind Play Framework, Akka, and Scala, released the findings of a survey of more than 2,100 enterprise developers, data scientists, executives and system architects, analyzing adoption patterns around the Big Data processing engine, Apache Spark.

  • Spark awareness and adoption are seeing hockey-stick-like growth. Google Trends confirms this finding and the survey shows that 71 percent of respondents have at least evaluation or research experience with Spark—35 percent are using it or plan to adopt soon. Of the survey respondents running Big Data applications in production, 82 percent indicated that they are eager to replace MapReduce with Spark as the core processing engine.
  • Faster data processing and event streaming are the focus for enterprises. By far the most desirable features are Spark’s vastly improved processing power over MapReduce (over 78 percent mention this) and the ability to process event streams (over 66 percent mention this), which MapReduce cannot do.
  • Perceived barriers to adoption are not major blockers. When asked, respondents mentioned lack of in-house experience and perceived immaturity of some Spark components and integrations with other middleware and management tools. Also cited are needs for better commercial support options and for more comprehensive documentation and advanced examples. Some respondents mentioned that their organizations aren’t currently in need of “big” data solutions.

“The need to process Big Data faster has largely fueled the intense developer interest in Spark,” according to Dr. Dean Wampler, Big Data Architect at Typesafe. “Hadoop’s historic focus on batch processing of data was well supported by MapReduce, but there is an appetite for more flexible developer tools to support the larger market of ‘mid-size’ datasets and use cases that call for real-time processing.”

Developers across all industries have been turning to Typesafe to build Reactive applications, of which Big Data is a core component. Because Spark is built with Scala, it was a logical choice for Typesafe to add full lifecycle support for Apache Spark to the Typesafe Together Project Success Subscription program to accelerate developer adoption and success in building Reactive Big Data applications.

“Coming directly from developers, this survey reiterated the rapid adoption of Spark for large-scale data processing. I’m especially excited by the breadth of use cases seen, which range from batch jobs to streaming and machine learning,” said Matei Zaharia, CTO at Databricks and Vice President of Apache Spark. “It’s this type of direct feedback and dialogue with our community that enables us to continue to improve the usability, performance and built-in libraries of Spark.”

 
