Hadoop Summit 2016 News
H2O.ai, the company bringing AI to business, today announced the availability of Sparkling Water 2.0. Sparkling Water 2.0 builds on the popularity of Sparkling Water, H2O.ai’s API for Apache Spark, with additional features and functionality. New features include the ability to interface with Apache Spark, Scala and MLlib via H2O.ai’s Flow UI, the ability to build ensembles using algorithms from both H2O and MLlib, and access for Spark users to H2O’s visual intelligence capabilities.
Sparkling Water was designed to allow users to get the best of Apache Spark – its elegant APIs, RDDs and multi-tenant Context – along with H2O’s speed, columnar-compression and fully-featured machine learning algorithms. Sparkling Water also allows for greater flexibility when it comes to finding the best algorithm for a given use case. Apache Spark’s MLlib offers a library of efficient implementations of popular algorithms directly built using Spark. Sparkling Water empowers enterprise customers to use H2O algorithms in conjunction with, or instead of, MLlib algorithms on Apache Spark.
“Enterprises are looking to take advantage of a variety of machine learning algorithms to address an increasingly complex set of use cases when determining how to best serve their customers,” said Matt Aslett, Research Director, Data Platforms and Analytics at 451 Research. “Sparkling Water is likely to be attractive to H2O and Spark users alike, enabling them to mix and match algorithms as required.”
Sparkling Water 2.0 includes the following improvements and functionality:
- Support for Apache Spark 2.0 and backwards compatibility with all previous versions.
- The ability to run Apache Spark and Scala through H2O’s Flow UI.
- Support for the Apache Zeppelin notebook.
- H2O feature improvements and visualizations for MLlib algorithms, including the ability to score feature importance.
- Visual intelligence for Apache Spark.
- The ability to build Ensembles using H2O plus MLlib algorithms.
- The power to export MLlib models as POJOs (Plain Old Java Objects), which can be easily run on commodity hardware.
- A toolchain for building machine learning pipelines on Apache Spark.
- Production support for machine learning pipelines and the operationalization of MLlib through H2O scoring engines.
- Realtime machine learning for data products using Spark Streaming and H2O.
- Model and data governance through Steam.
- Bringing H2O’s powerful data munging capabilities to Apache Spark.
In addition to offering a greater degree of functionality and choice to Apache Spark, Sparkling Water 2.0 also delivers a new visualization component to MLlib. H2O.ai has recently built out a team of data visualization experts whose sole focus is to make AI and machine learning algorithms easily consumable. The progress made by H2O.ai’s visualization team will come to Apache Spark via Sparkling Water, allowing users to enjoy beautiful and easy-to-understand visualizations of algorithmic results.
“Beauty and functionality are essential to everything we do at H2O.ai,” said H2O.ai CEO Sri Ambati. “We’re totally committed to the open source movement and doing everything we can to bring visually appealing, and easy to comprehend, AI-driven insights to enterprise users. That’s true regardless of whether a piece of technology was built by us or not. Everything we do will be contributed back upstream to the community codebase.”