ODPi, a nonprofit organization accelerating the open ecosystem of big data solutions, announced the availability of ODPi 2.0, which includes the first release of the ODPi Operations Specification and the Runtime Specification 2.0, to standardize the development model for big data solution and application providers and help enterprises improve installation and management of Hadoop-based applications.
With more than 30 members, including recently announced members DriveScale, Redoop, and Xavient Information Systems, ODPi is focused on simplification and standardization within the big data ecosystem and further advancing the work of the Apache Software Foundation. Designed to make it easier to create big data solutions and data-driven applications, ODPi 2.0 adds Apache Hive and Hadoop Compatible File System (HCFS) support as part of the ODPi Runtime Specification 2.0. Additionally, the ODPi 2.0 release includes Operations Specification 1.0, which provides standard guidelines for application management tools, with Apache Ambari serving as a reference platform.
“With the release of the first Operations Specification, ODPi is moving standardization forward for Apache projects in a pragmatic, fluid way that embraces developer input,” said John Mertic, Director of ODPi. “ODPi specifications are based on input from developers and enterprises about how they are actually using big data technologies in production environments, and address real issues they’ve encountered. Our technical team developed this latest release knowing that the SQL layer, backend storage, and how applications should be installed, managed and configured in an Apache Hadoop cluster are important to them. We’ll continue to iterate on previous releases and seek industry input to ensure that we are tackling the critical issues that benefit the wider big data ecosystem.”
Key ODPi Operations Specification 1.0 Technical Features
The ODPi Operations Specification 1.0 provides standard guidelines for application management tools, using Apache Ambari, the Apache Software Foundation project for provisioning, managing, and monitoring Apache Hadoop clusters, as a reference platform. By establishing common expectations, the guidelines enable developers to create data-driven applications for all management tools used by platform providers. For big data solution and application providers, this minimizes the complexity, cost and training needed to build big data applications.
The ODPi community worked closely with the Apache Ambari community to develop the Operations Specification, ensuring backward compatibility with the standardization and alignment with the community’s needs. The ODPi community further designed this spec so that other management tools could attain compliance.
Similar to Spark, Ambari is a rapidly changing project. In working on the latest release, ODPi’s technical team collected substantial Ambari institutional knowledge, which they’ve contributed to Ambari. The reference manual will help developers more easily write an application for Ambari to manage their applications.
“There is a major shift occurring on how data is treated within their organization,” said Ritika Gunnar, Vice President of Offerings, IBM Analytics. “Fundamentally, it is no longer about the persistent stores, data in Hadoop, data in operational database and real-time streaming. It is about how that data is accessed in trust and used within an organization. By working with ODPi and committing to provide these organizations with a compliant platform they can count on and interoperable software that sits on top of Hadoop, including IBM Big SQL, IBM SPSS Analytic Server, IBM Big Replicate, and others, we are helping our customers build their businesses.”
Key ODPi Runtime Specification 2.0 Technical Features
ODPi Runtime Specification 2.0 adds Apache Hive and Hadoop Compatible File System (HCFS) support to the YARN, MapReduce and HDFS components from ODPi Runtime Specification 1.0. HCFS support will enable storage and cloud vendors to leverage ODPi standards, empowering them to use their native storage solutions as part of an ODPi Runtime Compliant Hadoop Platform and reduce the incompatibilities that end users face. By including Apache Hive, ODPi will reduce SQL query inconsistencies across Hadoop Platforms. ODPi based its work on Hive version 1.2 and has included core functionality that will continue to behave in a standard way for future versions of Apache Hive. For more on this addition, read ODPi technical steering committee chair Alan Gates’ blog.
ODPi Compatibility and Interoperability
Several Apache Hadoop platform and big data solution and application providers, including Ampool, Hortonworks, IBM, Pivotal, and SAS, have committed to testing against ODPi 2.0 to become ODPi Compliant and ODPi Interoperable. They can test against the Operations Specification 1.0 and the Runtime Specification 2.0 separately, offering greater simplicity for big data solution and application providers. This option provides end users greater choice and flexibility by fostering an open big data ecosystem that transcends traditional vendor alliances.
“Complying with the latest version of the ODPi specification simplifies how Apache HAWQ can query the vast quantities of data in the popular Apache Hive format, and allows us to seamlessly integrate configuration and administration through Apache Ambari,” said Jacque Istok, Head of Data Engineering, Pivotal Software. “ODPi is allowing us to roll out compatibility features with the Apache Hadoop ecosystem at a much faster pace.”