Pentaho 6.0 Automates the Enterprise Analytic Data Pipeline

Pentaho, a Hitachi Data Systems company, recently previewed version 6.0 of its big data integration and analytics platform. The 6.0 release includes the first version of the Pentaho enterprise-class server, with features designed to automate, manage, and enhance every stage of the analytic data pipeline. This version takes customers closer to Pentaho’s bold vision of governed data delivery – the ability to blend trusted and timely data of any type to power analytics at scale.

Flexible options for putting data to work

As big data matures in the workplace, a spectrum of data architectures has emerged to serve different use cases. At one end, traditional data warehouses host prepared, structured data; at the other, data lakes provide a repository for raw, native data. Between these sit data refineries, which transform raw data and can incorporate sources too varied or fast-moving to stage in the data lake. As use cases evolve, enterprises can rest assured that with Pentaho 6.0, data will be appropriately governed and efficiently delivered at any stage of the pipeline to any user or application.

“Our benchmark research on Big Data Analytics reveals that 95% of organizations are either using or intend to use big data analytics. Companies, however, need software that can efficiently manage the process flow of multiple diverse data sources in a scalable manner to create the unified analytic data sets that lead to insight. This is one of the most effective ways to provide the value needed and expected by management,” said Tony Cosentino, VP and research director at Ventana Research. “In this latest release, Pentaho 6.0 addresses not only the need to manage the process flow, but also to help automate the entire analytic data pipeline.”

Making the “big blend” a reality

Pentaho 6.0 offers new data services and delivery options to:

  • Blend and virtualize datasets on the fly for faster access and flexibility when mashing up data
  • Support data blending at scale with enhanced ‘push-down optimization,’ where data transformations can be ‘pushed’ to the most efficient processing resource
  • Easily shape even the most complex data
  • Track and store data lineage details each time a process executes
  • Improve collaboration by letting users share data discovery findings through enhanced inline modeling (round-trip model editing)
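To make the push-down idea concrete, here is a minimal sketch in Python, not Pentaho’s actual API: a filter step can either run locally after fetching every row, or be pushed down into the SQL sent to the source so that the database does the work and only matching rows cross the wire. The table name, column names, and `fetch_filtered` function are all hypothetical, chosen only for illustration.

```python
import sqlite3

def fetch_filtered(conn, threshold, push_down=True):
    """Return sales rows above a threshold, pushing the filter down if asked."""
    if push_down:
        # Pushed down: the predicate travels with the query, so the source
        # database evaluates it and returns only the qualifying rows.
        cur = conn.execute(
            "SELECT name, amount FROM sales WHERE amount > ?", (threshold,))
        return cur.fetchall()
    # Local fallback: pull everything, then filter in the application layer.
    cur = conn.execute("SELECT name, amount FROM sales")
    return [row for row in cur.fetchall() if row[1] > threshold]

# A throwaway in-memory database standing in for a remote data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("a", 50), ("b", 150), ("c", 300)])

# Both strategies yield the same rows; only where the work happens differs.
assert (fetch_filtered(conn, 100, push_down=True)
        == fetch_filtered(conn, 100, push_down=False))
print(fetch_filtered(conn, 100))
```

With a real remote source, the pushed-down variant also avoids moving the non-matching rows across the network, which is where the efficiency gain comes from.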

Taming the analytic data pipeline

Pentaho 6.0 strengthens the Pentaho platform to manage expanding data pipelines with:

  • New data lineage capabilities to help users understand data origin
  • Deeper integration with SNMP through major enhancements to systems monitoring for better enterprise visibility
  • Upgraded and enhanced Spring Security, the Java/Java EE framework that provides authentication, authorization and other security features for enterprise applications
  • Improved Data Services caching for optimal performance of virtualized datasets
  • Support for the OSGi specification, a module system and service platform for Java
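The lineage idea above can be sketched in a few lines of Python. This is a hypothetical illustration, not Pentaho’s implementation: each transformation step records its inputs and output as it executes, so the raw sources behind any result can be traced afterwards. The `LineageLog` class and the step and dataset names are invented for the example.

```python
from datetime import datetime, timezone

class LineageLog:
    """Records, per execution, which inputs each step consumed and what it produced."""

    def __init__(self):
        self.records = []

    def record(self, step, inputs, output):
        self.records.append({
            "step": step,
            "inputs": list(inputs),
            "output": output,
            "ran_at": datetime.now(timezone.utc).isoformat(),
        })

    def origin_of(self, output):
        """Walk the log upstream to find every raw source behind an output."""
        sources = set()
        frontier = [output]
        while frontier:
            target = frontier.pop()
            for rec in self.records:
                if rec["output"] == target:
                    frontier.extend(rec["inputs"])
                    break
            else:
                sources.add(target)  # nothing produced it: a raw source
        return sources

log = LineageLog()
log.record("load", ["crm.csv"], "customers_raw")
log.record("clean", ["customers_raw"], "customers_clean")
log.record("join", ["customers_clean", "orders.db"], "customer_orders")

print(log.origin_of("customer_orders"))  # the two raw sources behind the join
```

Storing a timestamp with each record is what makes this per-execution lineage rather than a static dependency graph: the same trace can show which run of a step produced a given dataset.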

“Pentaho 6.0 is an important milestone toward our vision of helping organizations get value out of any data in any environment, no matter how complex the architecture or how wild the data flows,” said Christopher Dziekan, Chief Product Officer, Pentaho. “Data-driven organizations can be sure that their data is appropriately governed and delivered at the point of impact, whether consumed by an internal user, external user or third party application.”

