Hazelcast, a leading open source in-memory data grid (IMDG) with tens of thousands of installed clusters and over 39 million server starts per month, announced the 0.5 release of Hazelcast Jet, an application-embeddable, distributed computing platform for fast processing of big data sets. New functionality in Hazelcast Jet 0.5 includes the Pipeline API for general-purpose programming of batch and stream processing, and fault tolerance using snapshotting with the integrated Hazelcast IMDG. The overall focus of the latest release is to increase developer productivity via a simple and intuitive API. Jet is a single library with no dependencies, so it is easily embedded and deployed without the need for multiple systems. Typical application use cases include online trades, sensor updates in IoT architectures, real-time fraud detection, system log events, in-store e-commerce systems and social media platforms.
The Pipeline API is the primary programming interface of Hazelcast Jet for batch and stream processing, and it is intended to make Jet appealing to a wider Java audience. It is a major enhancement over the low-level Hazelcast Jet Core API, which uses directed acyclic graphs (DAGs) to model data flow and allows detailed DAG assembly of processing jobs. The new Pipeline API is easier to use and provides developers with tools to compose batch computations from building blocks such as filters, aggregators and joiners. Hazelcast Jet 0.5 also offers the Java 8 Stream API, a well-known and popular API in the Java community that supports functional-style operations on streams of elements. The key point is that any Java developer should find the new Pipeline API familiar and productive.
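To give a sense of the functional style described above, here is a minimal sketch that uses only the JDK's own java.util.stream package on a single JVM, not Jet itself. The class name WordCountSketch, the method countWords and the sample lines are hypothetical and chosen for illustration. The sketch composes a splitting step, a filter and a grouping aggregator, which are the kinds of building blocks the announcement mentions; Jet's own method names and signatures are not shown here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration class; uses only the standard JDK Stream API.
class WordCountSketch {

    // Counts how often each word occurs across all lines.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                // split every line into lower-case words
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                // filter: drop empty tokens produced by leading separators
                .filter(word -> !word.isEmpty())
                // aggregate: group equal words and count them, sorted by word
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Jet is fast", "Jet is embeddable");
        // prints {embeddable=1, fast=1, is=2, jet=2}
        System.out.println(countWords(lines));
    }
}
```

The same filter-then-aggregate shape is what a pipeline-style API lets a developer express declaratively, leaving the execution strategy to the engine.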
“Since its first release, Jet has put the Fast in Fast Big Data with performance up to 15 times faster than Spark and Flink,” said Greg Luck, CEO of Hazelcast. “In this release we have been working on bringing Hazelcast’s legendary programming simplicity to Jet, which we think we have now achieved with the Pipeline API. Programmers, start your Jet engines.”
Also new is fault tolerance using distributed in-memory snapshots. In Hazelcast Jet 0.5, snapshots are distributed across the cluster and held in multiple replicas to provide redundancy. Jet is now able to tolerate multiple faults such as node failure, network partition or job execution failure. Snapshots are periodically created and backed up. If a node fails, Jet uses the latest state snapshot and automatically restarts all jobs that had the failed node as a participant. No additional infrastructure, such as a distributed file system or external snapshot storage, is necessary: Hazelcast Jet is fault tolerant out of the box.
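The idea of periodic snapshots held in more than one replica can be illustrated with a small single-threaded sketch. It is a toy model under stated assumptions, not Jet's implementation: the class SnapshotSketch, its fields, the two-replica array and the rule of resuming from the saved input position are all hypothetical. It sums an input array, saves the running state every few records, simulates one failure that wipes the working state and one replica, and then resumes from the surviving copy.

```java
import java.util.Arrays;

// Hypothetical toy model of snapshot-based recovery; not Jet's API.
class SnapshotSketch {

    // Immutable copy of job state: records consumed so far and the running sum.
    static final class Snapshot {
        final int position;
        final long sum;
        Snapshot(int position, long sum) {
            this.position = position;
            this.sum = sum;
        }
    }

    // Sums the input, snapshotting every `interval` records.
    // If failAt is a valid index, one failure is simulated just before that record.
    static long sumWithRecovery(int[] input, int interval, int failAt) {
        Snapshot[] replicas = new Snapshot[2];        // two in-memory copies
        Arrays.fill(replicas, new Snapshot(0, 0L));   // initial empty state
        boolean failed = false;
        int position = 0;
        long sum = 0;
        while (position < input.length) {
            if (!failed && position == failAt) {
                failed = true;
                replicas[0] = null;                   // one replica is lost with the node
                Snapshot latest = replicas[1];        // the surviving replica
                position = latest.position;           // rewind to the snapshot
                sum = latest.sum;
                continue;
            }
            sum += input[position];
            position++;
            if (position % interval == 0) {
                // periodic snapshot, written to every replica
                Arrays.fill(replicas, new Snapshot(position, sum));
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] input = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        // failure before index 8; recovery resumes from the snapshot taken at 6
        System.out.println(sumWithRecovery(input, 3, 8)); // prints 55
    }
}
```

Because the snapshot records both the running sum and the input position, the restarted run produces the same result as a run with no failure, at the cost of reprocessing the records consumed since the last snapshot.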
With this release, Jet is fully integrated with Hazelcast IMDG for sources, sinks and enrichment, and it offers both streaming and batch integrations. Through this integration, Jet gains access to an elastic, in-memory storage capability. Using data already held in IMDG rather than in external sources gives a fivefold performance improvement. We expect our millions of IMDG users to start leveraging Jet’s capabilities to process the data they already hold in IMDG.
Hazelcast Jet is an Apache 2 licensed open source project that performs parallel execution to enable data-intensive applications to operate in near real-time. Built on top of a one-record-per-time architecture (sometimes known as continuous operators), Hazelcast Jet processes incoming records as soon as possible, as opposed to accumulating records into micro-batches, and consequently lowers latency for applications. Hazelcast Jet is simple to program and deploy, and can be fully embedded for OEMs and microservices, making it easier for manufacturers to build and maintain next-generation systems.
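The latency argument can be made concrete with a little arithmetic. The sketch below is a simplified model, not a benchmark and not Jet code: the class LatencySketch, its method names and the sample arrival times are hypothetical, and processing time is ignored. It assumes that a micro-batch engine hands records to processing only at fixed batch boundaries, whereas a record-at-a-time engine hands each record over on arrival.

```java
// Hypothetical model comparing waiting time before processing starts.
class LatencySketch {

    // Wait for a record arriving at arrivalMs when batches close every batchMs.
    // A record arriving exactly on a boundary is assumed to wait for the next one.
    static long microBatchDelay(long arrivalMs, long batchMs) {
        long nextBoundary = (arrivalMs / batchMs + 1) * batchMs;
        return nextBoundary - arrivalMs;
    }

    // In the record-at-a-time model there is no batching wait at all.
    static long recordAtATimeDelay(long arrivalMs) {
        return 0L;
    }

    // Average batching wait over a set of arrival times.
    static double averageMicroBatchDelay(long[] arrivalsMs, long batchMs) {
        long total = 0;
        for (long arrival : arrivalsMs) {
            total += microBatchDelay(arrival, batchMs);
        }
        return (double) total / arrivalsMs.length;
    }

    public static void main(String[] args) {
        long[] arrivals = {100, 500, 900};   // milliseconds, sample data
        // waits are 900, 500 and 100 ms with a 1000 ms batch
        System.out.println(averageMicroBatchDelay(arrivals, 1000)); // prints 500.0
        System.out.println(recordAtATimeDelay(arrivals[0]));        // prints 0
    }
}
```

Under these assumptions, the batching wait averages roughly half the batch interval for evenly spread arrivals, which is the delay a record-at-a-time design avoids.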