Spark 101: Online Approximate OLAP in SparkSQL

The Hadoop Summit 2015 talk below introduces G-OLA, a parallel approximate query engine built on top of BlinkDB and SparkSQL. G-OLA offers a radically different “online execution” paradigm: it incrementally processes massive amounts of data on clusters of hundreds or thousands of machines while returning approximate answers. It presents the user with a meaningful approximate result (with error bars) that is continuously refined at a speed comfortable to the user, and it lets them control query execution on the fly. The slides for this presentation are available here.
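To make the “online execution” idea more concrete, here is a minimal sketch in plain Python of how an approximate average with error bars might be refined batch by batch. This is not G-OLA's implementation or API; the function name `online_mean`, the mini-batch size, and the use of a normal-approximation 95% confidence interval are all assumptions made for illustration. It also assumes rows arrive in random order, so that every prefix of the data behaves like a uniform sample.

```python
import math
import random


def online_mean(values, batch_size=1000, z=1.96):
    """Yield (rows_seen, estimate, half_width) after each mini-batch.

    Hypothetical helper for illustration only. Assumes `values` is in
    random order, so each prefix is a uniform sample of the full data.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for start in range(0, len(values), batch_size):
        for x in values[start:start + batch_size]:
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
        # Sample variance so far; the error bar shrinks roughly as 1/sqrt(n).
        variance = m2 / (n - 1) if n > 1 else 0.0
        half_width = z * math.sqrt(variance / n)
        yield n, mean, half_width


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(100.0, 15.0) for _ in range(10_000)]
    for rows, est, err in online_mean(data, batch_size=2500):
        print(f"{rows:>6} rows: mean ~ {est:.2f} +/- {err:.2f}")
```

Because the function is a generator, the caller sees each refined estimate as soon as it is available and can simply stop iterating once the error bar is tight enough, which loosely mirrors the idea of controlling query execution on the fly.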

 

 


Comments

  1. Revathy Hari says

    Can Spark be used to generate sequential patterns from dynamic streams of big data (especially with DNA sequences as the data set)?