In this video, Bill Bain from ScaleOut Software describes how the company’s in-memory data grid technology provides operational intelligence.
With the advent of Web and application server farms and HPC compute grids, database servers have increasingly been used to hold mission-critical but relatively short-lived, fast-changing data that must be accessible across the farm. In-memory data grids (IMDGs) offer an important storage alternative that dramatically lowers cost and boosts application performance. The foundation of an IMDG is fast, scalable distributed caching, which keeps rapidly changing data close to where it is needed and quickly accessible. By moving data closer to where it is used and taking advantage of a server farm’s inherent scalability, this technology significantly enhances overall application performance while making more efficient use of the data center’s computational, networking, and storage resources. Beyond caching, in-memory data grids add analysis capabilities, employing the server farm’s scalable computational resources to enable fast, in-depth insights into trends and patterns, along with comprehensive tools for managing and archiving data.
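To make the distributed-caching idea concrete, here is a minimal Java sketch of the access pattern an IMDG provides. The GridCache class and its create/read/update methods are hypothetical stand-ins backed by a single in-process map, not ScaleOut’s actual API; a real grid would partition and replicate these entries across the servers in the farm.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an IMDG namespace: keyed objects held in memory and
// created, read, and updated as application state changes. A real grid partitions
// and replicates these entries across the server farm; a ConcurrentHashMap is used
// here only to show the access pattern.
final class GridCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();

    void create(K key, V value) { store.put(key, value); }
    V read(K key)               { return store.get(key); }
    void update(K key, V value) { store.replace(key, value); }
    void remove(K key)          { store.remove(key); }
}

public class ImdgSketch {
    record Session(String userId, long lastSeenMillis) {}

    public static void main(String[] args) {
        GridCache<String, Session> sessions = new GridCache<>();

        // Fast-changing, short-lived data stays in memory, close to the web farm,
        // rather than being written to a database server on every request.
        sessions.create("sess-42", new Session("alice", System.currentTimeMillis()));
        Session s = sessions.read("sess-42");
        sessions.update("sess-42", new Session(s.userId(), System.currentTimeMillis()));
    }
}
```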
Full Transcript:
insideAI News: Let’s start at the beginning. Who is ScaleOut Software and who do you help?
Bill Bain: ScaleOut Software is a software company based both here in Beaverton, Oregon, and in Bellevue, Washington. We’ve been in business now for ten years, and our goal is to help applications use in-memory data to scale their performance, so they can handle big workloads and give very fast response times to customers. Most recently, we’ve been working on data analytics to provide what we’re calling “operational intelligence”: giving operational sites that handle fast-changing data very fast, real-time feedback within milliseconds or seconds.
insideAI News: So give me a customer scenario. Who might want to use this kind of analytic software?
Bill Bain: Sure. Our traditional customer is typically an e-commerce customer that is hosting a web farm. They would want to use this technology to host their fast-changing shopping cart data, their session state, in memory, to be able to scale that to handle hundreds of thousands of customers and get very, very fast response times on their websites. With analytics, you now have the ability to do personalized recommendations, because you can watch what’s in the shopping carts, enrich that with information from, say, a NoSQL store holding the customer’s history and preferences, and then offer recommendations that are personalized for them and much more likely to lead to a sale.
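A rough sketch of the enrichment pattern Bain describes, assuming hypothetical Cart and Profile types and a plain map standing in for the NoSQL history store; a real deployment would run this logic inside the grid, next to the live cart data.

```java
import java.util.List;
import java.util.Map;

public class RecommendationSketch {
    record Cart(String customerId, List<String> productIds) {}
    record Profile(String customerId, List<String> preferredCategories) {}

    // Hypothetical enrichment step: combine the live cart (held in the grid) with
    // customer history (fetched from a NoSQL store) to choose a targeted offer.
    static String recommend(Cart cart, Profile profile) {
        if (cart.productIds().contains("tent")
                && profile.preferredCategories().contains("camping")) {
            return "sleeping-bag";     // relevant while the customer is still shopping
        }
        return "featured-item";        // generic fallback offer
    }

    public static void main(String[] args) {
        // Stand-in for the NoSQL store holding customer history and preferences.
        Map<String, Profile> profiles =
                Map.of("alice", new Profile("alice", List.of("camping")));

        Cart liveCart = new Cart("alice", List.of("tent"));
        System.out.println(recommend(liveCart, profiles.get(liveCart.customerId())));
    }
}
```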
insideAI News: So it sounds like time would be of the essence to get that done, or they’re going to go off and maybe not buy it or whatever, right?
Bill Bain: That’s right. In fact, you have to react while the customer is actively shopping and you have to get targeted recommendations which fit their needs and their desires at that moment, or you’ve missed the opportunity.
insideAI News: So you’ve got this capability. Are we talking about terabytes in-memory or more on websites?
Bill Bain: That’s a very good question. Typically, the amount of data stored in memory is in the hundreds of gigabytes to terabytes. When you get up to petabytes, then you would use standard big-data techniques like Hadoop MapReduce, store that data on disk, and analyze it over the course of minutes or hours.
insideAI News: Like a batch job, because that’s so much information to move around?
Bill Bain: That’s right.
insideAI News: I was going to ask, we’ve got these new Intel chips with very, very large memory capability. Does that help with your value proposition? Do the two work together?
Bill Bain: Absolutely, because the more memory you can put on a chip, and therefore on a server, the more data you can host in memory and analyze very, very quickly without having to go to SSDs or disk to hold that data. But equally interesting are multi-core systems. Now that we have processors with 8, 16, and now 32 cores, you can do the analysis that much faster. In fact, in-memory computing really takes advantage of all of these capabilities: larger memory systems and multi-core systems deliver fast response times.
insideAI News: Everything I read, Bill, is about Hadoop analytics moving to real time with things like Apache Spark. Do you guys play in that at all?
Bill Bain: Absolutely. You’re right that Spark plays in real-time analytics. The difference between us and Spark, Storm, and other techniques is that we’re designed to run in operational environments where we’re hosting data that reflects the state of a live system. These might be airline reservations, they might be incoming wire transfers, they could be the shopping carts we talked about, it could be logistics, it could be military assets on a battlefield. All of these types of data need to be held in memory in a highly available manner. Our technology, what we call an in-memory data grid, was designed for both scalability and integrated high availability, whereas Spark came out of the Hadoop community as a Berkeley project to accelerate MapReduce and extend its operators to do data-parallel computing. So you would find it more in the back office today, although with Spark Streaming you’re starting to see it in real-time environments.
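As a sketch of the kind of data-parallel analysis Bain contrasts with Spark, the hypothetical example below aggregates over in-memory cart objects; a parallel stream stands in for the per-partition evaluation and merge that a grid would perform across its servers.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DataParallelSketch {
    record Cart(String customerId, List<String> productIds) {}

    public static void main(String[] args) {
        // Stand-in for cart objects spread across the grid's partitions.
        List<Cart> carts = List.of(
                new Cart("alice", List.of("tent", "stove")),
                new Cart("bob",   List.of("tent")),
                new Cart("carol", List.of("boots")));

        // Data-parallel aggregation: in a grid, each server would evaluate its own
        // partition and the partial results would be merged; a parallel stream plays
        // that role in this single-process sketch.
        Map<String, Long> productCounts = carts.parallelStream()
                .flatMap(c -> c.productIds().stream())
                .collect(Collectors.groupingBy(p -> p, Collectors.counting()));

        System.out.println(productCounts);   // e.g. {boots=1, stove=1, tent=2}
    }
}
```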
insideAI News: So, you’re giving a talk here today at the BIGDATA meetup in Portland. What are you going to be telling folks today?
Bill Bain: Well, I’m going to be showing them the power of in-memory computing to deliver what we are calling operational intelligence. We have worked with several customers and developed scenarios in which this is valuable. For example, cable TV viewers can be tracked and they can be up-sold with offers that are personalized to what they are doing at that moment with their cable television. So, as they change channels and the cable company knows about their preferences and history, they can serve up offers that are very relevant to them, just as an example.