Pepperdata Diagnoses Troublesome Hadoop Clusters With Complimentary Health Check

Pepperdata-logoHADOOP SUMMIT 2016 NEWS

Many organizations use Hadoop to gain competitive advantage, but because of the very nature of distributed systems, there are inherent performance limitations that even the most advanced Hadoop users will inevitably encounter. Pepperdata, the world’s experts in the performance of distributed systems at scale, today announced the availability of Hadoop Health Check, a complimentary, expert assessment that evaluates and diagnoses Hadoop clusters of 100 nodes or more, and provides full visibility into current cluster conditions. With the Hadoop Health Check program, organizations can easily determine how to improve their current Hadoop operations by solving the innate limitations of distributed systems with automated, policy-based prioritization and active management in their production Hadoop environments.

How Healthy is Your Hadoop Infrastructure?

When a company signs up for the Health Check program, the Pepperdata software is installed on a production cluster for up to 72 hours. During this time, the software collects all Hadoop performance data and provides a high-level diagnostic report with granular insight into the cause of common issues such as:

  • Problem users or jobs: Pinpoint which users or jobs consume the most resources, including CPU, RAM, disk, and network usage that could be causing performance complications. The tool can also determine individual costs of shared cluster resources based on actual usage.

  • Wasted cluster capacity: Identify where capacity isn’t fully utilized to increase throughput, and even benchmark utilization and performance against industry leaders.

  • Bottlenecks and root cause: Quickly identify where bottlenecks that hamper job completion occur. For enterprise users, Pepperdata’s Adaptive Performance Core™ uses real-time processing to observe and reshape application usage to prevent bottlenecks.

Pepperdata is rolling out a Health Check program that within days, or even hours, can can uncover and diagnose even the most insidious performance problems that creep in and hide inside large, complex big data operations.” said Mike Matchett, analyst, Taneja Group. “We expect that enterprises at the point of really scaling and growing a big data production stack will experience significant ROI with their first Health Check, and it’s a big step towards companies understanding that with tools like Pepperdata they have the capabilities they need to accelerate their big data deployment plans.”

Pepperdata Health Check Graphic

Solving Problems for Real Customers with Complete Cluster Diagnostics

Pepperdata Health Check helps companies identify performance issues that have been plaguing their Hadoop clusters, but have long gone unidentified or undiagnosed. Below are a few real world use cases that show the types of performance challenges Pepperdata has solved for companies:

  • A global storage company couldn’t figure out why a certain job that took eight hours one week only took one hour the week prior. With Pepperdata, the company was able to identify where a spike in user demand was causing jobs to queue for seven hours before they actually executed. This granular visibility also made it possible to correlate the spike in demand to a specific user and set of jobs.

  • A mobile analytics provider was having trouble determining why its cluster was running slowly. Pepperdata was able to pinpoint a widespread overload of disk input/output capacity as the cause of the problem. Because of this, the company was able to exclude the problem causing files so that jobs could complete on schedule.

  • An online real estate site was experiencing a very slow running EMR workflow. The workflow that had a 17-hour runtime was essentially a black box as all metrics were lost once the cluster terminated. Pepperdata allowed for real-time analysis of the workflow, in addition to comparative analysis between runs that identified task-level visibility inefficiencies in the code. By resolving this issue, the company reduced its job run time from 17 hours to three.

Pepperdata has helped numerous organizations obtain for the first time a full and clear picture of how their Hadoop clusters are performing in any distribution environment. With the new Health Check program we have made it much easier and accessible for more companies to see the value and improvement that our product brings,” said Sean Suchter, CEO and cofounder, Pepperdata. “Not only does Pepperdata dynamically solve the most common performance issues associated with distributed computing, but our support of all the major distributions means we can confidently offer a prescriptive data-backed assessment to guarantee optimal performance.”

Sign up for the free insideAI News newsletter.