Harry Mangalam from the University of California, Irvine has posted a detailed comparison of the Fraunhofer and Gluster distributed filesystems. In a distributed filesystem, files are stored in whole or in part across different block devices, and clients interact with the storage servers directly, so aggregate I/O can scale in proportion to the number of servers. […]
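To make that scaling claim concrete, here is a minimal sketch (not FhGFS or GlusterFS code) of the striping idea: a read request is split into fixed-size chunks placed round-robin across servers, so the pieces can be fetched in parallel. The chunk size and placement policy are illustrative assumptions.

```python
# Illustrative striping layout: map a byte range onto chunk-sized
# pieces spread round-robin across storage servers.

CHUNK_SIZE = 1 << 20  # 1 MiB stripe unit (an assumed value)

def chunk_layout(offset: int, length: int, num_servers: int):
    """Yield (server_index, start_offset, size) pieces for a read request."""
    end = offset + length
    while offset < end:
        chunk_index = offset // CHUNK_SIZE
        server = chunk_index % num_servers          # round-robin placement
        chunk_end = (chunk_index + 1) * CHUNK_SIZE  # end of this stripe unit
        take = min(chunk_end, end) - offset
        yield server, offset, take
        offset += take

# A 4 MiB read over 4 servers touches each server once, so the four
# 1 MiB transfers can proceed in parallel -- hence I/O scaling with
# the number of servers.
for server, start, size in chunk_layout(0, 4 * CHUNK_SIZE, 4):
    print(f"server {server}: bytes {start}..{start + size - 1}")
```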
Green Graph 500 Launches to Boost Energy-Efficient Big Data Computing
In this special guest feature, Torsten Hoefler from ETH Zurich writes that the new Green Graph 500 aims to boost energy-efficient Big Data computing. “Big Data” can be analyzed in various ways. The most successful and prevalent programming model, MapReduce, stands out for its flexibility in adapting to hardware performance variations and faults. However, even though MapReduce covers […]
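As a concrete illustration of the MapReduce model the piece refers to, here is a minimal word-count sketch in plain Python, with the map, shuffle, and reduce phases spelled out. It is a toy sketch of the programming model, not Hadoop's API; all names are illustrative.

```python
# Toy MapReduce: map emits key/value pairs, the shuffle groups
# values by key, and reduce folds each group into a result.

from collections import defaultdict
from itertools import chain

def map_phase(record: str):
    """Map: emit (word, 1) for every word in an input record."""
    for word in record.split():
        yield word.lower(), 1

def reduce_phase(word: str, counts):
    """Reduce: fold all counts for one key into a total."""
    return word, sum(counts)

records = ["Big Data can be analyzed", "big data at scale"]

# Shuffle: group all mapped values by key.
groups = defaultdict(list)
for key, value in chain.from_iterable(map_phase(r) for r in records):
    groups[key].append(value)

results = [reduce_phase(word, counts) for word, counts in groups.items()]
print(sorted(results))  # [('analyzed', 1), ('at', 1), ..., ('big', 2), ('data', 2), ...]
```

Because map and reduce are pure per-record and per-key functions, a framework is free to rerun them elsewhere after a failure, which is the flexibility in the face of faults that Hoefler describes.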
A New Era in Genome Sequencing
In the midst of all the ballyhoo surrounding Big Data and how it’s going to “transform how we live, work, and think” (a borrowing from the subtitle of the excellent book Big Data by Viktor Mayer-Schönberger and Kenneth Cukier), it’s encouraging to hear about applications that are actually living up to all the hype. Case […]
2013: The Year ZFS Goes Big
Over at the Nexenta Blog, Evan Powell predicts that in 2013, ZFS will be recognized as the most broadly deployed storage file system in the world. In fact, it already is. We figure that we alone have half as much storage under management as NetApp claims. Add Oracle and you’re already bigger than any one storage file […]
A Contrast of Paradigms – HPCC Systems & Hadoop
Flavio Villanustre writes about the differences between two powerful open source Big Data platforms: HPCC and Hadoop. Both are open source projects released under the Apache 2.0 license and free to use, and both leverage commodity hardware and local storage interconnected through IP networks, allowing for parallel data processing and/or querying […]
A New HPC Problem: Checksums for Large Archives
Storage pundit Henry Newman writes that running checksums for large data archives is quickly becoming an HPC problem: Today, many preservation archives are well over 5PB and a few are well over 10PB, with expectations that these archives will grow to more than 100PB. With archives this large, the requirements for HPC architectures for checksum […]
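For a sense of why verification at this scale is an I/O problem, here is a minimal sketch of streaming checksums over large files, fanned out across worker processes. The file paths, hash choice, and buffer size are illustrative assumptions, not Newman's method.

```python
# Streaming checksum sketch: hash each file in fixed-size reads so
# memory use stays flat, and verify many files in parallel processes.

import hashlib
from concurrent.futures import ProcessPoolExecutor

def sha256_of(path: str, buf_size: int = 8 * 1024 * 1024) -> tuple:
    """Stream one file through SHA-256 in 8 MiB reads."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(buf_size):
            h.update(chunk)
    return path, h.hexdigest()

if __name__ == "__main__":
    # Hypothetical archive members for illustration.
    archive_files = ["/archive/tape0001.tar", "/archive/tape0002.tar"]
    with ProcessPoolExecutor() as pool:
        for path, digest in pool.map(sha256_of, archive_files):
            print(digest, path)
```

Parallelism helps, but every byte still has to be read end to end, so at 10PB and beyond the verification pass is bounded by aggregate storage bandwidth, which is exactly the HPC-architecture requirement Newman points to.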