ABSTRACT
Power-proportional cluster-based storage is an important component of an overall cloud computing infrastructure. With it, substantial subsets of nodes in the storage cluster can be turned off to save power during periods of low utilization. Rabbit is a distributed file system that arranges its data layout to provide ideal power-proportionality down to a very low minimum number of powered-up nodes (enough to store a primary replica of available datasets). Rabbit addresses the node failure rates of large-scale clusters with data layouts that minimize the number of nodes that must be powered up if a primary fails. Rabbit also allows different datasets to use different subsets of nodes as a building block for interference avoidance when the infrastructure is shared by multiple tenants. Experiments with a Rabbit prototype demonstrate its power-proportionality, and simulation experiments demonstrate its properties at scale.
Index Terms
- Robust and flexible power-proportional storage