Abstract
Energy is a growing component of the operational cost for many "big data" deployments, and hence has become increasingly important for practitioners of large-scale data analysis who require scale-out clusters or parallel DBMS appliances. Although a number of recent studies have investigated the energy efficiency of DBMSs, none of these studies have looked at the architectural design space of energy-efficient parallel DBMS clusters. There are many challenges to increasing the energy efficiency of a DBMS cluster, including dealing with the inherent scaling inefficiency of parallel data processing, and choosing the appropriate energy-efficient hardware. In this paper, we experimentally examine and analyze a number of key parameters related to these challenges for designing energy-efficient database clusters. We explore the cluster design space using empirical results and propose a model that considers the key bottlenecks to energy efficiency in a parallel DBMS. This paper represents a key first step in designing energy-efficient database clusters, which is increasingly important given the trend toward parallel database appliances.