ABSTRACT
Elasticity, in which systems acquire and release resources in response to dynamic workloads while paying only for what they need, is a driving property of cloud computing. At the core of any elastic system is an automated controller. This paper addresses elastic control for multi-tier application services that allocate and release resources in discrete units, such as virtual server instances of predetermined sizes. It focuses on elastic control of the storage tier, in which adding or removing a storage node or "brick" requires rebalancing stored data across the nodes. The storage tier presents new challenges for elastic control: actuator delays (lag) due to rebalancing, interference with applications and sensor measurements, and the need to synchronize the multiple control elements, including rebalancing.
We have designed and implemented a new controller for elastic storage systems to address these challenges. Using a popular distributed storage system, the Hadoop Distributed File System (HDFS), under dynamic Web 2.0 workloads, we show how the controller adapts to workload changes to maintain performance objectives efficiently in a pay-as-you-go cloud computing environment.
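To make the challenges named in the abstract concrete, the following is a minimal sketch of a feedback loop that resizes a storage tier in whole-node steps and holds off while a rebalance is in progress. It is not the paper's controller: the class name, the utilization setpoint, the gain, and the node limits are all hypothetical values chosen for illustration, and the control law is a plain integral rule.

```python
from dataclasses import dataclass, field


@dataclass
class ElasticStorageController:
    """Illustrative integral controller with a discrete actuator (hypothetical)."""

    nodes: int = 4              # current number of storage nodes
    target_util: float = 0.6    # assumed setpoint for the sensed metric
    gain: float = 4.0           # assumed integral gain (nodes per unit of error)
    min_nodes: int = 2          # assumed lower bound on cluster size
    max_nodes: int = 20         # assumed upper bound on cluster size
    rebalancing: bool = False   # True while data is being rebalanced
    _desired: float = field(init=False)

    def __post_init__(self) -> None:
        # The continuous control state starts at the current cluster size.
        self._desired = float(self.nodes)

    def step(self, measured_util: float) -> int:
        """Return the change in node count for this control interval."""
        if self.rebalancing:
            # Actuator lag: the previous resize is still moving data, and
            # sensor readings taken now are perturbed by that traffic,
            # so the controller neither acts nor accumulates error.
            return 0
        error = measured_util - self.target_util
        self._desired += self.gain * error
        self._desired = min(max(self._desired, self.min_nodes), self.max_nodes)
        # Discrete actuator: act only once a whole node of change has built up.
        delta = int(self._desired - self.nodes)
        if delta != 0:
            self.nodes += delta
            self.rebalancing = True
        return delta

    def rebalance_done(self) -> None:
        """Signal that rebalancing has finished and control may resume."""
        self.rebalancing = False


if __name__ == "__main__":
    controller = ElasticStorageController(nodes=4)
    print(controller.step(0.9))   # overloaded: requests one more node
    print(controller.step(0.9))   # still rebalancing: no action
    controller.rebalance_done()
    print(controller.step(0.2))   # underloaded: releases one node
```

Truncating the accumulated demand to a whole number of nodes acts as a dead band, so small errors around the setpoint do not cause the cluster to grow and shrink on alternate intervals.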