Survey Paper
Cloud monitoring: A survey
Introduction
Cloud Computing [1] has rapidly become a widely adopted paradigm for delivering services over the Internet. This is due to a number of technical reasons, including improved energy efficiency, optimized utilization of hardware and software resources, elasticity, performance isolation, flexibility, and an on-demand service schema [2]. In addition to such technical benefits, the literature has shown that the Cloud Computing model provides several economic benefits, including minimal capital and operational expenditures (CAPEX and OPEX). For all these reasons, the number of organizations adopting Cloud solutions and of subscribers accessing Cloud services has rapidly increased, exceeding the optimistic initial expectations, and so has the complexity of Cloud systems. Cloud services are on-demand, elastic, and scalable, and a Cloud system therefore needs the following main features: availability, concurrency, dynamic load balancing, independence of running applications, security, and intensiveness (as defined and analyzed in [3]). To provide these features, advanced virtualization techniques, robust and dynamic scheduling approaches, advanced security measures, and disaster recovery mechanisms are implemented and operated in Cloud Computing systems. Data centers for Cloud Computing continue to grow in terms of both hardware resources and traffic volume, thus making Cloud operation and management more and more complex [149].
In this scenario, accurate and fine-grained monitoring activities are required to efficiently operate these platforms and to manage their increasing complexity.
In the literature, a large number of works propose surveys and taxonomies of Cloud Computing in general [4], [5], [6], [7], [8], [9], [10], of Virtualization technologies [11], [12], and of Cloud Security [13], [14], [15], [16], [17], [18], [19]. To the best of our knowledge, however, there are no specific surveys on platforms, techniques, and tools for monitoring Cloud infrastructures, services, and applications. This is what we define as Cloud monitoring.
In this paper, we provide a survey of Cloud monitoring, analyzing the multifaceted state of the art in this field. Following the indications reported in [126], we adopt the research methodology depicted in Fig. 1 and described in the following.
- We select a well-known taxonomy of the terms and roles in the field of Cloud Computing to contextualize the contributions we provide in this paper. To this aim, we use the work of the National Institute of Standards and Technology (NIST) [1], [20].
- After analyzing the literature in the field of Cloud Computing, using the definitions proposed by NIST, we provide a two-axis taxonomy for Cloud monitoring:
  - one axis covers the several motivations for monitoring Cloud Computing (Section 3);
  - the other axis is further expanded along three dimensions: layers; abstraction levels; tests and metrics (Section 4).
- Building on the results of the previous step, we analyze several research works to derive the main properties of systems for Cloud monitoring, the issues associated with such properties, and the contributions in the literature regarding these properties and issues (Section 5). Moreover, we analyze a number of commercial and open source platforms and a number of services for Cloud monitoring, also highlighting their relation to the properties and issues discussed before (Section 6).
- The previous steps provide us with the inputs to derive the open issues and the future directions in the field of Cloud monitoring (Section 7).
Section snippets
Cloud Computing: a brief overview
According to the NIST, Cloud Computing is defined as follows [1]:
“Model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
The NIST and Cloud community have also defined the following important concepts: (i) Essential characteristics, (ii) service models, (iii) hosting, (iv)
Cloud Computing: the need for monitoring
Monitoring the Cloud is a task of paramount importance for both Providers and Consumers. On the one hand, it is a key tool for controlling and managing hardware and software infrastructures; on the other hand, it provides information and Key Performance Indicators (KPIs) for both platforms and applications. The continuous monitoring of the Cloud and of its SLAs (for example, in terms of availability, delay, etc.) supplies both the Providers and the Consumers with information such as the workload
Cloud Monitoring: basic concepts
As introduced in Section 3, Cloud monitoring is needed to continuously measure and assess infrastructure or application behaviors in terms of performance, reliability, power usage, ability to meet SLAs, security, etc. [44], to perform business analytics, for improving the operation of systems and applications [45], and for several other activities (see Section 3). In this section we introduce a number of concepts at the base of Cloud monitoring that are used to set the context for the following
Cloud monitoring: properties and related issues
In order to operate properly, a distributed monitoring system is required to have several properties that, when considered in the Cloud Computing scenario, introduce new issues. In this section we define and motivate such properties, analyze the issues arising from them, and discuss how these issues have been addressed in the literature. In Fig. 2 we report these properties in a taxonomy of the main aspects of Cloud monitoring considered in this paper.
In Fig. 3 we illustrate the research issues
Cloud monitoring platforms and services
In this section we review the most widespread commercial and open source platforms for Cloud monitoring, as well as services that can help Consumers assess the performance and the reliability of Cloud services (see Table 2). We describe both Cloud management solutions that contain a module specifically targeted at monitoring and solutions whose only target is Cloud monitoring.
Cloud monitoring: open issues and future directions
The infrastructure of a Cloud is very complex. This complexity translates into more effort needed for management and monitoring. The greater scalability and larger size of Clouds, compared to traditional service hosting infrastructures, require more complex monitoring systems, which must therefore be more scalable, robust, and fast. Such systems must be able to manage and verify a large number of resources and must do so effectively and efficiently. This has to be achieved through short
Conclusion
In this paper we have provided a careful analysis of the state of the art in the field of Cloud monitoring. Fig. 2 shows a taxonomy containing a quick snapshot of the main aspects we have considered in this paper. In more detail, we have discussed the main activities in Cloud environments that strongly benefit from, or actually need, monitoring. To contextualize and study Cloud monitoring, we have provided background and definitions for key concepts. We have also derived the main properties
Acknowledgements
This work has been partially funded by PLATINO (PON01_01007) by MIUR, by the “SMART HEALTH CLUSTER OSDH - SMART FSE - STAYWELL” (PON04a2_C) project by MIUR, by the “Un sistema elettronico di elaborazione in tempo reale per l’estrazione di informazioni da video ad alta risoluzione, alto frame rate e basso rapporto segnale rumore” Project of the F.A.R.O. Programme, and by a Google Faculty Award for the UBICA project. We thank the Editor and the anonymous reviewers for the valuable comments that helped
References (150)
- P. Mell, T. Grance, The NIST Definition of Cloud Computing, NIST Special Publication 800-145, 2011....
- et al., A view of cloud computing, Commun. ACM (2010)
- et al., CM-measurement facets for cloud performance, International Journal of Computer Applications (2011)
- et al., State of the art and research challenges of new services architecture technologies: virtualization, SOA and cloud computing, International Journal of Grid and Distributed Computing (2010)
- C. Gong, J. Liu, Q. Zhang, H. Chen, Z. Gong, The characteristics of cloud computing, in: 39th International Conference...
- et al., Cloud computing research and development trend, ICFN ’10 Second International Conference on Future Networks (2010)
- et al., An advanced survey on cloud computing and state-of-the-art research issues, IJCSI (2012)
- B.P. Rimal, E. Choi, I. Lumb, A taxonomy and survey of cloud computing systems, in: NCM’09. Fifth International Joint...
- I. Foster, Y. Zhao, I. Raicu, S. Lu, Cloud computing and grid computing 360-degree compared, in: Grid Computing...
- et al., A network-oriented survey and open issues in cloud computing
- A survey of network virtualization, Computer Networks
- Addressing cloud computing security issues, Future Generation Computer Systems
- A survey on gaps threat remediation challenges and some thoughts for proactive attack detection in cloud computing, Future Generation Computer Systems
- A survey on cloud computing security, challenges and threats, International Journal
- Security challenges for the public cloud, IEEE Internet Computing
- Capacity Planning for Web Services: Metrics, Models, and Methods
- On demand fine grain resource monitoring system for server consolidation
- A survey of mobile cloud computing: Architecture, applications and approaches, Wireless Communications and Mobile Computing
- Trustworthy and resilient monitoring system for cloud infrastructures
- Cloud application monitoring: the mOSAIC approach
- Monitoring cloud computing by layer, Part 1, IEEE Security & Privacy
- Monitoring cloud computing by layer, Part 2, IEEE Security & Privacy
- The RESERVOIR model and architecture for open federated cloud computing, IBM Journal of Research and Development
- Monalytics: online monitoring and analytics for managing large scale data centers
- Auto-scaling, load balancing and monitoring in commercial and open-source clouds
Giuseppe Aceto is a Ph.D. student in Electronic and Telecommunications Engineering at the Department of Electrical Engineering and Information Technologies of the University of Napoli Federico II (Italy), where he received his M.S. degree in Telecommunications Engineering in 2008, defending a thesis about a unified platform for available bandwidth estimation in heterogeneous IP networks. He is a junior researcher in the same Department. His research interests are focused on networking, more specifically on network measurements and traffic analysis. Giuseppe Aceto is coauthor of papers in international journals (ACM Performance Evaluation Review, Elsevier Journal of Network and Computer Applications) and at international conferences (IEEE International Workshop on Measurements & Networking, IEEE INFOCOM 2010, IEEE Symposium on Computer and Communications). In 2010 he was awarded the best local paper award at IEEE ISCC 2010.
Alessio Botta is a postdoc at the Department of Electrical Engineering and Information Technologies of the University of Napoli Federico II (Italy). He graduated in Telecommunications Engineering (M.S.) and obtained the Ph.D. in Computer Engineering and Systems, both at University of Napoli Federico II. His research interests are in the area of networking and, in particular, in the area of network performance measurement and improvement, with a specific focus on wireless and heterogeneous systems. Alessio Botta has coauthored more than 40 international journal (IEEE Communications Magazine, IEEE Transactions on Parallel and Distributed Systems, Elsevier Computer Networks, etc.) and conference (IEEE Globecom, IEEE ICC, IEEE ISCC, etc.) publications. He has served and serves on the technical program committees of several international conferences (IEEE Globecom, IEEE ICC, etc.) and he acts as reviewer for different international conferences (IEEE Infocom, etc.) and journals (IEEE Transactions on Mobile Computing, IEEE Network, IEEE Transactions on Vehicular Technology, etc.) in the area of networking. In 2010 he was awarded the best local paper award at IEEE ISCC 2010.
Walter de Donato is a postdoc at the Department of Electrical Engineering and Information Technologies of the University of Napoli Federico II (Italy). He received an M.S. degree in Computer Engineering and a Ph.D. in Computer Engineering from the same University. His research activity mainly concerns methodologies, techniques, and distributed architectures for measuring, analyzing, classifying, and monitoring network traffic, and also covers the following topics: network topology discovery and mapping, Linux-based embedded systems, network processor architectures, and content distribution networks. Walter de Donato has coauthored several international conference (ACM Sigcomm, PAM, IEEE Globecom, etc.) publications. He served as reviewer for several international conferences (IEEE Globecom, IEEE ICC, etc.) and journals (Computer Networks, etc.) in the area of networking. In 2011 he was awarded the TEA (Technologybiz Endorsement Award).
Antonio Pescapè is an Assistant Professor at the Department of Electrical Engineering and Information Technologies of the University of Napoli Federico II (Italy) and Honorary Visiting Senior Research Fellow at the School of Computing, Informatics and Media of the University of Bradford (UK). He received the M.S. Laurea Degree in Computer Engineering and the Ph.D. in Computer Engineering and Systems, both at University of Napoli Federico II. Antonio Pescapè teaches courses in Computer Networks, Computer Architectures, Programming, and Multimedia and he has also supervised and graduated more than 130 B.S., M.S., and Ph.D. students. His research interests are in the networking field with focus on Internet Monitoring, Measurements and Management and on Network Security. Antonio Pescapè has coauthored over 130 journal (IEEE Communications Magazine, JSAC, IEEE Wireless Communications Magazine, IEEE Networks, etc.) and conference (SIGCOMM, IMC, PAM, Globecom, ICC, etc.) publications and he is co-author of several pending patents. He has served and serves on more than 150 technical program committees of IEEE and ACM conferences. He has served as Editorial Board Member of IEEE Survey and Tutorials (2007–2010) and was guest editor for the special issue of Computer Networks on “Traffic classification and its applications to modern networks”. For his research activities he has received several awards. In 2009 he was awarded the IET Communications Premium Award 2009; in 2010 he was awarded the best local paper award at IEEE ISCC 2010; in November 2011 he was awarded the TEA (Technologybiz Endorsement Award); he was awarded by the Open Source Software World Challenge for the D-ITG platform in 2011 and for the TIE platform in 2012; in 2012 two of his papers were awarded the IRTF ANRP (Applied Networking Research Prize) and he was awarded the Best Poster award at SIGCOMM 2012; he was awarded the Google Faculty Award in 2013. He is a Senior Member of the IEEE.
Finally, Antonio Pescapè has served and serves as independent reviewer/evaluator of research and implementation projects and project proposals co-funded by the Swedish government, several Italian local governments, the Italian Ministry for University and Research (MIUR), and the Italian Ministry of Economic Development (MISE).
- 1
Preliminary results within the same framework have been published in G. Aceto, A. Botta, W. de Donato, A. Pescapè, “Cloud monitoring: definitions, issues and future directions”, 1st IEEE International Conference on Cloud Networking (IEEE CloudNet’12), Paris (France), November 28–30, 2012.