Survey Paper
Topology management techniques for tolerating node failures in wireless sensor networks: A survey
Introduction
The growing interest in applications of wireless sensor networks (WSNs) has motivated substantial research in recent years [1], [2], [3], [4]. For some of these applications, such as space exploration, coastal and border protection, combat field reconnaissance and search and rescue, it is envisioned that a set of mobile sensor nodes will be employed to collaboratively monitor an area of interest and track certain events or phenomena. By having these sensors operate unattended in harsh environments, it would be possible to avoid the risk to human life and decrease the cost of the application.
Since a sensor node is typically constrained in its energy, computation and communication resources, a large set of sensors is involved to ensure area coverage and increase the fidelity of the collected data. Upon their deployment, nodes are expected to stay reachable to each other and form a network. Network connectivity enables nodes to coordinate their actions while performing a task, and to forward their readings to in situ users or a base-station (BS) that serves as a gateway to remote command centers [5], [6]. In fact, in many setups, such as a disaster management application, nodes need to collaborate with each other in order to effectively search for survivors, assess damage and identify safe escape paths. To enable such interactions, nodes need to stay reachable to each other and route data to the BS. Therefore, the inter-sensor connectivity as well as the sensor-BS connectivity have a significant impact on the effectiveness of WSNs and should be sustained at all times.
However, a sudden failure of a node can disrupt the network operation. A node may fail due to external damage inflicted by the inhospitable surroundings or simply because of a hardware malfunction. The loss of a node can break communication paths in the network and make some of its neighbors unreachable. Moreover, WSNs operating in a harsh environment may suffer from large-scale damage that partitions the network into disjoint segments. For example, in a battlefield, parts of the deployment area may be attacked by explosives; a set of sensor nodes in the vicinity would thus be destroyed and the surviving nodes split into disjoint partitions (segments). Restoring inter-segment connectivity is then crucial for the WSN to become operational again.
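The partitioning effect described above can be illustrated with a minimal Python sketch. It assumes a unit-disk communication model (two nodes are linked when they lie within a common range), and the node names, positions and range are hypothetical values chosen for illustration; none of them come from the surveyed work.

```python
from collections import deque
from math import dist


def build_links(positions, comm_range):
    """Link every pair of nodes within comm_range (unit-disk model, an assumption)."""
    ids = list(positions)
    links = {n: set() for n in ids}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if dist(positions[a], positions[b]) <= comm_range:
                links[a].add(b)
                links[b].add(a)
    return links


def surviving_segments(links, failed):
    """Return the disjoint segments (connected components) among surviving nodes."""
    alive = set(links) - set(failed)
    seen, segments = set(), []
    for start in sorted(alive):
        if start in seen:
            continue
        # Breadth-first search that never crosses a failed node.
        segment, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            segment.add(node)
            for nbr in links[node]:
                if nbr in alive and nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        segments.append(segment)
    return segments


# Hypothetical layout: a chain of five nodes spaced 10 m apart, range 12 m.
positions = {"s1": (0, 0), "s2": (10, 0), "s3": (20, 0), "s4": (30, 0), "s5": (40, 0)}
links = build_links(positions, comm_range=12)

print(surviving_segments(links, failed=[]))      # one segment: network connected
print(surviving_segments(links, failed=["s3"]))  # two segments: network partitioned
```

In this layout, losing the middle node s3 leaves two segments that cannot reach each other, whereas losing the end node s1 leaves the remaining four nodes connected. Whether a failure partitions the network therefore depends on the position of the failed node in the topology, not merely on the number of nodes lost.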
In this paper, we first highlight the challenges that node failures introduce to the operation of WSNs and provide a taxonomy of recovery techniques that are geared toward restoring network connectivity. We categorize fault-tolerance techniques proposed in the literature according to the pursued recovery methodology into proactive and reactive techniques. Further classification is done within each category based on the system assumptions, required network state, metrics and objectives for the recovery process, etc. Under each category, we discuss several algorithms and highlight their strengths and weaknesses. Finally, we enumerate open research issues that are yet to be investigated by the research community. To the best of our knowledge, this paper is the first to survey contemporary connectivity-centric fault-tolerance schemes for WSNs, and it sheds light on several practical issues for application designers. It will also be a good resource for newcomers to this research area.
Since the process of providing fault-tolerance is in general a form of topology management (i.e., it often leads to changes in the network topology parameters), we start in Section 2 with an overview of contemporary techniques and objectives of topology management in WSNs. The rest of the paper is organized as follows. In Section 3, we describe our categorization of the existing approaches. The remaining sections follow this categorization. Section 4 discusses techniques for tolerating a single node failure or a sequence of independent and non-simultaneous failures affecting non-collocated nodes. Recovery from simultaneous failure of multiple nodes is covered in Section 5. Section 6 enumerates open issues and outlines possible future research directions. Finally, Section 7 concludes the paper.
Topology management techniques in WSNs
Networks require monitoring and maintenance whether they are wired or wireless. The service that performs these tasks is called network management. Network management includes five functional areas as identified by the International Organization for Standardization (ISO): configuration management, fault management, security management, performance management and accounting management [1], [5], [7]. The unique requirements and constraints of wireless networks such as WSNs have inspired a new
Classification of fault tolerance techniques in WSNs
In this section, we classify fault-tolerance techniques in WSNs that are applied in response to the loss of sensor nodes. Depending on the nature of the failure, different approaches may be required. Therefore, before describing the classification of the fault-tolerance techniques, we first explain the different failure models.
Tolerating single and non-collocated failures
As pointed out in the previous section, published techniques for tolerating a node failure that causes network partitioning either provision fault-tolerance in the network topology both at setup and during normal operation, or pursue a reactive strategy by repositioning healthy nodes. In this section we discuss both tolerance strategies, in the context of a single node failure or a sequence of independent and non-simultaneous failures affecting non-collocated nodes. Recovery from simultaneous
Tolerating multi-node failures
Due to the harsh surroundings, more than one sensor node may fail simultaneously. In addition, the network may suffer large-scale damage that involves many nodes and would thus create multiple disjoint segments. Restoring connectivity in this case is more challenging than in the single node failure scenarios. In cases where the simultaneously failed nodes are not spatially adjacent, the problem is tackled as multiple instances of single node failure with special handling of potential resource
Future research issues
While significant research has been done on topology management techniques to restore connectivity in partitioned WSNs, several directions need further exploration. The following are some open research problems that warrant additional investigation, grouped based on the recovery methodology.
Conclusion
In many applications, WSNs operate in inhospitable setups, e.g., a battlefield, and the nodes become subject to an increased risk of damage. Furthermore, nodes are equipped with small batteries and their operation ceases upon depleting their onboard energy supply. The failure of nodes may not only impact coverage and data fidelity but can also cause the network to be divided into disjoint blocks of nodes. The latter can lead to major degradation of the WSN functionality since the failure
Acknowledgments
This work was supported by the National Science Foundation (NSF) awards # CNS 1018171 and CNS 1018404.
References (115)
Wireless sensor networks: a survey. Computer Networks (2002).
Strategies and techniques for node placement in wireless sensor networks: a survey. Journal of Ad-Hoc Networks (2008).
A survey on clustering algorithms for wireless sensor networks. Computer Communications (2007).
Sink repositioning for enhanced performance in wireless sensor networks. Computer Networks (2005).
Trading latency for energy in densely deployed wireless ad hoc networks using message ferrying. Journal of Ad Hoc Networks (May 2007).
Steiner tree problem with minimum number of Steiner points and bounded edge-length. Information Processing Letters (1999).
Relay node placement in large scale wireless sensor networks. Computer Communications, Special Issue on Wireless Sensor Networks (2006).
On constructing k-connected k-dominating sets in wireless ad-hoc and sensor networks. Journal of Parallel and Distributed Computing (2006).
Approximations for Steiner trees with minimum number of Steiner points. Theoretical Computer Science (2001).
Improved approximation algorithms for uniform connectivity problems. Journal of Algorithms (1996).
Efficient deployment of wireless sensor networks targeting environment monitoring applications. Computer Communications.
Coverage-aware connectivity restoration in mobile sensor networks. Elsevier Journal of Network and Computer Applications.
Localized motion-based connectivity restoration algorithms for wireless sensor actor networks. Journal of Network and Computer Applications.
Recovery from multiple simultaneous failures in wireless sensor networks using minimum Steiner tree. Journal of Parallel and Distributed Systems.
Handling large-scale node failures in mobile sensor/robot networks. Journal of Network and Computer Applications.
Relay node placement in structurally damaged wireless sensor networks via triangular Steiner tree approximation. Computer Communications.
Sensor networks: evolution, opportunities, and challenges. Proceedings of the IEEE.
Networking issues in wireless sensor networks. Journal of Parallel and Distributed Computing (JPDC), Special Issue on Frontiers in Distributed Sensor Networks.
Protocols and Architectures for Wireless Sensor Networks.
The design space of wireless sensor networks. IEEE Wireless Communications.
STREAM: sensor topology retrieval at multiple resolutions. Journal of Telecommunications, Special Issue on Wireless Sensor Networks.
Optimizing sensor networks in the energy-latency-density design space. IEEE Transactions on Mobile Computing.
Transmission power control techniques for wireless sensor networks. Computer Networks.
Design and Analysis of Fault-Tolerant Digital Systems.
Optimized relay placement for wireless sensor networks federation in environmental applications. Wireless Communication and Mobile Computing.
Towards augmented connectivity with delay constraints in WSN federation. International Journal of Ad Hoc and Ubiquitous Computing.
Mohamed Younis is currently an associate professor in the Department of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County (UMBC). He received his Ph.D. degree in computer science from New Jersey Institute of Technology, USA. Before joining UMBC, he was with the Advanced Systems Technology Group, an Aerospace Electronic Systems R&D organization of Honeywell International Inc. While at Honeywell he led multiple projects for building integrated fault tolerant avionics and dependable computing infrastructure. He also participated in the development of the Redundancy Management System, which is a key component of the Vehicle and Mission Computer for NASA’s X-33 space launch vehicle. His technical interests include network architectures and protocols, wireless sensor networks, embedded systems, fault tolerant computing, secure communication and distributed real-time systems. He has published over 180 technical papers in refereed conferences and journals. He has five granted and two pending patents. In addition, he serves/served on the editorial board of multiple journals and the organizing and technical program committees of numerous conferences. He is a senior member of the IEEE.
Izzet F. Senturk received BS and MEng in Computer Science from Ege University, Turkey in 2006 and Cornell University in 2008 respectively. Currently he is a PhD candidate in the department of Computer Science at Southern Illinois University. His areas of interest are mobility and fault-tolerance in wireless sensor networks and energy-aware protocol design for mobile sensor networks. He is a member of IEEE.
Kemal Akkaya is an associate professor in the Department of Computer Science at Southern Illinois University, Carbondale, IL. He received his BS and MS in Computer Engineering from Bilkent University, Turkey and Middle-East Technical University, Turkey in 1997 and 1999 respectively. After working as a software developer in Ankara, Turkey, he moved to the US in 2000 to pursue a PhD degree in Computer Science. He received his PhD in Computer Science from University of Maryland Baltimore County in 2005 and joined the Department of Computer Science at Southern Illinois University in 2005. He was a visiting professor at The George Washington University in Fall 2013. His current research interests include energy aware routing, topology control, security and quality of service issues in a variety of wireless networks such as sensor networks, sensor-actor networks, multimedia sensor networks, smart-grid communication networks and vehicular networks. He is a member of IEEE. He is the area editor of Elsevier Ad Hoc Network Journal. He has served as the guest editor for Journal of High Speed Networks, Computer Communications Journal, Elsevier Ad Hoc Networks Journal and in the TPC of many leading wireless networking conferences including IEEE ICC, Globecom, LCN and WCNC. He has published over 80 papers in peer-reviewed journals and conferences. He received the "Top Cited" article award from Elsevier in 2010.
Sookyoung Lee received the M.S. and Ph.D. degrees in Computer Science from the Ewha Womans University, Korea, and University of Maryland, Baltimore County, USA in 1997 and 2010 respectively. She was with LG ELECTRONICS Inc., Electronics and Telecommunications Research Institute, Korea Electrics Technology Institute, and Samsung Electronics Co. LTD., Korea from 1998 to 2004. While at LG, she developed the IP data server over ATM switch and implemented virtual private network service for multi-protocol label switching system. She was a volunteer of IPv6 forum Korea while at ETRI and developed the network address and protocol translation system between IPv4 and IPv6. At Samsung, she was a broadband convergence network designer especially focusing on requirements for QoS and IPv6. She is currently a research professor in the department of computer science and engineering at the Ewha Womans University, Korea. Her primary research interests include network architectures and protocols, topology restoration and fault tolerance in wireless sensor networks and network modeling and performance analysis for dynamic and sparse ad-hoc networks.
Fatih Senel received his BS and MS degrees in Computer Science from Bilkent University, Ankara, Turkey in 2005 and Southern Illinois University Carbondale, IL in 2008, respectively. In 2012, he received his PhD in Computer Science from University of Maryland Baltimore County. Currently, he is an assistant professor in the Department of Computer Engineering at Antalya International University, Turkey. His research interests include clustering, relocation and fault-tolerance in wireless sensor networks, and self-deployment of nodes in underwater acoustic sensor networks.