Elsevier

Computer Networks

Volume 58, 15 January 2014, Pages 254-283
Survey Paper
Topology management techniques for tolerating node failures in wireless sensor networks: A survey

https://doi.org/10.1016/j.comnet.2013.08.021

Abstract

In wireless sensor networks (WSNs), nodes often operate unattended and collaborate to perform their tasks. In many applications, the network is deployed in harsh environments, such as battlefields, where the nodes are susceptible to damage. In addition, nodes may fail due to energy depletion or breakdown of the onboard electronics. The failure of nodes may leave some areas uncovered and degrade the fidelity of the collected data. However, the most serious consequence is the partitioning of the network into disjoint segments. Losing network connectivity severely affects applications since it prevents data exchange and hinders coordination among some nodes. Therefore, restoring the overall network connectivity is crucial. Given the resource-constrained setup, the recovery should impose the least overhead and performance impact. This paper focuses on network topology management techniques for tolerating and handling node failures in WSNs. Existing techniques are classified into two broad categories, reactive and proactive. Based on these categories, a thorough analysis and comparison of the recent work is provided. Finally, the paper concludes by outlining open issues that warrant additional research.

Introduction

The growing interest in applications of wireless sensor networks (WSNs) has motivated considerable research in recent years [1], [2], [3], [4]. For some of these applications, such as space exploration, coastal and border protection, combat field reconnaissance, and search and rescue, it is envisioned that a set of mobile sensor nodes will be employed to collaboratively monitor an area of interest and track certain events or phenomena. Having these sensors operate unattended in harsh environments would avoid the risk to human life and decrease the cost of the application.

Since a sensor node is typically constrained in its energy, computation and communication resources, a large set of sensors is involved to ensure area coverage and increase the fidelity of the collected data. Upon their deployment, nodes are expected to stay reachable to each other and form a network. Network connectivity enables nodes to coordinate their actions while performing a task, and to forward their readings to in situ users or a base-station (BS) that serves as a gateway to remote command centers [5], [6]. In fact, in many setups, such as a disaster management application, nodes need to collaborate with each other in order to effectively search for survivors, assess damage and identify safe escape paths. To enable such interactions, nodes need to stay reachable to each other and route data to the BS. Therefore, both the inter-sensor connectivity and the sensor-BS connectivity have a significant impact on the effectiveness of WSNs and should be sustained at all times.
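The two connectivity requirements above can be illustrated with a minimal sketch. It assumes a unit-disk model, in which two nodes share a link whenever they lie within a fixed communication range; this model, the node names, the coordinates and the helpers `build_links` and `reachable_from` are illustrative assumptions, not taken from the surveyed work.

```python
from collections import deque
from math import dist


def build_links(positions, comm_range):
    """Link every pair of nodes within comm_range (unit-disk assumption)."""
    links = {node: set() for node in positions}
    nodes = list(positions)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if dist(positions[a], positions[b]) <= comm_range:
                links[a].add(b)
                links[b].add(a)
    return links


def reachable_from(links, start):
    """Breadth-first search: return the set of nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical deployment: one base-station and four sensors.
positions = {"BS": (0, 0), "s1": (8, 0), "s2": (16, 0), "s3": (16, 9), "s4": (40, 40)}
links = build_links(positions, comm_range=10)
connected_to_bs = reachable_from(links, "BS")
unreachable = set(positions) - connected_to_bs
```

In this hypothetical layout, `unreachable` holds the sensors that cannot deliver readings to the BS over any multi-hop path, which is the condition a connectivity-centric scheme tries to avoid.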

However, the sudden failure of a node can disrupt the network operation. A node may fail due to external damage inflicted by the inhospitable surroundings or simply because of a hardware malfunction. The loss of a node can break communication paths in the network and make some of its neighbors unreachable. Moreover, WSNs operating in a harsh environment may suffer large-scale damage that partitions the network into disjoint segments. For example, on a battlefield, parts of the deployment area may be hit by explosives; the sensor nodes in the vicinity would be destroyed and the surviving nodes split into disjoint partitions (segments). Restoring inter-segment connectivity is then crucial so that the WSN becomes operational again.
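The partitioning effect described above can be sketched as a connected-components computation over the surviving nodes. This is only an illustration: the adjacency map, the node names and the helper `segments_after_failure` are hypothetical, and the sketch detects partitions without attempting any recovery.

```python
def segments_after_failure(links, failed):
    """Return the disjoint segments (connected components) of surviving nodes."""
    survivors = set(links) - set(failed)
    unvisited = set(survivors)
    segments = []
    while unvisited:
        start = unvisited.pop()
        segment = {start}
        stack = [start]
        # Depth-first traversal restricted to surviving nodes.
        while stack:
            node = stack.pop()
            for neighbor in links[node]:
                if neighbor in survivors and neighbor not in segment:
                    segment.add(neighbor)
                    stack.append(neighbor)
        unvisited -= segment
        segments.append(segment)
    return segments


# Hypothetical topology: a-b-c form a chain, and c, d, e form a triangle.
links = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}

after_c = segments_after_failure(links, {"c"})  # c is a cut vertex
after_d = segments_after_failure(links, {"d"})  # d is not
```

Here the loss of `c` splits the survivors into two segments, while the loss of `d` leaves a single connected network; the same function also accepts a set of several simultaneously failed nodes.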

In this paper, we first highlight the challenges that node failures introduce to the operation of WSNs and provide a taxonomy of recovery techniques geared toward restoring network connectivity. We categorize the fault-tolerance techniques proposed in the literature, according to the recovery methodology they pursue, into proactive and reactive techniques. Further classification is done within each category based on the system assumptions, the required network state, and the metrics and objectives of the recovery process. Under each category, we discuss several algorithms and highlight their strengths and weaknesses. Finally, we enumerate open research issues that are yet to be investigated by the research community. To the best of our knowledge, this paper is the first to survey contemporary connectivity-centric fault-tolerance schemes for WSNs, and it sheds light on several practical issues for application designers. It will also be a good resource for newcomers to this research area.

Since providing fault-tolerance is in general a form of topology management (i.e., it often leads to changes in the network topology parameters), we start in Section 2 with an overview of contemporary techniques and objectives of topology management in WSNs. The rest of the paper is organized as follows. In Section 3, we describe our categorization of the existing approaches. The remaining sections follow this categorization. Section 4 discusses techniques for tolerating a single node failure or a sequence of independent and non-simultaneous failures affecting non-collocated nodes. Recovery from the simultaneous failure of multiple nodes is covered in Section 5. Section 6 enumerates open issues and outlines possible future research directions. Finally, Section 7 concludes the paper.

Section snippets

Topology management techniques in WSNs

Networks require monitoring and maintenance whether they are wired or wireless. The service that provides these tasks is called network management. Network management includes five functional areas as identified by the International Organization for Standardization (ISO): configuration management, fault management, security management, performance management and accounting management [1], [5], [7]. The unique requirements and constraints of wireless networks such as WSNs have inspired a new

Classification of fault tolerance techniques in WSNs

In this section, we classify fault-tolerance techniques in WSNs that are applied in response to the loss of sensor nodes. Depending on the nature of the failure, different approaches may be required. Therefore, before describing the classification of the fault-tolerance techniques, we first explain the different failure models.

Tolerating single and non-collocated failures

As pointed out in the previous section, published techniques for tolerating a node failure that causes network partitioning either provision fault-tolerance in the network topology both at setup and during normal operation, or pursue a reactive strategy by repositioning healthy nodes. In this section we discuss both tolerance strategies, in the context of a single node failure or a sequence of independent and non-simultaneous failures affecting non-collocated nodes. Recovery from simultaneous

Tolerating multi-node failures

Due to the harsh surroundings, more than one sensor node may fail simultaneously. In addition, the network may suffer large-scale damage that involves many nodes and would thus create multiple disjoint segments. Restoring connectivity in this case is more challenging than in the single node failure scenarios. In cases where the simultaneously-failed nodes are not spatially adjacent, the problem is tackled as a multiple version of single node failures with special handling of potential resource

Future research issues

While significant research has been done on topology management techniques to restore connectivity in partitioned WSNs, several directions need further exploration. The following are some open research problems that warrant additional investigation, grouped based on the recovery methodology.

Conclusion

In many applications, WSNs operate in inhospitable setups, e.g., battlefields, and the nodes become subject to an increased risk of damage. Furthermore, nodes are equipped with small batteries and their operation ceases upon depleting their onboard energy supply. The failure of nodes may not only impact coverage and data fidelity but can also cause the network to be divided into disjoint blocks of nodes. The latter can lead to major degradation of the WSN functionality since the failure

Acknowledgments

This work was supported by the National Science Foundation (NSF) awards # CNS 1018171 and CNS 1018404.

References (115)

  • F. Al-Turjman et al., Efficient deployment of wireless sensor networks targeting environment monitoring applications, Computer Communications (2013)
  • N. Tamboli et al., Coverage-aware connectivity restoration in mobile sensor networks, Elsevier Journal of Network and Computer Applications (2010)
  • M. Imran et al., Localized motion-based connectivity restoration algorithms for wireless sensor actor networks, Journal of Network and Computer Applications (2012)
  • S. Lee et al., Recovery from multiple simultaneous failures in wireless sensor networks using minimum Steiner tree, Journal of Parallel and Distributed Systems (2010)
  • K. Akkaya et al., Handling large-scale node failures in mobile sensor/robot networks, Journal of Network and Computer Applications (2013)
  • F. Senel et al., Relay node placement in structurally damaged wireless sensor networks via triangular Steiner tree approximation, Computer Communications (2011)
  • C-Y. Chong et al., Sensor networks: evolution, opportunities, and challenges, Proceedings of the IEEE (2003)
  • D. Estrin, L. Girod, G. Pottie, M. Srivastava, Instrumenting the world with wireless sensor networks, in: Proc. of the...
  • D. Ganesan, Networking issues in wireless sensor networks, Journal of Parallel and Distributed Computing (JPDC), Special Issue on Frontiers in Distributed Sensor Networks (2003)
  • H. Karl et al., Protocols and Architectures for Wireless Sensor Networks (2005)
  • K. Romer et al., The design space of wireless sensor networks, IEEE Wireless Communications (2004)
  • L. Bao, J.J. Garcia-Luna-Aceves, Topology management in ad hoc networks, in: Proc. of the 4th ACM International...
  • B. Deb et al., STREAM: sensor topology retrieval at multiple resolutions, Journal of Telecommunications, Special Issue on Wireless Sensor Networks (2004)
  • A. Cerpa, D. Estrin, ASCENT: adaptive self-configuring sensor networks topologies, in: Proc. of the 21st International...
  • P.B. Godfrey, D. Ratajczak, Naps: scalable, robust topology management in wireless ad hoc networks, in: Proc. of the...
  • C. Schurgers et al., Optimizing sensor networks in the energy-latency-density design space, IEEE Transactions on Mobile Computing (2002)
  • Y. Xu, J. Heidemann, D. Estrin, Geography-informed energy conservation for ad hoc routing, in: Proc. of the 7th Annual...
  • Y. Lai, H. Chen, Energy-efficient fault-tolerant mechanism for clustered wireless sensor networks, in: Proc. of the...
  • G. Gupta, M. Younis, Fault-tolerant clustering of wireless sensor networks, in: Proc. of the Wireless Communications...
  • T. Bagheri, DFMC: decentralized fault management mechanism for cluster based wireless sensor networks, in: Proc. the...
  • L.H.A. Correia et al., Transmission power control techniques for wireless sensor networks, Computer Networks (2007)
  • S. Lin, J. Zhang, G. Zhou, L. Gu, J.A. Stankovic, T. He, ATPC: adaptive transmission power control for wireless sensor...
  • J. Jeong, D. Culler, J.H. Oh, Empirical analysis of transmission power control algorithms for wireless sensor networks,...
  • J. Luo, J.-P. Hubaux, Joint mobility and routing for lifetime elongation in wireless sensor networks, in: Proc. of IEEE...
  • I. Chatzigiannakis, A. Kinalis, S. Nikoletseas, Sink mobility protocols for data collection in wireless sensor...
  • Z.M. Wang, S. Basagni, E. Melachrinoudis, C. Petrioli, Exploiting sink mobility for maximizing sensor networks...
  • W. Alsalih, S. Akl, H. Hassanein, Placement of multiple mobile base stations in wireless sensor networks, in: Proc. of...
  • W. Youssef, M. Younis, K. Akkaya, An intelligent safety-aware gateway relocation scheme for wireless sensor networks,...
  • W. Wang, V. Srinivasan, K.-C. Chua, Using mobile relays to prolong the lifetime of wireless sensor networks, in: Proc...
  • H. Almasaeid, A.E. Kamal, Modeling mobility-assisted data collection in wireless sensor networks, in: Proc. of the IEEE...
  • W. Zhao, M. Ammar, E. Zegura, A message ferrying approach for data delivery in sparse mobile ad hoc networks, in: Proc....
  • H. Almasaeid, A.E. Kamal, Data delivery in fragmented wireless sensor networks using mobile agents, in: Proc. the 10th...
  • B.W. Johnson, Design and Analysis of Fault-Tolerant Digital Systems (1989)
  • X. Han, X. Cao, E.L. Lloyd, C.-C. Shen, Fault-tolerant relay nodes placement in heterogeneous wireless sensor networks,...
  • N. Li, J.C. Hou, FLSS: a fault-tolerant topology control algorithm for wireless networks, in: Proc. 10th ACM Annual...
  • F. Al-Turjman, H. Hassanein, M. Ibnkahla, Optimized relay placement to federate wireless sensor networks in...
  • A. Ghosh, S. Boyd, Growing well-connected graphs, in: Proc. of IEEE Conf. on Decision and Control (CDC), San Diego, CA,...
  • F. Al-Turjman et al., Optimized relay placement for wireless sensor networks federation in environmental applications, Wireless Communication and Mobile Computing (2011)
  • F. Al-Turjman et al., Towards augmented connectivity with delay constraints in WSN federation, International Journal of Ad Hoc and Ubiquitous Computing (2012)
  • S. Vaidya, M. Younis, Efficient failure recovery in wireless sensor networks through active spare designation, in:...

    Mohamed Younis is currently an associate professor in the Department of Computer Science and Electrical Engineering at the University of Maryland Baltimore County (UMBC). He received his Ph.D. degree in computer science from New Jersey Institute of Technology, USA. Before joining UMBC, he was with the Advanced Systems Technology Group, an Aerospace Electronic Systems R&D organization of Honeywell International Inc. While at Honeywell he led multiple projects for building integrated fault tolerant avionics and dependable computing infrastructure. He also participated in the development of the Redundancy Management System, which is a key component of the Vehicle and Mission Computer for NASA’s X-33 space launch vehicle. His technical interests include network architectures and protocols, wireless sensor networks, embedded systems, fault tolerant computing, secure communication and distributed real-time systems. He has published over 180 technical papers in refereed conferences and journals. He has five granted and two pending patents. In addition, he serves/served on the editorial board of multiple journals and the organizing and technical program committees of numerous conferences. He is a senior member of the IEEE.

    Izzet F. Senturk received BS and MEng in Computer Science from Ege University, Turkey in 2006 and Cornell University in 2008 respectively. Currently he is a PhD candidate in the department of Computer Science at Southern Illinois University. His areas of interest are mobility and fault-tolerance in wireless sensor networks and energy-aware protocol design for mobile sensor networks. He is a member of IEEE.

    Kemal Akkaya is an associate professor in the Department of Computer Science at Southern Illinois University, Carbondale, IL. He received his BS and MS in Computer Engineering from Bilkent University, Turkey and Middle-East Technical University, Turkey in 1997 and 1999 respectively. After working as a software developer in Ankara, Turkey, he moved to the US in 2000 to pursue a PhD degree in Computer Science. He received his PhD in Computer Science from University of Maryland Baltimore County in 2005 and joined the Department of Computer Science at Southern Illinois University in 2005. He was a visiting professor at The George Washington University in Fall 2013. His current research interests include energy aware routing, topology control, security and quality of service issues in a variety of wireless networks such as sensor networks, sensor-actor networks, multimedia sensor networks, smart-grid communication networks and vehicular networks. He is a member of IEEE. He is the area editor of Elsevier Ad Hoc Network Journal. He has served as the guest editor for Journal of High Speed Networks, Computer Communications Journal, Elsevier Ad Hoc Networks Journal and in the TPC of many leading wireless networking conferences including IEEE ICC, Globecom, LCN and WCNC. He has published over 80 papers in peer-reviewed journals and conferences. He received the “Top Cited” article award from Elsevier in 2010.

    Sookyoung Lee received the M.S. and Ph.D. degrees in Computer Science from the Ewha Womans University, Korea, and University of Maryland, Baltimore County, USA in 1997 and 2010 respectively. She was with LG ELECTRONICS Inc., Electronics and Telecommunications Research Institute, Korea Electrics Technology Institute, and Samsung Electronics Co. LTD., Korea from 1998 to 2004. While at LG, she developed the IP data server over ATM switch and implemented virtual private network service for multi-protocol label switching system. She was a volunteer of IPv6 forum Korea while at ETRI and developed the network address and protocol translation system between IPv4 and IPv6. At Samsung, she was a broadband convergence network designer, focusing especially on requirements for QoS and IPv6. She is currently a research professor in the department of computer science and engineering at the Ewha Womans University, Korea. Her primary research interests include network architectures and protocols, topology restoration and fault tolerance in wireless sensor networks, and network modeling and performance analysis for dynamic and sparse ad-hoc networks.

    Fatih Senel received his BS and MS degrees in Computer Science from Bilkent University, Ankara, Turkey in 2005 and Southern Illinois University Carbondale, IL in 2008, respectively. In 2012, he received his PhD in Computer Science from University of Maryland Baltimore County. Currently, he is an assistant professor in the Department of Computer Engineering at Antalya International University, Turkey. His research interests include clustering, relocation and fault-tolerance in wireless sensor networks, and self-deployment of nodes in underwater acoustic sensor networks.
