Next Article in Journal
Robust Dehaze Algorithm for Degraded Image of CMOS Image Sensors
Next Article in Special Issue
Alternative Opportunistic Alert Diffusion to Support Infrastructure Failure during Disasters
Previous Article in Journal
Mobile-Based Analysis of Malaria-Infected Thin Blood Smears: Automated Species and Life Cycle Stage Determination
Previous Article in Special Issue
Fuzzy Modelling for Human Dynamics Based on Online Social Networks
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Design and Analysis of an Efficient Energy Algorithm in Wireless Social Sensor Networks

1
School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
2
Department of Mathematics and Computer Science, Northeastern State University, Tahlequah, OK 74464, USA
3
Department of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310037, China
4
Department of Computer Science, Electrical and Space Engineering, Lulea University of Technology, 97187 Lulea, Sweden
5
College of Computer and Information Sciences, Almuzahmiyah, King Saud University, Riyadh 11451, Saudi Arabia
*
Author to whom correspondence should be addressed.
Sensors 2017, 17(10), 2166; https://doi.org/10.3390/s17102166
Submission received: 18 July 2017 / Revised: 31 August 2017 / Accepted: 2 September 2017 / Published: 21 September 2017

Abstract

:
Because mobile ad hoc networks have characteristics such as lack of center nodes, multi-hop routing and changeable topology, the existing checkpoint technologies for normal mobile networks cannot be applied well to mobile ad hoc networks. Considering the multi-frequency hierarchy structure of ad hoc networks, this paper proposes a hybrid checkpointing strategy which combines the techniques of synchronous checkpointing with asynchronous checkpointing, namely the checkpoints of mobile terminals in the same cluster remain synchronous, and the checkpoints in different clusters remain asynchronous. This strategy could not only avoid cascading rollback among the processes in the same cluster, but also avoid too many message transmissions among the processes in different clusters. What is more, it can reduce the communication delay. In order to assure the consistency of the global states, this paper discusses the correctness criteria of hybrid checkpointing, which includes the criteria of checkpoint taking, rollback recovery and indelibility. Based on the designed Intra-Cluster Checkpoint Dependence Graph and Inter-Cluster Checkpoint Dependence Graph, the elimination rules for different kinds of checkpoints are discussed, and the algorithms for the same cluster checkpoints, different cluster checkpoints, and rollback recovery are also given. Experimental results demonstrate the proposed hybrid checkpointing strategy is a preferable trade-off method, which not only synthetically takes all kinds of resource constraints of Ad hoc networks into account, but also outperforms the existing schemes in terms of the dependence to cluster heads, the recovery time compared to the pure synchronous, and the pure asynchronous checkpoint advantage.

1. Introduction

1.1. Research Motive

As a new kind of network technology in the wireless communication field, the mobile ad hoc network, which doesn’t depend on fixed infrastructures, has been widely applied to different kinds of situations such as offices, industries and military, etc., bcause it can not only construct a multi-hops network with the existing network, but also support wireless transmission of data, voices and images under bad environmental conditions by temporarily constructing a smart network.
However, mobile ad hoc networks have some limitations such as lack of support of fixed networks, limited resources and changeable topological structure. Compared to the normal mobile network, its stability is worse and the failure probability is higher. Thus, it is necessary to research fault tolerance and reliability technologies for the network.
In distributed systems, in order to ensure global consistency, when failures occur, the related processes firstly rollback the executed operations and then redo them. Checkpoint technologies are proposed for avoiding rollback to the initial states of the processes [1,2]. Namely, the processes are required to store their current states at a regular or unregularly period, so they just need to roll back to a specific checkpoint instead of the initial state when a failure occurs, to greatly reduce the recovery time [3,4].
In essence, the mobile ad hoc network is a kind of distributed network, hence the checkpoint technologies for it not only need to consider the features of distributed systems and regular mobile networks such as mobility of location and frequent disconnection, but also need take its own unique characteristics into account [5]:
Multi-hops routing (MH). Without the support of a fixed network, the communication among MHs in different clusters needs to be transmitted by multiple intermediate nodes. This will increase the number of wireless communication messages and communication delays for the checkpointing procedure.
Lack of support for fixed networks. In regular mobile networks, the checkpoint information of MHs is mostly stored in the Mobile Support Stations (MSSs) dispatching in the fixed network because of the unreliability of storage in MHs. However, there are no longer fixed hosts like MSSs in mobile ad hoc networks, hence, the existing checkpointing methods in the mobile computing environment which are dependent on MSSs are not suitable for the ad hoc network any more.
Changeable topological structure. In ad hoc wireless networks, the MHs usually leave one cluster and enter another cluster, which will easily cause a change of the network topological structure. The checkpoint technologies should be able to handle these location changes to ensure the correctness of recovery when handoff occurs [6,7,8].
Therefore, the existing checkpoint technologies for fixed networks and regular mobile networks cannot be applied to mobile ad hoc networks effectively. It is necessary to design new checkpoint technologies for coping with the upper challenges to improve the fault tolerance and reliability of mobile ad hoc networks.

1.2. Related Works

Checkpoint technologies were first proposed for improving the fault tolerance ability of distributed systems. According to the relations between checkpoints, the checkpoint technologies can be divided to synchronous checkpointing [3,9,10,11,12] and asynchronous checkpointing [13,14].
Synchronous checkpointing, also called coordinated checkpointing, is the method where all processes stay coordinated and store their states in some stable storage. This kind of checkpoints requires that the processes taking part in checkpointing do the checkpoints synchronously. Its advantages are a lack of cascade rollback and low recovery overhead, whereas it will require a lot of coordination messages and its checkpoint time is longer. On the basis of the interruption of the basic operations of processes, the synchronous checkpointing strategies could be further classified into two kinds: unblocking [3,9] and blocking checkpointing [10,11,12]. According to the triggering occasion of checkpoints, they can be divided to checkpointing based on time [3] and checkpointing induced by communication [15,16,17,18].
Asynchronous checkpointing, or independent checkpointing, is the method where every process takes its own checkpoint independently. This could avoid sending the coordination messages for checkpointing, but it has the disadvantages such as cascade rollback, higher storage space and longer recovery time.
Mobile computing environments are typical wireless network applications. They have features such as mobility, limited resources and high failure probability, so the traditional distributed checkpointing algorithms can’t be applied to this environment very well. Neves et al. [19] put forward an adaptive algorithm which considers the quality of a wireless network. This algorithm proposed the concept of soft checkpoints and hard checkpoints. Soft checkpoints are stored in the MHs, while hard checkpoints are stored in the stable storage of fixed network hosts (such as MSSs). After taking the number of maxSoft soft checkpoints every time, the process takes one hard checkpoint. The advantage of this method is that the value of maxSoft changes adaptively with the server quality of the network, but the MHs will store too many soft checkpoints and thus occupy a lot of storage space when the server quality of network in this method is worse. Some papers [20,21,22] adopted mobile agents to partake in the tasks of MSSs, but the existence of agents will cause increments in the message retransmissions and communication delays. Taking the mobility of wireless terminals into consideration, George et al. [18,23] proposed a checkpointing algorithm based on movement and logs. Namely, MHs take checkpoints independently when they pass a certain number of handoffs. However, this method has longer recovery time when the handoff rate of MHs is low. Cao et al. [18,24] came up with the mutable checkpoint which is stored in MHs. The shortcoming of this strategy is that the number of mutable checkpoints induced by messages is large and most of these mutable checkpoints may become useless checkpoints.
Recently, the related researches for ad hoc network are mainly focused on aspects such as the technologies of channel switching, network architecture, routing protocols, strategies for assuring Quality of Service (QoS), multicast protocols, energy conservation and security [22,25,26,27,28,29,30,31], but studies about the fault tolerance and reliability of ad hoc networks are relatively few. Tuli et al. [32] proposed an asynchronous checkpointing strategy which was applied to an ad hoc network, namely MHs could take checkpoints independently. In order to reduce the occupation of storage space in MHs, the cluster heads store the message logs for MHs. The advantage of this strategy is that MHs could take checkpoints independently and so the communication overhead of the system without failure is small, but the storage overhead of cluster heads is large. When some process fails, the wireless communication overhead to determine the dependence relation of rollback will also be large. Meng et al. [33] put forward a synchronous algorithm for ad hoc networks based on clusters. The basic notion of this algorithm is that the initialization of the checkpointing procedure and the sending of checkpoint requests to MHs in clusters are charged by the cluster heads. If MHs achieve the channels assignment in the first beacon interval, they will take checkpoints in the second beacon interval. Otherwise, they take checkpoints in the first beacon interval. Though this algorithm makes very good use of the channels assignment mechanism of mobile ad hoc networks, the taking of checkpoints is too frequent and the storage overhead is large. Parveen et al. [7] also presented a synchronous checkpointing strategy for ad hoc networks. This strategy assumes that the executions of processes are deterministic, namely the same results will be obtained when the same message sequences are executed twice from the same starting point. Saluja and Kumar [34] proposed a synchronous non-blocking checkpointing strategy, which was suitable for ad hoc networks. In this strategy, the cluster heads substitute for processes in the same cluster to execute the checkpointing duties. When the cluster heads receive the messages, which are sent to the processes in the same cluster, they firstly obtain the dependence relation carried by messages and then send the messages to processes alone.
Though the above checkpointing algorithms for mobile ad hoc networks take the clustering structure of the ad hoc network into account, the burden of cluster heads (cluster heads are MHs too in fact) is too heavy and they consider less the effects caused by the features of the ad hoc network such as multi-hop routing and dynamic topological structure, hence they cannot be applied to ad hoc networks very well.

1.3. Work and Organization of This Paper

This paper intends to design a hybrid checkpointing strategy based on a multi-frequency hierarchy ad hoc network. In this strategy, the MHs in the same cluster use the synchronous checkpointing strategy, while the MHs in different clusters make use of the asynchronous checkpointing strategy. The main contributions of this paper include the following:
  • It proposes the hybrid checkpointing model combined the coordinated and uncoordinated checkpointing after considering the communication features and kinds of resource restraints of mobile ad hoc networks;
  • It discusses the checkpoint correctness criteria for a hybrid checkpointing model including the checkpoint taking criteria, rollback criteria and indelibility criteria.
  • It discusses the triggering occasions of checkpoints for processes in the same cluster and different clusters according to the characteristics and requirements of different types of checkpoints.
  • It designs the intra-cluster and inter-cluster checkpoint dependence graph according to the dependence relations among checkpoints. Then it discusses the elimination rules of checkpoints in the same cluster and different clusters based on these graphs. What is more, it provides the algorithms for checkpoints in the same cluster and different clusters and rollback recovery algorithm. At the same time, it proves the correctness of the algorithms by theory.
  • By testing the performances of the proposed strategy, this paper verifies that the hybrid checkpointing strategy is a good trade off method and it has advantages such as less of dependence on cluster heads, low overhead and shorter recovery times.
The rest of the paper is organized as follows: Section 2 introduces the ad hoc network topological structure and the model of hybrid checkpointing. The correctness criteria for hybrid checkpointing model and the triggering occasions of kinds of checkpoints are discussed in Section 3 and Section 4. In Section 5 and Section 6, we propose the elimination strategy of checkpoints and related algorithms. Then we discuss the processing method for handoffs in Section 7. In Section 8, we present the testing of performance and analysis. The last part gives the conclusions of this paper and proposes future work.

2. Ad Hoc Network Structure and Hybrid Checkpointing Model

2.1. Mobile Ad Hoc Network Topological Structure

The mobile ad hoc network is a kind of wireless self-organizing network which has peer to peer movement. Its topological structures are mainly classified to two kinds: flat structures and hierarchical structures [35,36,37,38]. In a flat structure, all network nodes are equal, but the entire network in the hierarchical structure is constituted by subnets which are divided into clusters. Each cluster includes one cluster head and many cluster members. The cluster heads form a higher-level network. Because the scale of the hierarchical structure is not limited and the expandability of this structure is good, this structure has become the favorite development trend of mobile ad hoc networks, therefore, this paper will discuss the checkpointing algorithm under the multi-frequency hierarchical structure.
Figure 1 is the multi-frequency hierarchical topological structure of an ad hoc network. This structure consists of multiple clusters. Each cluster includes one cluster head (CH) and a limited number of cluster members [39,40,41]. The CHs and cluster members are all regular MHs. The hosts in one cluster share one frequency to communicate, while the CHs communicate on another frequency. The members in she ame cluster can communicate with each other directly, while the members in different clusters have to communicate by the retransmission of CHs.

2.2. Hybrid Checkpointing Model

As we know, the different kinds of resources on MHs are quite limited in mobile ad hoc networks. These resources include the storage space, processing capacity, battery energy, wireless bandwidth and so on. If we fully adopt the asynchronous checkpointing algorithm, it will need to transmit a lot of checkpoint coordination messages. Because the communication among MHs in different clusters usually needs multi-hop routing before reaching the destination, the number of checkpoint coordination messages is proportional to the number of hops and the retransmission will occupy precious wireless bandwidth. Though it will avoid a lot of checkpoint coordination messages if we completely use the asynchronous checkpointing algorithm, the cascade rollback caused by asynceshronous checkpointing would waste resources such as precious CPU allocation and power sourcs of MHs and the storage space for storing checkpoint information is large.
Therefore, this paper plans to adopt the hybrid checkpointing model which combines the synchronization and asynchronization after considering the kinds of resource limitations in MHs synthetically. In this model, the hosts in the same cluster use the synchronous checkpointing (because these hosts could communicate with each other directly without the retransmissions involving cluster heads), while the hosts in different clusters adopt the asynchronous checkpointing. This will not only avoid the waste of all kinds of resources for hosts in same cluster because of cascade rollback, but also avoid too many transmissions of messages among hosts in different clusters. What is more, it reduces the communication delay of the network and checkpoint operation delay.
Definition 1.
A same cluster checkpoint (SC) is a checkpoint which is used for ensuring the consistency of processes in hosts of the same cluster.
The SC is a synchronous checkpoint. Every time the process use a SC, it needs to remain synchronous with the related processes in the same cluster. During the synchronization procedure, if part of processes cannot finish their checkpoints because of their own reasons, then the coordinated checkpoints become useless checkpoints which need to be eliminated. Hence, in order to reduce the occupation of precious storage space in wireless terminals, the SCs can be further divided into two kinds: uncertain checkpoints and certain checkpoints.
Definition 2.
An uncertain checkpoint is a SC which is stored in the cache of a MH when the coordination is not finished.
Definition 3.
A certain checkpoint is a SC which is stored on the local disk of a MH when the coordination is finished.
Therefore, if and only if the coordination of checkpoints succeeds, the uncertain checkpoints could be turned to certain checkpoints. Otherwise, the uncertain checkpoints become useless checkpoints which could be eliminated. The elimination of uncertain checkpoints involves releasing the information stored in the cache.
Definition 4.
A different cluster checkpoint (DC) is a checkpoint which is used for assuring the consistency of processes in MHs of different clusters.
The DC is an asynchronous checkpoint. When a process receives messages from MHs of different clusters, it finishes the checkpoint operation independently without informing other MHs. The DC is triggered by the reception of messages from MHs of different clusters and its checkpoint information is directly stored in the local disk of the MH, so the DC is a certain checkpoint in nature.
Suppose Pi,j,k expresses a process, i is the id of cluster which the mobile terminal stays, j is the id of mobile terminal and k expresses the id of process. Figure 2 is an example of hybrid checkpoint model. Pi,k,s, Pi,a,v and Pi,j,t are the processes executing in different terminals of cluster Ci respectively. While Pm,s,c is the process executing in terminal s of cluster Cm. S C i , a , v b expresses the SC of Pi,a,v and b is the sequence number of checkpoint. D C i , k , s m is the DC of Pi,k,s and m is the id of cluster which the sender of message from different cluster stays.
Combined with the Figure 2, the explanation for principles of hybrid checkpointing model can be expressed as follows:
  • The initialized checkpoint. When each process initializes, it takes an initialization checkpoint (certain checkpoint). For example, S C i , a , v b , S C i , j , t y , S C i , k , s d and S C m , s , c d in Figure 2 are initialized checkpoints.
  • The taking of SCs. When a process P receives a fixed number of messages from MHs in same cluster, it initializes a coordination procedure of checkpoints. P firstly takes an uncertain checkpoint, and then it sends checkpoint requests to the processes which have dependence relation with P. When these processes receive the requests, they firstly take uncertain checkpoints and then reply to P. By the time P receives all replies from processes which take part in the coordination procedure, it turns the uncertain checkpoint into a certain checkpoint. Otherwise, P eliminates its own uncertain checkpoint. Finally, P notices related processes to finish the turning or elimination operations of checkpoints together.
  • The taking of DCs. When a process receives a message from a MH in a different cluster, the DC is triggered. In order to assure that the receiving event of this message can be cancelled when the sending event of this message is cancelled, the process has to take the DC firstly and then receives the message.
In Figure 2, Pi,j,t is the process which initializes the checkpointing. Pi,a,v and Pi,k,s are the participant processes of checkpointing because of messages M2 and M3. When the checkpointing starts, Pi,j,t firstly takes a uncertain checkpoint S C i , j , t y + 1 and then notices Pi,a,v and Pi,k,s to take uncertain checkpoints S C i , a , v b + 1 and S C i , k , s d + 1 respectively. By the time Pi,a,v and Pi,k,s finish each uncertain checkpoint S C i , a , v b + 1 and S C i , k , s d + 1 , they send replies to Pi,j,t. After Pi,j,t has received the replies from Pi,a,v and Pi,k,s (not marked in Figure 2), Pi,j,t turns S C i , j , t y + 1 to certain checkpoint. In addition, Pi,j,t notices Pi,a,v and Pi,k,s to turn S C i , a , v b + 1 and S C i , k , s d + 1 into certain checkpoints, but their Checkpoint Sequence Number (CSN) does not change. Otherwise, Pi,j,t eliminates its own uncertain checkpoint and notices Pi,a,v and Pi,k,s. For example, when process Pi,k,s in cluster Ci receives a message M2 from process Pm,s,c in cluster Cm as Figure 2 shows, Pi,k,s firstly takes D C i , k , s m and then receives M2.

2.3. Problem of Hybrid Checkpointing Model to Be Solved

The hybrid checkpoint model takes both of the communication features of hosts in same cluster and different clusters into consideration, and thus reduces the number of wireless communication messages greatly and saves the kinds of resources of MHs. What is more, this model has less dependence on cluster heads (it just records the dependence among checkpoints of processes in different clusters, and it will be discussed in Part 5). Therefore, this model has better flexibility.
However, in a hybrid checkpointing strategy, the checkpoints of a process cannot be handled according to the fully synchronous or asynchronous checkpointing algorithm because of the existing of SCs and DCs. The main problems for checkpoints processing are as follows:
  • What are the correctness criteria? In the hybrid checkpoint model, because there are mutual dependence relations among checkpoints in the same cluster and different clusters, so it is necessary to research a brand new global consistency criterion for checkpoints to assure the global consistency of checkpoints.
  • When is the checkpoint triggered? In order to reduce the checkpoint operations and storage overhead, different triggering occasions should be taken for different kinds of checkpoints according to the ad hoc network features and hybrid checkpointing model.
  • When is the checkpoint eliminated? The existing of DCs makes the elimination of SCs be complicated, and so creates a lot of checkpoints. In order to save space, it needs to clear away the useless SCs in time and coordinate the useless DCs actively.
  • What is the strategy for processing handoff? The checkpointing strategy for the hybrid checkpoint model is decided by the same cluster and different clusters relations of MHs. However, the same cluster and different clusters relations of MHs will change when MHs take handoff. Hence, the checkpointing strategy should dispose the change of MHs correctly and automatically maintain the new cluster relation to guarantee the correct checkpoint operations of processes in handoff MHs.

3. Correctness Criteria for a Hybrid Checkpointing Model

In the asynchronous checkpointing strategy, the condition that checkpoints ensure the global correctness is no orphan messages. An orphan message is a message whose receiving event is stored but its sending event is not stored. Therefore, in order to assure the messages not become orphan messages, the sender process of the message must take a checkpoint also when the receiver process of a message takes a checkpoint, but in the asynchronous checkpointing strategy, the orphan messages are allowed to exist. Hence, the cascade rollback phenomenon exists. However, the checkpoints in the same cluster and different clusters adopt different strategies. Therefore, it could not finish the checkpoint operation according to the checkpoints correctness criteria of coordinated or uncoordinated strategy purely.
Let the same cluster message (SM) be the message transmitted from process SPi to process SPj of MH in same cluster. SCi and SCj are respectively the checkpoints taken by SPi and SPj after sending and receiving SM. The different cluster message (DM) is the message transmitted between from process DPi to process DPj of MH in another cluster. CKPi and DCj are respectively the checkpoints of DPi and DPj related to DM. The Receive(M) is the operation of receiving message M and the Send (M) expresses the operation of sending message M. A >> B expresses that the appearance of operation A is prior to operation B. CKP means one checkpoint (SC or DC).

3.1. Checkpoint Taking Criteria

For assuring the SM is not an orphan message, it requires that the sending process of SM must take a checkpoint when the receiving process of SM takes a checkpoint:
Criterion 1:
SM, if Receive(SM) >> SCi meets, then Send(SM) >> SCj must meet it also.
The DC does not require assuring no orphan message, but it requires that the receiving event of DM could be cancelled when the sending event of DM is cancelled. Hence, it demands that the receiving process of DM firstly takes a DC and then receives the message.
Criterion 2:
DM, if CKPi >> Send(DM) meets, then DCj >> Receive(DM) must meet it also.

3.2. Checkpoint Rollback Criteria

To guarantee that all processes could roll back to the global consistent checkpoint states when some processes fail, it requires that the receiving event of message could be cancelled when the sending event of message is cancelled. This requirement is suitable to both SCs and DCs:
Criterion 3:
SM, if ∃SCj−1(SCj−1 >> Receive(SM)>>SCj) ∧ ∃SCi−1(SCi−1 >> Send(SM) >> SCi), then SPj must be able to roll back to SCj−1 when SPi rolls back to SCi−1.
Criterion 4:
DM, if DPi rolls back to CKPi, then DPj must be able to roll back to DCj.

3.3. Checkpoint Indelibility Criteria

In synchronous checkpointing strategy, the processes which joins the checkpointing can clear away the prior checkpoint when the checkpointing is finished. While the processes in asynchronous checkpointing strategy will eliminate checkpoints together after finding the global consistent checkpoint and finishing rollback recovery when some process fails, however, the checkpoints of one process not only include the SCs related to MHs in same cluster, but also contain the DCs related to MHs in different clusters in a hybrid checkpoint environment. When a process finishes the SC, the other SCs earlier than this SC cannot be eliminated easily, otherwise, global inconsistency may appear. The elimination of checkpoints is judged by the relations among checkpoints:
Definition 5.
If CKPi >> Send(M) and CKPj >> Receive(M), then CKPj directly depends on CKPi between processes. It expresses as CKPj CKPi.
Definition 6.
Let CKPi and CKPi+1 to be the contiguous checkpoints of Pi, and then CKPi+1 directly depends on CKPi in same process. It expresses as CKPi+1 CKPi.
The upper two dependence relations are all named by checkpoint direct dependence relation. They will not be distinguished in the later parts:
Definition 7.
If CKPi CKPj and CKPj CKPk, then CKPi depends on CKPk indirectly. It expresses as CKPi CKPk.
In Figure 3, there are relations such as S C i , a , v b D C i , k , s m and S C i , k , s d D C i , k , s m . If Pi,a,v and Pi,k,s clear away S C i , a , v b or S C i , k , s d at once after finished S C i , a , v b + 1 or S C i , k , s d + 1 , then Pi,a,v or Pi,k,s will not be able to roll back to global consistent states when Pm,s,c rolls back to S C m , s , c t (The correct rollback sequence is ①→②→③→④).
Let SCi and SCk be the SCs of process P. DCx is the DC of P or the processes in the cluster same as P. Then the indelibility criteria for checkpoint can be described as follows:
Criterion 5:
If ∃DCx (SCj DCx or SCj DCx) ∧ ∃SCk (k>i), then SCj cannot be eliminated.
Criterion 5 indicates that if one SC which is not newest depends on any DC directly or indirectly then it cannot be eliminated. Because the DC is an asynchronous checkpoint, it may cause a cascading rollback. As Figure 4 shows, the rollback of processes Pm,s,c and Pm,j,t will all cause Pi,k,s rolling back to D C i , k , s m (rollback ① and ② all cause the rollback ③). If D C i , k , s m is eliminated earlier than S C m , s , c t and S C m , j , t y , then Pi,k,s will not be able to roll back to the consistent states with processes in Cm when Pm,s,c rolls back to S C m , s , c t or Pm,j,t rolls back to S C m , j , t y .
Therefore, the indelibility criteria of DCa can be defined as follows:
Criterion 6:
If CKPi is not eliminated, then DCj must not be cleaned up.
Criterion 6 shows that if the checkpoints of MHs in different clusters which are dependent on one DC are not cleaned up, then this DC cannot be eliminated also. Otherwise, it will cause something wrong about rollback.
Criterion 7:
If CKPi has been eliminated, then DCj cannot be cleared away when one of the following conditions is satisfied.
DCj is the only checkpoint of DPj. DCj does not depend on the checkpoints of DPj, but it depends on the checkpoints of processes in the same cluster of DPj.
Th Criterion 7 means that if one DC is the only checkpoint of process or the DC is the earliest checkpoint and the SCs on which the DC depends are not eliminated, then the DC cannot be cleared away. The former requires that the new state of process be stored in an update checkpoint, while the later demands that the elimination of a DC will not cause wrong rollback. Hence, there will be one conclusion: if the hybrid checkpointing strategy satisfies the Criteria 1~7, then this strategy could assure the global consistency of checkpoints.

4. Checkpoint Triggering Occasions

In the distributed system, the checkpoint always adopts the way to be triggered at a fixed time or by a fixed number of messages. For the ad hoc network, the time interval for triggering checkpoints periodically is hard to decide because of the dynamism and uncertainty of this network on the one hand, and on the other hand, a timing trigger may usually wake up some MHs staying dormant state and then cause unnecessary battery consumption. Therefore, the SCs and DCs in the hybrid checkpointing model all adopt the active trigger mechanism based on messages. Besides, the SCs use the reactive trigger mechanism to reduce the number of SCs, which cannot be eliminated. There are two different SC triggers: Active Trigger based on Message and Reactive Trigger:
  • Active Trigger based on Message: The active trigger means that a process initializes a checkpointing procedure actively when it receives K (K is the threshold, K ≥ 1) messages from MHs in same cluster. The value of K is directly proportional to the continued communication time without failure and storage space of the MH, but inversely proportion to the disconnection and failure probability of the MH. What is more, the choice of K should consider the kinds of overhead of the ad hoc network. This part will be further discussed and verified in the experiments.
  • Reactive Trigger: It can be seen from Criterion 5, that if one SC depends on any DC directly or indirectly then it cannot be cleared away, so there will be a lot of SCs which could not be eliminated.
As Figure 5 shows, all SCs of Pi,k,l and Pi,j,e depends on D C i , j , ε m directly or indirectly, so they could not be cleared away before D C i , j , ε m is eliminated. If the checkpoint frequency of cluster Ci is higher than the one of Cm, then there may be a lot of SCs of Ci which cannot be cleaned up.
The key to solve this problem is to clean up the DCs on which the SCs depend. According to the Criteria 6 and 7, the premises to eliminate DC include two conditions. One is that the process has finished new checkpoint (namely this DC is not the new checkpoint of the process). The other is that the checkpoints of processes in other clusters on which the process depends have been eliminated.
Let Ci and Cj to be two different clusters. If the processes in Ci which contains all DCs (created by the receiving of messages from MHs in Cj) related to Cj have finished the certain checkpoints (namely meet the first premise), then Ci notices Cj to initialize a new checkpointing (only the new checkpoint is done, the former SCs can be cleaned up and this meets the last premise). If some checkpoint dependent on the DCs in Ci can be cleared away after the related processes in Cj has finished new SC, then the elimination probability of the DCs increases heavily. Then the SCs in Ci which depend on the DCs could be cleaned up further. The detailed elimination rules of checkpoints will be discussed in Section 5.
Different from the active trigger, this checkpointing procedure is initialized by the CH and the related processes in cluster just act as the participants, so it is called a reactive trigger.
The DC uses the message trigger mechanism also, namely the DC is triggered when it receives messages from MHs in different clusters. However, if a process takes a DC when it receives a message from the same different cluster each time, it will increase the checkpoints and so augment the storage overhead heavily.
As Figure 6 shows, Pi,j,e in cluster Ci receives messages M1 and M2 from processes Pm,s,c and Pm,a,v in cluster Cm respectively. If Pi,j,e takes DC1 and DC2 respectively, then Pi,j,e will roll back according to the following situations when processes in Cm fail.
M0 exists. If Pm,a,v rolls back to C m , s , c b because of failure, Pm,s,c will roll back to S C m , s , c b and Pi,j,e should roll back to DC1; if Pm,s,c rolls back to S C m , s , c b because of failure, Pi,j,e will roll back to DC1 also.
M0 does not exist. If Pm,a,v rolls back to C m , a , v b because of failure, Pi,j,e will roll back to DC2; if Pm,s,c rolls back to S C m , s , c b because of failure, Pi,j,e will roll back to DC1.
It can be seen, Pi,j,e will roll back to DC1 in most instances. Hence, for reducing the number of DCs, a process just takes one DC for each different cluster during one DC interval.

5. Strategies for Checkpoint Elimination

The storage space of MHs in an ad hoc network is quite limited, so it is necessary to clean up the overdue useless checkpoints in time. This section will firstly introduce the checkpoint dependence graph, and then design the relevant rules and strategies for checkpoint elimination. The Figure 7 below shows the IntraCDG of processes Pi,a,v and Pi,k,s (see in Figure 3). Because there are no dependence relations in proesses Pm,s,c and Pi,j do,t, they not have the dependence relations at present and so the IntraCDGm,s,c and IntraCDGi,j,t are empty.

5.1. Checkpoint Dependence Graph

5.1.1. Intra-Cluster Checkpoint Dependence Graph

In order to describe the dependence relations among checkpoints of processes in same cluster clearly, every process stores an Intra-cluster Checkpoint Dependence Graph (IntraCDG). The nodes of IntraCDG are certain checkpoints or DCs and the edges of SDG are directed edges. The edge is started by one checkpoint and ended by another checkpoint on which the checkpoint depends. According to Definitions 5 and 6, the update situations of IntraCDG contain the following two aspects:
  • The update of dependence relations among processes. When a process Pi receives a SM, it firstly obtains the recent checkpoint of the sender process and then checks whether the edge started by this recent checkpoint and ended by current checkpoint of Pi exists in IntraCDGi. If the edge does not exist, then it will be added to IntraCDGi.
  • The update of dependence relations in same process. When Pi takes a new certain checkpoint or DC, it adds a directed edge started by the current checkpoint (a certain checkpoint or DC) and ended by the new checkpoint to IntraCDG.

5.1.2. Inter-Cluster Checkpoint Dependence Graph

In order to describe the dependence relations among checkpoints of processes in different clusters, the source cluster head and target cluster head store Inter-Cluster Checkpoint Dependence Graph (InterCDG) for sending messages and receiving messages. They are marked by S_InterCDG and R_InterCDG, respectively. The nodes of InterCDG are certain checkpoints or DCs and the edges of it are directed edges. The edge is started by one checkpoint and ended by another checkpoint on which the checkpoint depends.
In Figure 4, D C i , k , s m depends on S C m , s , c t . This dependence relation can be recorded in the InterCDGs related to M1. The InterCDGs are respectively stored in CHi and CHm. The figure is comparative simple, and so it is not given.

5.2. Rules and Strategies of Checkpiont Elimination

The checkpoint elimination includes two parts: the elimination of SC and elimination of DC.

5.2.1. Elimination of SC

As Criterion 5 shows, one SC can be cleared away if it is not the newest checkpoint of process and it does not depend on any DC.
Let IntraCDGglobal to be the global checkpoint dependence graph of all processes in one cluster. SCi and SCk are the SCs of Pi. Set-Path(SCi) is the set of paths which are ended by SCi in IntraCDGglobal. DCx is any DC. The elimination rule of SC can be described in formalized way as follows:
Let Pi to be the process which cleans up the DCx, then the algorithm for elimination of DC is indicated as Algorithm 1.
Algorithm 1 DC_Elimination( )
if (∃Edge(CKPi, DCx) ∈ IntraCDGi) then
 DELETE(DCx);
 REPLACE_IntraCDGnodes(CKPi, DCx); // CKPi replaces DCx
 NOTICE_IntraCDGchange(CKPi, DCx);
 //Notice processes the replacement of CKPi for DCx
else
if (∃Edge(DCx, CKPi) ∈ IntraCDGi) then
if (∃Edge(CKPj, DCx) ∈ IntraCDGiCKPjPi) then
 TurnDCtoSC(DCx, SCk); // Turn DCx to SCk
 NOTICE_CKPturn(DCx, SCk);
 //Notice processes in same cluster of turning of DCx
else
 DELETE(DCx);
 NOTICE_delete(DCx);
 //Notice processes in same cluster of the elimination of DCx
 //Notice processes in same cluster of the elimination of DCx
Rule 1:
If ∄DCx ∈ Set-Path(SCi) ∧ ∃SCk (k > i), then SCi can be cleaned up.
The elimination occasions of SC include two occasions:
  • When one process finishes the new SC, the initialization process of checkpointing (active trigger) or the CH (reactive trigger) will collect the IntraCDGglobal and check whether the earlier checkpoints can be cleaned up.
  • After any DC is cleared away by one process, the process will notice the CH to collect the IntraCDGglobal. Then the CH checks whether the earlier checkpoints can be cleaned up.

5.2.2. Elimination of DC

It can be seen from Criterion 6, one DC must not be able to be cleared up if the related checkpoints of processes in other clusters on which the DC depends directly are not eliminated. Hence, one premise of DC to be cleaned up is that all checkpoints from other clusters on which the DC depends directly are cleared away. However, it needs to check whether the elimination of DC satisfies the conditions in Criterion 7 even if this premise meets. The elimination of DCs is based on the elimination of SCs. If the conditions are satisfied, then the DC still cannot be cleared away.
Let DCx to be any DC of process Pi. CKPi is any kind of checkpoint of Pi. CKPj is any checkpoint of process in the cluster same as Pi. Edge(A, B) expresses the edge started by A and ended by B in IntraCDG. Then the elimination rules of DC can be described in formalized way as follows:
Rule 2:
If ∃Edge (CKPi, DCx) ∈ IntraCDGi, then DCx can be cleaned up.
Rule 3:
If ∃Edge (CKPi, DCx) ∉ IntraCDGi, then if only if ∃Edge(DCx, CKPi) ∈ IntraCDGi ∧ ∃Edge (CKPj, DCx) ∉ IntraCDGi), then DCx can be cleaned up.
Because the CH to which the sending process of DM belongs records S_IntraCDG, so the process needs to inform the CH to delete the checkpoint and related edges in S_IntraCDG when the process cleans up the SC. If the CH finds the isolated DC DCx, then it informs the related cluster to judge whether the DC can be eliminated according to Rules 2 and 3. After receiving the processing request of the DC, the related process decides the next step operation by its own IntraCDG.
  • If there is any edge which is started by checkpoint CKPi of process and ended by DCx in IntraCDG, then DCx will be cleared away. The edge which is started by and ended by DCx will be eliminated. What is more, it will use CKPi to replace DCx. Finally, the CH informs the processes in same cluster to change the DCx to CKPi in their own IntraCDGs.
  • If there is no edge started by the own checkpoint CKPi of process and ended by DCx, and there is edge started by DCx and ended by its own checkpoint CKPj, then two situations exist.
  • If there is not edge started by checkpoint CKPa of process in same cluster and ended by DCx in IntraCDG, then can be cleared away.
  • If there is edge started by checkpoint CKPa of process in same cluster and ended by DCx in IntraCDG, then turns DCx to SCk and informs other processes in same cluster to change DCx in IntraCDG to SCk.
Theorem 1.
The strategy of checkpoint elimination satisfies the correctness criteria of hybrid checkpointing model.
Proof. 
We prove this from the SC and DC, respectively:
  • Let SCi and SCj to be the SCs of any process Pi. DCx is any DC. According to Rule 1, if there is no path from DCx to SCi but there is node SCj (j > i) in IntraCDGglobal, then it shows that SCi is not the current checkpoint of Pi and there is no DCx meeting the condition SCi DCx or SCi DCx. At this moment, any rollback of process in different cluster will not cause Pi,j,t to roll back to S C i , j , t y . Hence, SCi can be eliminated. It can be seen, the elimination of SCs meets the Criterion 5.
  • Let the cluster to which the DC belongs to be Cm. If there is any isolated DCx in S_InterCDGm after CHm has handled the set of checkpoints that could be eliminated, then it means that any checkpoint CKPi of the process in Cm does not meet the condition DCx CKPi. Therefore, any rollback of process in Cm will not cause Pi to roll back to DCx.
Let CKPa to be any kind of checkpoint of Pi. CKPb is any kind checkpoint of process in the cluster same as Pi. If ∃Edge(CKPa, DCx) ∈ IntraCDGi, when DCx has no relation with checkpoints in Cm, then Pi can roll back to CKPa relatively. Hence, the elimination of DCx will not cause wrong rollback. If ∃Edge (CKPa, DCx) ∈ IntraCDGi and ∃Edge(DCx, CKPa) ∈ IntraCDGi ∧ ∃Edge (CKPb, DCx) ∈ IntraCDGi, then it shows that no rollback will cause Pi to roll back to DCx but there is checkpoint which stores the newest state of Pi. Hence, DCx can be eliminated. It can be seen that the elimination of DCs meets the Criteria 6 and 7. In conclusion, the strategy of checkpoint elimination satisfies the correctness criteria of the hybrid checkpointing model.

6. Checkpointing and Recovery Algorithms

6.1. Algorithm for Checkpoints in the Same Cluster

Different from the traditional synchronous checkpointing strategy, when a process initializes the checkpointing, it firstly collects the dependence list of each process in the same cluster to obtain the dependence relation among processes. Then it finds out the processes that have to remain synchronous with it for realizing the checkpoints coordination of processes. However, if there are multiple processes initializing the checkpointing in the same cluster at the same time, then it will create unnecessary wireless communication overhead. Hence, let the CH of each cluster to control one token. In order to avoid the concurrent initializations, only the process which gets the token from CH is allowed to initialize the checkpointing. The all processes staying in active states in one cluster is named as the process list (PL).
In order to reduce the occupation of useless checkpoints to the precious storage space of wireless terminals, this paper divides the SCs into uncertain checkpoints and certain checkpoints (see the Definitions 2 and 3). If and only if the all processes which take part in the checkpointing finish the coordination of checkpoints, the uncertain checkpoints could be turned to certain checkpoints. Otherwise, these uncertain checkpoints will be eliminated, namely the caches released.
Different from the DL which stores the list of processes depended on one process directly, the LDS stores the set of processes depended on one process directly and indirectly. The dependence relations in LDS can be achieved from the DLs of different processes:
Definition 8.
If Pi sends a message to Pj, then Pj depends on Pi directly. It expresses as Pj→Pi.
Definition 9.
If Pk→Pj and Pj→Pi, then Pk depends on Pi transitively. It expresses as Pk→Pi.
Definition 10.
The least dependence set (LDS) of Pi is the set of processes in cluster same as Pi which depends on Pi directly or transitively. If Pj→Pi or Pk→Pi, then Pj, Pk ∈ LDSi.
To describe the up-dependence relations, the dependence list (DL) is used to store the set of process which directly depends on one process, and so, the transitive dependence relation for LDS can be obtained from the DLs of different processes. The steps of checkpointing in same cluster are described as follows:
  • When process P receives K, SMs, it sends token request to CH. If the token is not assigned, then P gets the token. Otherwise, P goes on applying for the token. During the interval, if P receives the checkpoint request from other process, then it takes an uncertain checkpoint.
  • After getting the token, P initializes the checkpointing. This procedure includes collecting the DLs of other processes, calculating LDS and so on.
  • P takes an uncertain checkpoint and sends checkpoint request and LDS to all of the other processes in same cluster. If some members stay in the disconnected states, it sends the information to the CH.
  • After getting the checkpoint request, the processes will take uncertain checkpoints if they are part of LDS. Then they send replies to P.
  • If P receives all of the positive replies, then it turns the uncertain checkpoint to certain checkpoint and notices the processes in LDS to turn. If some members stay in the disconnected states, P sends the turn request to the CH. If P does not receive all of the positive replies during a time interval, P clears away the uncertain checkpoint and informs all of the processes in the same cluster.
  • If P finishes the checkpointing successfully, then P collects the IntraCDG of processes in same cluster to build the IntraCDGfinal and P checks whether any checkpoints could be eliminated according to Rule 1. If this kind of checkpoint exists, P notices the processes in same cluster to execute elimination processing.
Let Pj to be the process which initializes the checkpointing. Pj stays in the cluster Cm. PLm expresses the process list of Cm. CSNi indicates the checkpoint sequence number (CSN) of Pj. UCi expresses whether Pj has taken the uncertain checkpoint. The UCi initialization value of is 0. DPm indicates the set of processes which are disconnected in Cm. MNi indicates the number of messages received by Pi from processes in same cluster. The algorithm of process which initializes the checkpointing is indicated as Algorithm 2.
Algorithm 2 CKP_InitProcess( )
 Ask_token (CHm); //ask for token
if (token is assigned) then
 WAIT (token);
else
CSNi = CSNi+1; //increase its own CSN
for (PkPLmki) do
 SEND_DLrequest(Pk); //send DL request
 CALCULATE(LDS, Pi);
 DO_uncheckpoint(SCi);
UCi = 1; //change mark of uncertain CKP
for (PkLDS) do
if (PkDPm) then
 SEND_CKPReq(CHm); //send CKP request
else
 SEND_CKPReq(Pk);
if (Pi get all replies) then
 CKP_Turning(SCi); //turn uncertain CKP to certain CKP
for (PkLDS) do
if (PkDPm) then
 SEND_CKPturn(CHm); //send checkpoint turn request
else
 SEND_CKPturn (Pk);
 FIND_CKPdeleted(IntraCDGfinal); //find the deletable checkpoints*/
 NOTICE_CKPdeleted();
 /*notice processes in same cluster of CKP elimination*/
MNi = 0; /*reset receiving messages number*/

6.2. Algorithm for Checkpoints in Different Clusters

The DC is triggered by the message that is sent from a process in a different cluster. The processing of DC is relatively simple. The following steps are its detailed procedures:
  • When a process sends a DM to MHs in different clusters, it adds the current checkpoint to the message.
  • After received the DM, the CH updates S_InterCDG according to the dependence relation between checkpoints of the sender and receiver process and then transmits the DM.
  • The CH of the cluster to which the receiver process of the DM belongs updates the R_InterCDG also after received the DM, and then transmits the DM to the receiver process.
  • By the time the process received the DM, if it has taken DC for the cluster, it receives the DM directly. Otherwise, it checks whether it receives or sends any message during the current checkpoint interval. If it has sent or received any message, it stores a OC. Or else, it turns the current checkpoint to OC.
Let DM be a message sent by Pi (belongs to Cs) and received by Pj (belongs to Ct). CKPi and DCj are respectively the checkpoints of Pi and Pj related to DM. The algorithm for checkpoints in different clusters is described as Algorithm 3.
Algorithm 3 DC_Processing( )
 SEND_message(Pi, DM); // Pi sends DM
 TRANSMIT_message(CHs, DM); // CHs transmits DM
 UPDATE_S_InterCDG(Pi, Pj); // update S_InterCDG
 TRANSMIT_message(CHt, DM);
 UPDATE_R_InterCDG(Pi, Pj);
 RECEIVE_message(Pj, DM); // Pj receives DM
if (Pj has taken DCj) then
 RECEIVE(DM);
else
if (∄M((Send(M) >> CKPk) || (Receive(M) >> CKPk)) then
 TURN_CKP(CKPk, DCj); // CKPk turns to DCj
else
 STORE_CKP(DCj);

6.3. Algorithm for Failure Recovery

When a process fails, it firstly asks related processes to roll back to some global consistency states. Then these processes execute the operations stored in the logs. The keys of the correctness for rollback recovery are that determining the earliest rollback set of process and finding out the processes in different clusters belonging to the cascade rollback.
Definition 11.
The rollback set of Pi is the set of checkpoints to which the processes in same cluster and Pi may roll back, it is recorded as RSi.
Definition 12.
The earliest rollback set is the set of earliest checkpoints to which the processes in same cluster and Pi must roll back, it is recorded as ERSi, ERSi ⊆ RSi.
Definition 13.
Let CKPi to be the checkpoint to which the failure process Pi needs roll back. If ∃CKPk(CKPk CKPi ∨ CKPk CKPi), then CKPk ∈ RSi.
Definition 14.
Let Pi to be the failure process, Pj is the process in the cluster same as Pi. CKP1, CKP2, …, CKPn are the set of checkpoints of Pj and they belong to RSi. If ∃CKPx, x ∈ {1, 2, …, n}, the condition CKPx CKPk (k ∈ {1, 2, , n}, xk) is not satisfied, then CKPx ∈ ERSi.
The ERS calculation procedure for failure process Pi is described as follows:
  • Pi collects the IntraCDGs of all processes in same cluster and builds the IntraCDGglobal.
  • Pi executes the depth-first traversal to the IntraCDGglobal started from the checkpoint to which Pi has to roll back, and adds any checkpoints that can be arrived to the RSi. Finally, the earliest checkpoint of each process in RSi is added to ERSi.
Based on the ERS, we define the following rollback rules:
Rule 4:
Let Pi to be the failure process. CKPx is any checkpoint of process Pj in the cluster same as Pi. If CKPxERSi, then Pj should roll back to CKPx.
Rule 5:
Let Pi to be the failure process. CKPx is any checkpoint of process Pj in the cluster same as Pi. CKPxRSi. If ∃Edge(CKPx, DCx) ∈ S_InterCDGm, then the related process in other cluster will be noticed to roll back to DCx.
Let Pi to be the failure process and CKPi to be the checkpoint to which Pi has to roll back. The rollback recovery algorithm is described as Algorithm 4.
Theorem 2.
The strategy of rollback satisfies the correctness criteria of hybrid checkpointing model.
Proof. 
By contradiction. If the rollback strategy does not satisfy the correctness criteria, then there must be a message M there must be a message M whose sender process has cancelled the sending event of M but the receiver process does not cancel the receiving event of M. Let Pi and Pj to be the sender process and receiver process of M respectively. CKPi and CKPj are the newest checkpoints of Pi and Pj when they send and receives M, respectively.
  • M is a SM. By the time Pj receives M, the IntraCDGj of Pj will record the dependence relation of CKPj CKPi. If some process in the cluster same as Pj build the IntraCDGglobal for failure, then there must be a path from CKPi to CKPj. Therefore, Pj is sure to roll back to CKPj because CKPj belongs to RSj when Pi rolls back to CKPi for failure or rollback request.
  • M is a DM. At present, CKPj is DCj. After Pi sends M, the cluster head CHk of cluster same as Pi will firstly receive M. When CHk receives M, it transmits M and records the dependence relation of CKPj CKPi in S_InterCDGk. When Pi rolls back to CKPi for failure or rollback request, CHk is sure to require Pj to roll back to CKPj because there is edge started by CKPi and ended by CKPj in S_InterCDGk.
So, it follows that the assumption for rollback strategy does not set up. The rollback strategy meets the correctness criteria.
Algorithm 4 CKP_Rollback( )
for (PjPLkji) do
 SEND_IntraCDGReq(Pj); //send IntraCDG request
 BUILD_IntraCDG(IntraCDGglobal);
 Traverse(IntraCDGglobal, CKPi);
 //traverse IntraCDGglobal by depth-first
for (PjPLkji) do
 SEND_ERS(Pj);
for (PjPLk) do
if (CKPsERSiCKPsPj) then
 Rollback(Pj, CKPs);
for (CKPsERSiCKPsS_InterCDGk) do
if (∃Edge(CKPs, DCj)) then
 SEND_RollReq(DCj);
 //send rollback request to process related to DCj

7. Processing of Handoffs

A MH leaves one cluster and enters another cluster in the ad hoc network, namely a handoff occurs, then the relations of the MH in same cluster and different clusters all change. We call this MH that passes the handoff as the HandOff Host (HOH). The cluster that HOH leaves is called old cluster and the cluster that HOH enters is called new cluster.
After the handoff has happened, there will be some situations. If the old cluster does not deal with this change in time, then any checkpointing procedure of processes in old cluster may fail to accomplish because of the absence of HOH. The rollback of processes may also not be realized, because the processes could not obtain all of dependence relations of checkpoints. If the new cluster does not deal with this change in time, then the DCs of HOH may not be able to be cleared away and affects the elimination of more checkpoints further. What is more, the checkpointing in new cluster will ignore the entering of HOH and so cause the incoordination of checkpoints. Hence, the HOH, old cluster and new cluster all must deal with this change relatively for assuring the correctness of checkpoints when the handoff happens.

7.1. Process States and Treatment

For the processes running in the HOH, they stay one of the following states when the handoff happens. Figure 8 is the transition figure of states.
  • Running state. The process runs or communicates normally.
  • Checkpoint state. The process is taking checkpoint. The process can be divided to CKP Initiator or CKP Participant. The CKP Intiator means that the checkpointing procedure is initialized by this process, while the CKP Participant means that this process just take part in the checkpointing procedure initialized by other process.
  • Rollback state. The process is just dealing with the rollback recovery. This state can be divided to Active Rollback and Reactive Rollback state. The Active Rollback state indicates that the process is the failure process, namely the initiator of rollback. While the Reactive Rollback state means that the process rolls back because of the cascade rollback caused by rollback request of another process.
When the HOH leaves the old cluster, and enters the new cluster, the processes running in HOH should be able to handle according to their different states. Despite the processes staying normal state, all processes running in HOH have to finish the current work (checkpointing or rollback operation) with the processes in old cluster together for assuring the consistency of checkpoints. Hence, the HOH informs its location and state to the CH of old cluster when it enters the new cluster. Then the CH judges whether it is needful to notice the processes in its cluster.

7.1.1. The Process Stays in the Rollback State

Active Rollback. The HOH goes on obtaining the IntraCDG of processes in the old cluster and calculates the ERS according to IntraCDG. The LCH sends the cluster id of new cluster to the CH of old cluster. Then the process executes rollback operation according to ERS and sends the ERS to related processes in old cluster to inform them to roll back.
Reactive Rollback. The process waits for the ERS sent by the process running in old cluster and executes rollback after received the ERS.

7.1.2. The Process Stays the Checkpoint State

CKP Initiator. If the process stays the receiving stage of DL, then it goes on waiting for the DL sent from processes in old cluster. If the process has sent the LDS, then it waits for the replies from processes in old cluster. After that, the process continues executing the checkpointing until the end.
CKP Participant. The process waits for the coordination request from the initiator of checkpointing and replies until the end of checkpointing procedure.

7.2. Update of Checkpoint Dependence Relation

Because the the relations of the HOH in same cluster and different clusters all change after it leaves the old cluster and enters the new cluster, so it needs to update the dependence relations among checkpoints in old or new cluster and the checkpoints of processes in HOH when the all processes in HOH have finished their current work. The main update work is to renew the related data structures such as the PL, DL, IntraCDG and InterCDG. Let Ck and Ci respectively the old cluster and new cluster related to HOH. PS is the set of processes running in HOH.

7.2.1. Update of Checkpoints Dependence Relation in Old Cluster

Firstly, it needs to delete the processes in PS from the PLk and DLs of all processes in old cluster. Then it should turn the same cluster dependence relation among checkpoints of processes in PLk and PS to dependence relations among different clusters. That means to modify the related IntraCDG and InterCDG.
Let PtPS, PjPLk. CKPa and CKPb are the checkpoints of Pt and Pj respectively. The algorithm for updating of checkpoints dependence relations in old cluster is described as the Algorithm 5.
Algorithm 5 OldCluster_CKPUpdate( )
 DELETE_process(PLk, PS); //delete PS from PLk
 NOTICE_DLChange(PS); //notice processes to change DL
for (PxPS || PxPLk) do
 REQUIRE_IntraCDG(Px); //send IntraCDG request
 BUILD_IntraCDG(IntraCDGglobal);
if (∃Edge(CKPa, CKPb) ∧ IntraCDGglobal) then
 CHANGE_IntraCDG(IntraCDGj); //modify IntraCDGj
 CHANGE_InterCDG(R_InterCDGk);
if (∃Edge(CKPb, CKPa) ∧ IntraCDGglobal) then
 CHANGE_IntraCDG(IntraCDGt);
 CHANGE_InterCDG(S_InterCDGk);
 BUILD_SubInterCDG(R_InterCDGk, S_InterCDGk, Pt);
 //build subgraph related to Pt
 SEND_SubInterCDG(CHi); //send subgraph to CHi
 CHANGE_InterCDG(R_InterCDGk);
 CHANGE_InterCDG(S_InterCDGk);

7.2.2. Update of Checkpoints Dependence Relation in New Cluster

Firstly, it needs to add the processes in PS to PLi. When CHi receives the subgraph SubInterCDG from CHk, it adds the edges ended by the checkpoints of processes in PS to R_InterCDGi and adds the rest to S_InterCDGi. Then CHi builds new relations of same cluster for the processes in PS, namely updates the IntraCDGs and DLs of related checkpoints.
Let PtPS, PsCi. CKPc and CKPd are the checkpoints of Pt and Ps, respectively. The algorithm for updating checkpoints dependence relations in a new cluster is described as the Algorithm 6.
Algorithm 6 NewCluster_CKPUpdate( )
 ADD_process(PLi, PS); //add PS to PLi
 CHANGE_InterCDG(SubInterCDG, R_InterCDGi);
 //modify R_InterCDGi with SubInterCDG
 CHANGE_InterCDG(SubInterCDG, S_InterCDGi);
if (∃Edge(CKPc, CKPd) ∈ R_InterCDGi || ∃Edge(CKPd, CKPc) ∈ S_InterCDGi) then
 CHANGE_dependence(CKPc, CKPd);
 //change dependence relation of CKPc and CKPd
for (PxPLi) do
 NOTICE_IntraCDGchange(Px);
for (CHm, mkmi) do
 NOTICE_clusterchange(CHm, Pt);
 //notice CHm the cluster change of Pt
for (PxPLi) do
 CHANGE_IntraCDG(Px); //Px modify IntraCDG
 CHANGE_DL(Px); //Px modify DL

8. Performance Test and Analysis

8.1. Experiment Environment and Parameters

This experiment builds a simulation environment of a mobile ad hoc network based on Windows XP. The simulation network consists of five clusters. Each cluster includes 5 MHs. The routing protocol in the experiment is the Ad hoc On-Demand Distance Vector Routing (AODV). The size of one checkpoint and one message log are set to 2 KB and 50 B, respectively. Table 1 shows the parameters of this experiment.
This experiment plans to compare the proposed hybrid checkpointing protocol (HCP) with the pure Synchronous Checkpointing Protocol (SCP) and the pure Asynchronous Checkpointing Protocol (ACP). In the pure synchronous checkpointing strategy, a process sends the coordination request to the related processes in same cluster and different clusters after received K messages. The process firstly takes an uncertain checkpoint, and then turns the uncertain one to certain one after the coordination is successful. In the pure asynchronous checkpointing strategy, a process takes a checkpoint independently after received K messages. According to the characteristics of ad hoc networks, this experiment adopts the following performance indexes:
  • Average Checkpoint Time (ACT): average time for finishing the checkpointing procedure;
  • Average Recovery Time (ART): average time for finishing the rollback recovery operation;
  • Average Storage Size (ASS): average disk space that is occupied by the checkpoints of MH;
  • Average Additional Messages (AAM): average number of messages for checkpoints coordination and rollback recovery despite the normal communication messages;
  • Average Cluster head Forwarded Messages (ACFM): average number of messages forwarded by the cluster head.

8.2. Experiment Results and Analysis

This experiment divides to two parts. The first part is to test the message threshold for checkpoint K in the experiment environment. The second part is to test the ACT, ART, ASS, AAM and ACFM of the three protocols under different conditions.

8.2.1. Experiment 1 Selection of K

The checkpoint frequency of a process is decided by the message threshold K in HCP. If K enlarges, then the number of checkpoints deceases and the ART increases. What is more, if K increases, then the logs between checkpoints increase. Because the storage overhead of checkpoints is much larger than the average storage overhead of message logs, so the storage overhead will reduce as K increases.
For the additional messages, the checkpoint frequency deceases as K increases and so the additional messages for checkpoint coordination reduce, but the additional messages for rollback recovery augment, so the AAM does not maintain an absolute downtrend as K augments. Hence, this experiment takes the lower of AAM as the main parameter for selection of K. Figure 9 shows the AAM for varying values of K under different failure probabilities. It can be seen that the AAM under different failure probabilities is the least when K is 10. Therefore, the follow-up experiments choose this value of K to analyze and compare.

8.2.2. Experiment 2 Test of ACT

Figure 10 shows the ACT for varying ratios of SM number to DM number (R). It can be seen that because the checkpoints in ACP do not remain coordinated with other processes and the processes just need to store the current states of processes, so the ACT does not change at all. However, the ACT of HCP and SCP reduces as R augments. For the HCP, the number of processes that participate in the coordination decreases when the processes satisfy the checkpoint condition as R increases, so the ACT reduces as R augments. For the SCP, the number of DMs diminishes as R increases and it means that the processes in different clusters which also have to join the coordination reduction. Hence, the ACT decreases as R augments. What is more, the time for checkpoints coordination of HCP is sure to be less than the SCP, because the checkpoints coordination of HCP is limited in one cluster.

8.2.3. Experiment 3 Test of ART

Because there are major cascade rollbacks in ACP, so the ART of ACP is greatly large than the one of HCP and SCP. In order to clearly describe the difference between HCP and SCP, this experiment just compares the effect on ART of HCP and ACP for varying M and F. The results are displayed in Figure 11 and Figure 12.
As Figure 11 shows, the ART of SCP is lower than the one of HCP when M is 2. This is because the extent of change of the rollback caused by the cascade rollback of HCP is larger than the one brought by the rollback request of SCP, but the ART of these two protocols all increase as M augments and the time of HCP is lower than SCP when M is 3 and M is 4. The reason for the augment of two algorithms is that there are rollback requests sent to different clusters. Thus, the rollback extent increases when processes in different clusters receive the rollback requests as M augments. By contrast, the rollback requests sent to different clusters of SCP are greatly larger than the ones of HCP because the rollback of SCP is global. Hence, the rollback extent change of SCP caused by M is greater than the rollback extent change of HCP brought by the cascade rollback when M is larger.
It can be seen from Figure 12, the recovery time of HCP and SCP all increase when F augments but the recovery time of HCP is less than the one of SCP. This is because the communications between the checkpoint and failure point decrease and then the related rollback processes reduce. Hence, the rollback extents of processes that do not fail have larger rollback extents when they roll back next time. This situation causes the increment of recovery time. Because the rollback notifications sent to different clusters of HCP is lesser, so the ART of HCP is less than the one of SCP.

8.2.4. Experiment 4 Test of ASS

In order to test the ASS, this experiment compares the occupation of storage space of different protocols by testing the change of SCN. Figure 13 shows the results of tests. It can be seen, the ASS of SCP does not change basically. This is because each process in SCP just needs to store one checkpoint and a handful of message logs. However, each process in HCP and ACP has to store multiple checkpoints and many message logs, and so the occupation of storage space increases as the successive checkpoints number augments. But the ASS of HCP is less than the ASS of ACP greatly because the DCs of HCP will be cleared away in advance.

8.2.5. Experiment 5 Test of AAM

In order to test the AAM, this experiment compares the effect of change created by F and R respectively. The performance results of this experiment are described in Figure 14 and Figure 15. It can be seen from Figure 14 that the additional messages increase along with the increment of F. This is because the larger the F is, the more additional messages related to rollbacks exist, but only the AAM of SCP shows a downtrend as R augments. The reason is that the processes which participate in the coordination decrease with the reduction of number of DMs when R increases, but for the HCP and ACP, the increment of R causes the dependence relations among checkpoints of processes to become tighter and so the additional messages created by the rollbacks of failure processes augment. In general, the AAM of SCP is greatly larger than the one of HCP and ACP. This is because the SCP has the global rollback despite the global coordination, but for the HCP, though there are additional messages caused by coordination and rollback together, the coordination and rollback are limited in the same cluster. Hence, the additional messages of HCP are less than S for CP. For the ACP, its additional messages are the least because there is no coordination in ACP.

8.2.6. Experiment 6 Test of ACFM

Figure 16 and Figure 17 express the change of ACFM for the three protocols under different F and R. It can be seen that the enlargement of F causes the increment of ACFM in the three protocols. Hereinto, the change extents of HCP and ACP are lesser. This is because the rollback requests forwarded by CHs increase when F augments. For the HCP and ACP, the rollback requests forwarded by CHs are lesser, and so is the effect of F. In the Scp, each rollback requires all processes to partake and so the messages forwarded by CHs become major and take up a high proportion of the total messages. Then it causes the change extent of messages forwarded by CHs being larger when R varies. What is more, the CHs in SCP will forward the coordination requests also and so the messages forwarded by CHs in Scp are the largest. For the HCP, the coordination requests for different clusters and rollback requests forwarded by CHs are lesser, and so the messages forwarded by CHs are less than ACP. As R increases, the ACFM of the three protocols all reduce. Hereinto, the ACFM of HCP is least when R is less than 1. This is because the kinds of requests related to processes in different clusters increase when R augments. So the messages forwarded by CHs reduce. For the SCP, the related coordination processes in different clusters decreases with the increment of R when R is larger than 1, and so the coordination requests forwarded by CHs reduce obviously. In the HCP, though the coordination and rollback requests for different clusters decrease with the increment of R, there are a certain amount of elimination notices of DCs. Hence, the reduction extent of HCP is less than the one of SCP.
In short, because each process in SCP just stores one checkpoint, so the occupation of storage space is lesser. There is no cascade rollback in SCP, thus the recovery time is short, but in order to assure the checkpoints coordination of all processes, the processes in SCP need to exchange a lot of coordination information and so the additional messages are major.
In the ACP, because the checkpoints of processes are not synchronous, so the checkpoint time is just the time needed to store the state of the process and the additional messages just include bits of additional messages caused by rollback. However, the storage space of ACP is larger because of the storing of many checkpoints. What is more, the recovery time is longer, because there are cascade rollbacks among checkpoints of processes.
For the proposed hybrid protocol HCP, the time to transmit and receive the kinds of coordination information is less than the one of SCP, because the coordination of checkpoints is limited to the processes in same cluster. Hence, the checkpoint time of HCP is greatly shorter than the one of SCP. In addition, the coordination and rollback of HCP do not affect the other clusters marginally and so the retransmission overhead of CHs is lesser. Though the additional messages of HCP are a little higher than the ones of ACP because of coordination, the recovery time of HCP is shorter for bits of constraints from different clusters and cascade rollbacks.

9. Conclusions and Next Work

Mobile ad hoc networks are an important part of modern communication networks, but the existing checkpointing algorithms could not be applied to this network environment very well because it has the features such as less of support of fixed network, multi-hops routing, changeable topological structure, bad stability and high failure probability.
In the distributed system, the performance indexes for measuring checkpoints mainly include the number of communication messages, storage overhead, recovery overhead and time for checkpointing coordination, so it can choose between coordinated or uncoordinated checkpointing according to the characteristics of application environment. However, there is no support for fixed networks and the kinds of resources are very limited in mobile ad hoc networks. For example, the wireless bandwidth is lower and the CPU capability is weak in mobile ad hoc networks, so the checkpointing technologies for this kind of network cannot only consider any index alone, and a trade off method which takes kinds of performance indexes (such as checkpoint time, rollback recovery time, storage space and number of additional messages) into account synthetically should be designed. Therefore, this paper designs a hybrid checkpointing model and algorithm which combines the synchronization with asynchronization technology. This hybrid checkpointing algorithm not only avoids the major cascade rollback among MHs in the same cluster and too much message transmissions among the MHs in different clusters, but also reduces communication delay. What is more, this hybrid model has little dependence on CHs, so it has good flexibility.
Compared to the pure synchronous and asynchronous checkpointing strategy, this paper does plenty of performance tests for the hybrid checkpointing strategy. The results show that the hybrid checkpointing strategy is a good trade off method between the synchronous and asynchronous checkpointing strategy. What is more, it has shorter recovery time compared with the two methods.
Our next work is to analyze the multicast communication protocols and kinds of routing protocols and then design a new checkpointing algorithm which is suitable to the multicast environment and adapts to the different routing protocols. To do so, the fault tolerance and reliability of ad hoc networks will be further improved.

Acknowledgments

This work is supported by The Fundamental Research Funds for the Central Universities (Grant No. YWF-14-YQGD-007) and The National Natural Science Fund (NSFC-61473022). The authors extend their appreciation to the International Scientific Partnership Program ISPP at King Saud University for funding this research work through ISPP# 0033.

Author Contributions

For this article with several authors, N.N.X. developed the idea to use mode splitting for angular-rate sensing. L.Z.Z. and W.Z. constructed the measurement model, carried out the simulations, and wrote the manuscript. A.V.V. and M.I. helped with the analysis of the rate equations.

Conflicts of Interest

All authors declare no conflict of interest.

References

  1. Zhou, Z.; Wu, Q.J.; Huang, F.; Sun, X. Fast and accurate near-duplicate image elimination for visual sensor networks. Int. J. Distrib. Sens. Netw. 2017, 13. [Google Scholar] [CrossRef]
  2. Zhang, Y.; Sun, X.; Wang, B. Efficient Algorithm for K-Barrier Coverage Based on Integer Linear Programming. China Commun. 2016, 13, 16–23. [Google Scholar] [CrossRef]
  3. Chandy, K.M.; Lamport, L. Distributed snapshots: Determining global state of distrubuted systems. ACM Trans. Comput. Syst. 1985, 3, 63–75. [Google Scholar] [CrossRef]
  4. Shen, J.; Chang, S.; Shen, J.; Liu, Q.; Sun, X. A lightweight multi-layer authentication protocol for wireless body area networks. Future Gener. Comput. Syst. 2016. [Google Scholar] [CrossRef]
  5. Soni, S.K.; Aseri, T.C. A Review of Current Multicast Routing Protocol of Mobile Ad Hoc Network. In Proceedings of the Second International Conference on Computer Modeling and Simulation, Sanya, China, 22–24 January 2010; pp. 207–211. [Google Scholar]
  6. Wang, B.; Gu, X.; Ma, L.; Yan, S. Temperature Error Correction based on BP Neural Network in Meteorological WSN. Int. J. Sens. Netw. 2017, 23, 265–278. [Google Scholar] [CrossRef]
  7. Praveen, P.K. A min-process checkpointing protocol for deterministic mobile Ad hoc networks. Int. J. Eng. Sci. Technol. 2011, 3, 6923–6930. [Google Scholar]
  8. Sun, Y.; Gu, F. Compressive sensing of piezoelectric sensor response signal for phased array structural health monitoring. Int. J. Sens. Netw. 2017, 23, 258–264. [Google Scholar] [CrossRef]
  9. Elnozahy, E.N.; Johnson, D.B.; Zwaenepoel, W. The performance of consistent checkpointing. In Proceedings of the 11th Symposium on Reliable Distributed Systems, Houston, TX, USA, 5–7 October 1992; pp. 39–47. [Google Scholar]
  10. Xiong, N.; Vasilakos, A.V.; Yang, L.T.; Song, L.; Pan, Y.; Kannan, R.; Li, Y. Comparative analysis of quality of service and memory usage for adaptive failure detectors in healthcare systems. IEEE J. Sel. Areas Commun. 2009, 27, 495–509. [Google Scholar] [CrossRef]
  11. Tamir, Y.; Sequin, C.H. Error recovery in multi-computers using global checkpoints. In Proceedings of the 13th International Conference on Parallel Processing, Bellaire, MI, USA, 13–15 March 1984; pp. 32–41. [Google Scholar]
  12. Koo, R.; Toueg, S. Checkpointing and rollback recovery for distributed systems. IEEE Trans. Softw. Eng. 1987, SE-13, 23–31. [Google Scholar] [CrossRef]
  13. Strom, R.E.; Yemini, S. Optimistic recovery in distributed systems. ACM Trans. Comput. Syst. 1985, 3, 204–226. [Google Scholar] [CrossRef]
  14. Bhargava, B.; Lian, S.R. Independent checkpointing and concurrent rollback for recovery in distributed systems—An optimistic approach. In Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems, Columbus, OH, USA, 10–12 October 1988; pp. 3–12. [Google Scholar]
  15. Wang, Y.; Fuchs, W.K. Lazy checkpoint coordination for bounding rollback propagation. In Proceedings of the 12th Symposium on Reliable Distributed Systems, Princeton, NJ, USA, 17–19 October 1983; pp. 78–85. [Google Scholar]
  16. Baldoni, R.; Hélary, J.; Mostefaoui, A.; Raynal, M. A communication-induced checkpointing protocol that ensures rollback-dependency trackability. In Proceedings of the International Symposium on Fault-Tolerant-Computing Systems, Seattle, WA, USA, 24–27 June 1997; pp. 68–77. [Google Scholar]
  17. Hélary, J.M.; Mostefaoui, A.; Raynal, M. Communication-induced determination of consistent snapshots. IEEE Trans. Parallel Distrib. Syst. 1999, 10, 865–877. [Google Scholar] [CrossRef]
  18. Xiong, N.; Vasilakosb, A.V.; Yang, L.T.; Wang, C.; Kannane, R.; Chang, C.; Pan, Y. A novel self-tuning feedback controller for active queue management supporting TCP flows. Inf. Sci. 2010, 180, 2249–2263. [Google Scholar] [CrossRef]
  19. Ssu, K.F.; Yao, B.; Fuchs, W.K.; Neves, N.F. Adaptive checkpointing with storage management for mobile environments. IEEE Trans. Reliab. 1999, 48, 315–324. [Google Scholar]
  20. Cao, J.; Chan, G.H.; Dillon, T.S.; Jia, W. Checkpointing and rollback of wide-area distributed applications using mobile agents. In Proceedings of the 15th International Parallel and Distributed Processing Symposium, San Francisco, CA, USA, 23–27 April 2001; pp. 1–6. [Google Scholar]
  21. Chowdhury, C.; Neogy, S. Checkpointing using mobile agents for mobile computing system. Int. J. Recent Trends Eng. 2009, 1, 26–29. [Google Scholar]
  22. Xiong, N.; Jia, X.; Yang, L.T.; Vasilakos, A.V.; Li, Y.; Pan, Y. A distributed efficient flow control scheme for multirate multicast networks. IEEE Trans. Parallel Distrib. Syst. 2010, 21, 1254–1266. [Google Scholar] [CrossRef]
  23. George, S.E.; Chen, I. Movement based checkpointing and logging for failure recovery of database applications in mobile environments. Distrib. Parallel Databases 2008, 23, 189–205. [Google Scholar] [CrossRef]
  24. Cao, G.; Singhal, M. Mutable checkpoints: A new checkpointing approach for mobile computing systems. IEEE Trans. Parallel Distrib. Syst. 2001, 12, 157–172. [Google Scholar]
  25. Zhou, J.; Lu, Z.; Lu, J.; Huang, S. A Channel Assignment Scheme for Location Service in Multi-Channel Mobile Ad Hoc Networks. In Proceeding of the 5th International Conference on Ubiquitous Information Technologies and Applications, Sanya, China, 16–18 December 2010; pp. 1–6. [Google Scholar]
  26. George, S.M.; Zhou, W.; Chenji, H.; Won, M.; Lee, Y.; Pazarloglou, A.; Stoleru, R.; Barooah, P. DistressNet: A wireless ad hoc and sensor network architecture for situation management in disaster response. IEEE Commun. Mag. 2010, 48, 126–138. [Google Scholar] [CrossRef]
  27. Brahmi, N.; Boukhatem, L.; Boukhatem, N.; Boussedjra, M.; Nuy, N.D.; Labiod, H.; Mouzna, J. End-to-end routing through a hybrid ad hoc architecture for V2V and V2I communications. In Proceeding of the Med-Hoc-Net, Juan Les Pins, France, 23–25 June 2010; pp. 1–6. [Google Scholar]
  28. Du, K.; Chang, Y.; Li, B. DSR-Based Cross-Layer QoS Design in Ad Hoc Networks. In Proceeding of the International Conference on Communications and Intelligence Information Security, Nanning, China, 13–14 October 2010; pp. 224–227. [Google Scholar]
  29. Yahata, H.; Yamamoto, K.; Yamamoto, M. A study on performance improvement of ad hoc multicast by directional antenna. In Proceedings of the 15th Asia-Pacific conference on Communications, Shanghai, China, 8–10 October 2009; pp. 808–811. [Google Scholar]
  30. Liang, W.; Brent, R.; Xu, Y.; Wang, Q. Minimum-energy all-to-all multicasting in wireless ad hoc networks. IEEE Trans. Wirel. Commun. 2009, 8, 5490–5499. [Google Scholar] [CrossRef]
  31. Visu, P.; Chembian, W.T.; Koteeswaran, S. Security in multicast mobile ad hoc networks. In Proceedings of the First International Coference on Advanced Computing, Chennai, India, 13–15 December 2009; pp. 38–44. [Google Scholar]
  32. Tuli, R.; Kumar, P. Asynchronous checkpointing and optimistic message logging for mobile Ad hoc networks. Int. J. Adv. Comput. Sci. Appl. 2011, 2, 70–76. [Google Scholar] [CrossRef]
  33. Men, C.; Xu, Z.; Li, X. An efficient checkpointing and rollback recovery scheme for cluster-based multi-channel Ad-hoc wireless networks. In Proceedings of the International Symposium on Parallel and Distributed Processing with Applications, Sydney, Australia, 10–12 December 2008; pp. 371–378. [Google Scholar]
  34. Saluja, K.; Kumar, P. A non-blocking checkpointing algorithm for non-deterministic mobile Ad hoc networks. Int. J. Comput. Organ. Trends 2011, 1, 13–20. [Google Scholar]
  35. Qin, S. Scale-free topology structure in ad hoc networks. In Proceedings of the 11th International Conference on Communication Technology, Hangzhou, China, 10–12 November 2008; pp. 21–24. [Google Scholar]
  36. Zeng, Y.; Sreenan, C.J.; Xiong, N.; Yang, L.T.; Park, J.H. Connectivity and coverage maintenance in wireless sensor networks. J. Supercomput. 2010, 52, 23–46. [Google Scholar] [CrossRef]
  37. He, R.; Xiong, N.; Yang, L.T.; Park, J.P. Using multi-modal semantic association rules to fuse keywords and visual features automatically for web image retrieval. Inf. Fusion 2011, 12, 223–230. [Google Scholar] [CrossRef]
  38. Tan, L.; Zhu, Z.; Ge, F.; Xiong, N. Utility maximization resource allocation in wireless networks: Methods and algorithms. IEEE Trans. Syst. Man Cybern. Syst. 2015, 45, 1018–1034. [Google Scholar]
  39. Wan, Z.; Xiong, N.; Ghani, N.; Vasilakos, A.V.; Zhou, L. Adaptive unequal protection for wireless video transmission over IEEE 802.11 e networks. Multimed. Tools Appl. 2014, 72, 541–571. [Google Scholar] [CrossRef]
  40. Xiong, N.; Han, W.; Vandenberg, A. Green cloud computing schemes based on networks: A survey. IET Commun. 2012, 6, 3294–3300. [Google Scholar] [CrossRef]
  41. Guo, W.; Xiong, N.; Vasilakos, A.V.; Chen, G.; Yu, C. Distributed k–connected fault–tolerant topology control algorithms with PSO in future autonomic sensor systems. Int. J. Sens. Netw. 2012, 12, 53–62. [Google Scholar] [CrossRef]
Figure 1. Multi-frequency hierarchy topological structure.
Figure 1. Multi-frequency hierarchy topological structure.
Sensors 17 02166 g001
Figure 2. The model of hybrid checkpointing.
Figure 2. The model of hybrid checkpointing.
Sensors 17 02166 g002
Figure 3. Example of checkpoints dependence.
Figure 3. Example of checkpoints dependence.
Sensors 17 02166 g003
Figure 4. Example of rollback in different clusters.
Figure 4. Example of rollback in different clusters.
Sensors 17 02166 g004
Figure 5. An example of the coexistence of multiple same cluster checkpoints.
Figure 5. An example of the coexistence of multiple same cluster checkpoints.
Sensors 17 02166 g005
Figure 6. Example of DC taking.
Figure 6. Example of DC taking.
Sensors 17 02166 g006
Figure 7. IntraCDGi,a,v and IntraCDGi,k,s.
Figure 7. IntraCDGi,a,v and IntraCDGi,k,s.
Sensors 17 02166 g007
Figure 8. The transition of states.
Figure 8. The transition of states.
Sensors 17 02166 g008
Figure 9. Test for a suitable value of K.
Figure 9. Test for a suitable value of K.
Sensors 17 02166 g009
Figure 10. Impact of R on ACT.
Figure 10. Impact of R on ACT.
Sensors 17 02166 g010
Figure 11. Impact of M on ART.
Figure 11. Impact of M on ART.
Sensors 17 02166 g011
Figure 12. Impact of F on ART.
Figure 12. Impact of F on ART.
Sensors 17 02166 g012
Figure 13. Impact of SCN on ASS.
Figure 13. Impact of SCN on ASS.
Sensors 17 02166 g013
Figure 14. Impact of F on AAM.
Figure 14. Impact of F on AAM.
Sensors 17 02166 g014
Figure 15. Impact of R on AAM.
Figure 15. Impact of R on AAM.
Sensors 17 02166 g015
Figure 16. Impact of F on ACFM.
Figure 16. Impact of F on ACFM.
Sensors 17 02166 g016
Figure 17. Impact of R on ACFM.
Figure 17. Impact of R on ACFM.
Sensors 17 02166 g017
Table 1. Parameters for Experiments.
Table 1. Parameters for Experiments.
ParameterDescriptionValue
FFailure probability5~20%
MAverage hops between MHs2~4
KMessage threshold for checkpoint10
SCNSuccessive checkpoints number4~16
RRatio of SM number to DM number0.25~4

Share and Cite

MDPI and ACS Style

Xiong, N.; Zhang, L.; Zhang, W.; Vasilakos, A.V.; Imran, M. Design and Analysis of an Efficient Energy Algorithm in Wireless Social Sensor Networks. Sensors 2017, 17, 2166. https://doi.org/10.3390/s17102166

AMA Style

Xiong N, Zhang L, Zhang W, Vasilakos AV, Imran M. Design and Analysis of an Efficient Energy Algorithm in Wireless Social Sensor Networks. Sensors. 2017; 17(10):2166. https://doi.org/10.3390/s17102166

Chicago/Turabian Style

Xiong, Naixue, Longzhen Zhang, Wei Zhang, Athanasios V. Vasilakos, and Muhammad Imran. 2017. "Design and Analysis of an Efficient Energy Algorithm in Wireless Social Sensor Networks" Sensors 17, no. 10: 2166. https://doi.org/10.3390/s17102166

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop