main-content

## Weitere Artikel dieser Ausgabe durch Wischen aufrufen

01.08.2014 | Ausgabe 3/2014 Open Access

# Learning Automata Based Face-Aware Mobicast

Zeitschrift:
Wireless Personal Communications > Ausgabe 3/2014
Autoren:
S. M. Safavi, M. R. Meybodi, M. Esnaashari

## 1 Introduction

Wireless sensor networks (WSNs) are a hi-technology that researchers have recently focused on. A typical WSN consists of hundreds or thousands of sensor nodes equipped with one or more specific sensors such as temperature sensors or cameras. Also each of these nodes have wireless communicate circuit which can communicate with its neighbor nodes and can be used as a router for sending collected information to the base station which called sink node. The sensor nodes can work in two modes: active or sleep. In the active mode, communicating and sensing circuits are switched on, while in the sleep mode, they are switched off. Energy is a vital resource in WSN, and hence, trying to deactivate the sensor nodes when they are not necessarily needed is a very common practice in such networks [ 1, 2].
Among the numerous applications of sensor networks, target tracking is one of the most important. Considering the energy constraints of WSN, in target tracking applications, only the nearest sensor nodes to the moving target must participate in the tracking task [ 35]. Other nodes can be in sleep mode, saving their energies. Thus, a sensor node must be awakened before the moving target reaches its sensing area. Without a precise awakening mechanism, the moving target may be lost and the tracking task is failed. For a precise awakening mechanism, it is necessary to have a group of active nodes around the moving target at any time, responsible for not only monitoring the target, but also awakening. An active node must decide on the set of sensor nodes for awakening, considering the moving path of the target.
Huang et al. [ 6] after studying different aspects of object tracking, concluded that both spatial and temporal (spatiotemporal) constraints must be satisfied simultaneously for precise tracking. The result of their investigation has formed a concept called Mobicast. Mobicast expresses the delivery of the messages to a set of nodes; just in time before the moving target arrives at their sensing region. This is done by considering a mobile delivery zone, which is defined as the set of nodes, which must be awakened (or more precisely, the area within which such nodes present). The delivery zone is mobile, because it moves along the moving path of the target.
Formally, a mobicast session is specified by a quadruple, (m, Z[t], Ts, T). m is the mobicast message. Z[t] is the group of nodes (delivery zone) where m should be disseminated. As the delivery zone Z[t] evolves over time, the set of recipient nodes of m changes as well. Ts and T are the sending time and duration of the mobicast session, respectively. Each node receives this quadruple, can compute location of delivery zone. Figure  1a shows a rectangular delivery zone moving upward. Figure  1b shows a more general scenario where the delivery zone can change its direction, size and shape over time.
Huang et al. [ 6] proposed a mobicast protocol with 100 % tracking guaranty across the network. The main problem of their protocol is its high energy consumption, because of the large delivery zone considered. Huang et al. [ 7] presented four mobicast protocols. The first protocol is very simple and suffers from many problems. This one is only used for clarifying the concepts. Every node that receives a mobicast message, checks if it resides within the delivery zone of the message, and if this is not the case, it does not deliver the message to the upper layers (specifically application layer). Regardless of this fact, the node rebroadcasts the receiving message in its vicinity. This protocol is not efficient from energy point of view because it has global flooding problem. In addition, it delivers the mobicast messages too early to the nodes, a contradiction with the “just in time” property of mobicast. Average of such early delivery times is referred to as slack time.
The second protocol, named “DZC”, is just like the previous one, with the exception that it uses the concepts of “hold and forward” zones in order to decrease the slack time. A sensor node, within the hold zone, holds received messages and forwards them only when it resides within the forwarding zone.
The third protocol, “FZC”, works in two phases—initialization and operational. This protocol uses a forwarding zone, much bigger than the delivery zone, which always moves ahead of the delivery zone, guarantying the delivery of messages in time. Its main problem is the size of the forwarding zone, which usually contains many, unnecessary, nodes. The last presented protocol is not actually a new one. Indeed, in this protocol the size of the forwarding zone is reduced to some extent, sacrificing guaranty level of message delivery to save more energy.
In Huang et al. [ 8], proposed the compactness metric, which is a relation between the Euclidean distance and the network spatial distance (hops) among network nodes. Using this concept, they introduced an adaptive mobicast protocol, which locally estimates the compactness metric and uses it for enhancing the size of the forwarding zone in different sensor network fields.
Huang et al. [ 9] introduced a new mobicast protocol, using the face concept. A face is the subdivision of maximal connected subset of the plane that does not contain a point on an edge or a vertex [ 9]. They called the nodes around a face as spatial neighbors. In their protocol, every node must be aware of the faces and the spatial neighbors. When a mobicast message arrives at a node, the node checks if any of the faces is in the delivery zone and if so, it rebroadcasts the message locally considering the hold and forward zones concepts.
All above mentioned protocols, have assumed that the mobile target move with the constant velocity in a straight direction. Chen et al. [ 10] designed a novel mobicast routing protocol, called variant-egg-based mobicast (VE-mobicast), to support a different movement path for the moving target. They defined the concept of predictive accuracy as the percentage of the sensor nodes which are located in both delivery and forwarding zones out of the sensor nodes within the delivery zone. If all sensor nodes in the delivery zone are also in the forwarding zone, then the predictive accuracy is 100 %. On the other hand, predictive accuracy is 0 % if none of the sensor nodes in the delivery zone are in the forwarding zone. Chen and et al. tried to improve the predictive accuracy in VE-mobicast. At any given time $$t$$, VE-mobicast is divided into two phases. In the first phase, called the egg estimation phase, the variant-egg forwarding zone in time $$t+1 (F[t+1])$$ is estimated by sensor nodes in $$H[t]=F[t]\cap F[t+1]$$. In the second phase, called the distributed variant-egg-based mobicast, the shape of the next variant-egg forwarding zone is adjusted using the results of the first phase. The shape of the forwarding zone is based on the cassini oval equation [ 11].
HVE-mobicast routing protocol, presented in [ 12], extends VE-mobicast so that it can handle different moving speeds for the moving target. In HVE-mobicast, sensor network is first clustered prior to the target tracking task. During the target tracking task, message delivery in the forwarding zone accomplished by two phase; cluster-to-cluster and cluster-to-node. In the cluster-to-cluster phase, only the cluster-heads and relay nodes are notified to wake up and become active. In the cluster-to-node phase, member nodes of the awakened clusters are notified and awakened by their cluster-heads considering the estimated arrival time of the delivery zone. The main contribution of the HVE-mobicast routing protocol is the speed and accuracy of nodes waking up.
In this paper, we propose a novel mobicast protocol, based on learning automata, for minimizing the activated nodes during mobicast sessions. In this protocol, like the protocol given in [ 9], when a node $$n_{i}$$ receives a message, it checks if one of its faces is in the delivery zone and if so, $$n_{i}$$ attaches its id to the received message and then rebroadcasts it in its vicinity. In addition, $$n_{i}$$ holds a copy of the message for some certain period. Copies of the message maybe delivered to $$n_{i}$$ during this period from different paths. All such copies will be dropped as soon as they arrived at the node except the last one, which is indeed received from the longest path. $$n_{i}$$ then creates two signals, a positive signal and a negative signal. The positive signal will be sent along the path from which the first message was received, and the negative signal will be sent along the longest path. This way, the probability of receiving mobicast messages from the shortest paths increases eventually. Compared with the other protocols, our protocol significantly reduces the slack time and the number of active nodes during mobicast sessions.
Rest of the paper is organized as follows: In the next section, we will introduce learning automata as an intelligent agent used in the proposed protocol. In Sect.  3, we will give the details of the protocol. Simulation results will be given in Sect.  4. Section  5 concludes the paper.

## 2 Learning Automata

Learning automata is an abstract model which randomly selects one action out of its finite set of actions and performs it on a random environment. Environment then works based on the selected action and responses to the automata with a reinforcement signal Based on selected action, and based on received signal, the automata updates its internal state and selects next action. Figure  2 depicts the relationship between an automata and its environment.
Environment can be defined by the triple $$E\equiv \{\alpha , \beta ,c\}$$ where $$\alpha \equiv \{\alpha _1, \alpha _2, \ldots ,\alpha _r \}$$ represents a finite input set, $$\beta \equiv \{\beta _1, \beta _2,\ldots ,\beta _m \}$$ represents the output set, and $$c\equiv \{c_1,c_2,\ldots ,c_r \}$$ is a set of penalty probabilities, where each element $$\hbox {c}_{\mathrm{i}}$$ of c corresponds to one input action i. Environments in which $$\beta$$ can take only binary values 0 or 1 are referred to as P-models. A further generalization of the environment allows finite output sets with more than two elements that take values in the interval [0,1]. Such an environment is referred to as Q-model. Finally, when the output of the environment is a continuous random variable which assumes values in the interval [0,1], it is referred to as an S-model. Learning automata are classified into fixed-structure stochastic, and variable-structure stochastic. In the following, we consider only variable-structure automata.
A variable-structure automaton is defined by the quadruple $$\{\alpha ,\beta ,p,T\}$$ in which $$\alpha =\{\alpha _1,\ldots ,\alpha _r \}$$ represents the action set of the automata, $$\beta =\{\beta _1 ,\ldots ,\beta _m \}$$ represents the input set, $$p=\{p_1,\ldots ,p_r \}$$ represents the action probability set, and finally $$p(n+1)=T[\alpha (n),\beta (n),p(n)]$$ the learning algorithm. This automaton operates as follows. Based on the action probability set $$p$$, automaton randomly selects an action i, and performs it on the environment. After receiving the environment’s reinforcement signal, automaton updates its action probability set based on Eq. ( 1) for favorable responses, and Eq. ( 2) for unfavorable ones.
\begin{aligned} p_i (n+1)&= p_i (n)+a[1-p_i (n)]\nonumber \\ p_j (n+1)&= (1-a)p_j (n) \qquad \forall j \,\,j\ne i\end{aligned}
(1)
\begin{aligned} p_i (n+1)&= (1-b)p_i (n)\nonumber \\ p_j (n+1)&= (b/{r-1})+(1-b)p_j (n) \quad \forall j \,\,j\ne i \end{aligned}
(2)
In these two equations, $$a$$ and $$b$$ are reward and penalty parameters respectively. For $$a =b$$, learning algorithm is called $$L_{R-P}$$, 1 for $$a < < b$$, it is called $$L_{R\_P}$$, 2 and for $$b = 0$$, it is called $$L_{R-I}$$. 3
For more information about learning automata the reader may refer to [ 13]. Learning automata has been recently used successfully in the area of WSN; LABER: A Learning Automata Based Energy-aware Routing Protocol for Sensor Networks [ 14], A cellular learning automata based clustering algorithm for WSNs [ 15], Dynamic point coverage in WSNs: A learning automata approach [ 16], Data aggregation in sensor networks using learning automata [ 17], A learning automata based scheduling solution to the dynamic point coverage problem in WSNs [ 18], Dynamic Point Coverage Problem in WSNs: A Cellular Learning Automata Approach [ 19], EEMLA: energy efficient monitoring of WSN with learning automata [ 20], A cellular learning automata-based deployment strategy for mobile WSNs [ 21], to mention a few.

## 3 Proposed Protocol

The aim of the proposed protocol is at minimizing the number of active nodes as well as the slack time during mobicast sessions. To this end, we make use of Face-aware mobicasting enhanced with learning techniques.
First, we have to precisely define the concept of “spatial neighborhood”. On a planar graph, each node has one or more adjacent faces. We define the “spatial neighborhood” of a node in a planar graph to be the set of nodes in all faces adjacent to that node except the node itself. In the proposed algorithm, we assume that every node knows its spatial neighbors. In what follows, we first describe the message formats used in the proposed algorithm and then, present the algorithm operation.

### 3.1 Message Format

Our protocol has two types of messages. The first one is the mobicast message and the second one is the signal message. Mobicast message contains the following information:
1.
Sending time

2.
Coordinates of the initial delivery zone: This field contains an ordered sequence of the locations of the initial vertices of the delivery zone, if the zone is a polygon. For a circular delivery zone, the radius and the initial center are recorded instead.

3.
Velocity of the delivery zone

4.
Message lifetime: It is used for terminating the mobicast session.

5.
Message type: Indicates the shape of the delivery zone, e.g., rectangle, pentagon, circle, ellipse, etc.

6.
Sequence number

7.
A bounded list of last forwarders IDs: This helps nodes in determining the path along which the message traversed towards them.

The coordinates of the initial delivery zone combined with the velocity of the zone and the sending time of the message can be used to determine the location of the delivery zone at any time in the future.
The second message type, i.e. the signal massage, is used as the reinforcement signal for the learning automata residing in the sensor nodes. Signal message contains the list of forwarders IDs (just like the mobicast message) and the signal itself (either a positive signal or negative one).

### 3.2 Algorithm Operation

Each node in the network is equipped with a learning automaton. The learning automata embedded in each node has two actions: “sending” and “not sending”. In the beginning, the probability of “sending” action is equal to 1 and the probability of “not sending” action is equal to 0. According to FAR protocol, when a mobicast message $$m$$ arrives at a node $$n_{i}$$, if at least one of the spatial neighbors of $$n_{i}$$ is in the delivery zone, $$n_{i}$$ must rebroadcast the message. But, in the proposed protocol, $$n_{i}$$ does this only if its learning automaton selects the “sending” action. If this action is selected, then $$n_{i}$$ attaches its ID to the list of the last forwarder field, rebroadcasts the message locally considering the hold and forwarding zones and waits for a certain period $$\rho$$. During this period, copies of the message $$m$$ may be received at $$n_{i}$$ from different paths. $$n_{i}$$ ignores all of these copies except the last one, which is indeed received from the longest path. $$n_{i}$$ then creates two signal messages; a positive one and a negative one. The positive signal will be sent along the path from which the original message $$m$$ was received. On the other hand, the negative signal will be sent along the longest path. When a node receives a signal message, it updates the action probability vector of its learning automata according to Eqs. ( 1) and ( 2), if top ID in last forwarder field be equal with its ID. Note that the duplicate messages received after the period $$\rho$$ are simply dropped, but considering long enough duration for $$\rho$$, this can be happened very rarely. In Fig.  3, a typical example of the execution of the proposed algorithm (b) and FAR algorithm (a) is depicted. In this example the moving target moves from a similar path multiple of times, so the proposed algorithm converged and only shortest paths remained active.
It’s important to know that at the beginning, the proposed algorithm and FAR algorithm perform the same, but when the moving target repeats its moving path, the proposed algorithm learns that path and thus, become superior to FAR in terms of the number of active nodes. The guaranty level of the proposed algorithm is the same as that of FAR.

## 4 Simulation Results

In order to study the performance of the proposed protocol, multiple protocols and the proposed protocol are simulated using J-SIM. We focus on three metrics to compare protocols: Slack time, Message exchange, and number of active nodes.
The simulation consist of 800 node dispersed randomly with a uniform distribution in an area of 1,000 $$\times$$ 600 m $$^{2}$$. The delivery zone assumed a rectangle with size $$30 \times 10\,\hbox { m}^{2}$$ that moves with constant velocity 40 m/s in a straight line. The mobile entity moves the same path multiple times while our algorithm be converged. So our algorithm in the worst case is like the FAR protocol, but with multiple path of mobile object it converges and results has been so better. Note that the $$a$$ and $$b$$ parameters of nodes learning automata is considered to be 0.5 and 0.01 respectively.
Slack time is a metric that shows how much time mobicast packets delivered sooner. Figure  4 depicted comparison of three different protocols. The X axis is the expected delivery deadline for nodes in each location, i.e. the first time they are expected to enter the delivery zone. If a node get the packet before deadline, it was locate in top of X axis, otherwise it was locate under X axis. The Y axis show amount of slack time. Diamond signs shows flooding protocol that broadcast messages to all nodes with as-soon-as possible strategy in network. Square signs shows FAR mobicast protocol. Finally, the triangles shows our algorithm slacktime. The Broadcast delivery has worst slack time because upon a packet received, it forward it. In our protocol because only a small number of nodes are active and only shortest paths are active it’s so closed to delivery deadline.
Message exchange is the number of packets exchanged between nodes across mobicast session. The message exchange load of forward-zone constrained (FZC) and FAR mobicast protocols are relatively same over multiple pass of mobile object from same path (because FAR has bigger messages, in Fig.  5, its exchanged data is slightly bigger than FZC protocol) but our protocol in first multiple passes, have higher load, but load is reduced by increasing the same path traversing. After a certain passes and strengthen the shortest paths (learning automata converged), the load of message exchange will be constant.
As presented in Fig.  5, in the few first passes the bulk of exchanged data are high but with removing longest paths from message dissemination process, it reduces significantly. In the end of curve, because only shortest path reminded, bulk of exchanged data become stable.
Number of active nodes is number of nodes that activated during mobicast session. As shown in Fig. 6, this is obvious in the first pass of mobile entity, number of nodes that activated in our protocol is exactly equal to FAR protocol. But with repeat passage after a while only shortest paths remain active so number of active nodes in our protocol decrease dramatically.

## 5 Conclusion

We use learning automata as an intelligent tool to decrease number of active nodes in a tracking session. For this work, number of communicated packets increased in short time interval but after this time, it decreases using learning method. When the number of repetition on the same path is high, this protocol has significant advantage. In this work, we suppose the mobile entity have a constant velocity and direct direction as what is supposed in [ 69], so we aren’t compare it with recent work like VE-mobicast [ 10] that supposed mobile entity can change its direction or HVE-mobicast [ 12] that supposed mobile entity can change both of direction and speed and have bigger energy consumption in compare to former protocols. In the future works, just likes [ 10, 12] we try to use different methods like clustering methods, so the nodes able optimize shortest path activated nodes in different directions and speeds.
Footnotes
1
Linear reward-penalty.

2
Linear reward epsilon penalty.

3
Linear reward inaction.

Literatur
Über diesen Artikel

Zur Ausgabe