Stochastics and Statistics
Dynamic shortest path in stochastic dynamic networks: Ship routing problem

https://doi.org/10.1016/S0377-2217(01)00385-X

Abstract

In this paper, we apply stochastic dynamic programming to find the dynamic shortest path from the source node to the sink node in stochastic dynamic networks, in which the arc lengths are independent random variables with exponential distributions. In each node there is an environmental variable, which evolves in accordance with a continuous time Markov process. The parameter of the exponential distribution of the transition time of each arc is a function of the state of the environmental variable of its initiative node. Upon arriving at each node, we can move toward the sink node through the best outgoing arc, or wait. At first, it is assumed that upon arriving at each node, we know the state of its environmental variable and also the states of the environmental variables of its adjacent nodes. We then extend this assumption so that upon arriving at each node, we know the states of the environmental variables of all nodes. In the ship routing problem, on which we focus in this paper, the environmental variables of all nodes are known, but we show that the complexity of the algorithm becomes exponential in this case.

Introduction

This paper develops an algorithm to find the dynamic shortest path from the source node to the sink node in acyclic networks with the following specifications. It is assumed that the arc lengths are independent random variables with exponential distributions. In each node (except the sink node) there is an environmental variable, which evolves in accordance with a continuous time Markov process.

The transitions of each environmental variable influence the parameters of the exponential distributions of the lengths of the related outgoing arcs, and consequently their expected lengths. It is also assumed that all continuous time Markov processes corresponding to the environmental variables are independent. At first, it is assumed that upon arriving at each node, we know the state of its environmental variable and also the states of the environmental variables of its adjacent nodes, which together form the state vector of the system. We then extend this assumption so that upon arriving at each node, we know the state of this node and also the states of the environmental variables of its forward nodes. Upon arriving at each node, taking into account the recursive function of the stochastic dynamic programming corresponding to the problem of finding the dynamic shortest path, we can take one of two actions:

  1. Move through the best outgoing arc.

  2. Wait for a better state of the environmental variable, which reduces the expected lengths of the outgoing arcs.


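The move-or-wait comparison above can be sketched numerically. In the toy calculation below, the arc rates, the successor value `V_succ`, the environmental transition rate `q`, and the one-transition waiting rule are all illustrative assumptions, not values or formulas from the paper:

```python
# Move-or-wait sketch at a single node (all numbers are assumptions).
# Arc length ~ Exp(rate[s]), so the expected traversal time is 1/rate[s];
# V_succ is the known expected remaining time from the successor node;
# q is the transition rate of a two-state environmental process.

rate = {1: 0.5, 2: 0.2}   # state 1 = favorable, state 2 = unfavorable
V_succ = 4.0
q = 1.0

def cost_move(s):
    # expected total time if we traverse the arc immediately
    return 1.0 / rate[s] + V_succ

def cost_wait(s, other):
    # simplified rule: wait an expected 1/q for one environmental
    # transition, then traverse the arc from the other state
    return 1.0 / q + cost_move(other)

for s, other in [(1, 2), (2, 1)]:
    action = "move" if cost_move(s) <= cost_wait(s, other) else "wait"
    print(f"state {s}: move={cost_move(s):.1f} wait={cost_wait(s, other):.1f} -> {action}")
```

With these numbers, state 1 behaves like a "green" state (traverse immediately) and state 2 like a "red" state (waiting is cheaper), mirroring the classification of Psaraftis and Tsitsiklis.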
If the arc lengths are constant, there are several methods to find the shortest path from the source node to the sink node, based on dynamic programming, zero-one programming and network flows theory. Some of these algorithms can be found in [1]. Deo and Pang [7] provided a taxonomy and annotation for shortest path algorithms. When arc lengths are random variables, the problem becomes more difficult. Martin [11] found the distribution function and the expected value of the shortest path in stochastic networks, in which the arc lengths are independent random variables with polynomial distribution functions, in the form of multiple integrals. Frank [8] computed the probability that the length of the shortest path of the network is smaller than a specific value, assuming that the arc lengths are continuous random variables. Mirchandani [13] presented another method for obtaining the distribution function of the shortest path in stochastic networks; it does not require solving multiple integrals, but it can only be used for the special case where arc lengths are discrete random variables.

There are several papers about finding a path with minimum expected value, minimum variance or other criteria in stochastic networks; in these papers, multicriteria networks are analyzed. Martins [12] found the set of efficient paths for the bicriteria shortest path problem by dynamic programming. Current and Min [6] provided a taxonomy and annotation for multicriteria networks. Wijeratne et al. [17] presented a method to find the set of non-dominated paths from the source node to the sink node, in which each arc carries several criteria, some of which might be stochastic. Carraway et al. [5] applied generalized dynamic programming to find the optimal path of a bicriteria network.

All of these algorithms are static, because they always find a specific path, taking into account the proper objective function, as the shortest path from the source node to the sink node. Hall [9] introduced the problem of finding the least expected travel time path between two nodes in a network with travel times that are both random and time-dependent. He showed that the optimal “route choice” is not a simple path but an adaptive decision rule: the best route from any given node to the final destination depends on the arrival time at that node. Bertsekas and Tsitsiklis [2] considered a stochastic version of the classical shortest path problem, whereby for each node of a graph we must choose a probability distribution over the set of successor nodes such that we reach a certain destination node with minimum expected cost. Bertsimas and Van Ryzin [3] introduced and analyzed a model for stochastic and dynamic vehicle routing, in which a single, uncapacitated vehicle traveling at a constant velocity in a Euclidean region must serve demands whose time of arrival, location and on-site service are stochastic. Bertsimas and Van Ryzin [4] extended this analysis to the problem of m identical vehicles with unlimited capacity. Psaraftis and Tsitsiklis [14] examined shortest path problems in which arc costs are known functions of certain environmental variables at network nodes, and each of these variables evolves according to an independent Markov process. The vehicle can wait at a node in anticipation of more favorable arc costs. They showed that the optimal policy essentially classifies the states of the environmental variable at a node into two categories: green states, for which the optimal action is to immediately traverse the arc, and red states, for which the optimal action is to wait. They then extended these concepts to the entire network by developing a dynamic programming algorithm that solves the corresponding problem.

In our paper, we extend the work of Psaraftis and Tsitsiklis [14] to suit real-world problems by altering the following assumptions:

  1. The environmental variables evolve in accordance with independent semi-Markov processes, instead of Markov processes. Under this assumption, the transition times from each state to the other states need not be identical.

  2. The length of each arc is an exponential random variable whose parameter is a function of the environmental variable of its initiative node. In [14], the length of each arc was assumed to be a deterministic function of the environmental variable of its initiative node. In real-world problems, it is more realistic to model each arc length as a random variable whose expected value is influenced by the transitions of the environmental variable of its initiative node, as in our model.

  3. Upon arriving at each node, we know the state of the environmental variable of that node and also the states of the environmental variables of its forward nodes. In this information age, we can assume that upon arriving at each node we know not only its own environmental variable but also the states of the environmental variables of all nodes.


Our model is adaptable to many practical problems, but we focus on the problem of routing a ship in the ocean, taking into account variations in the weather conditions. In this case, the nodes of the network indicate the geographical regions the ship passes through, and the environmental variable of each node indicates the weather conditions of that region, whose transitions follow a continuous time Markov process. The transition times from each region to the adjacent regions depend on the weather conditions of the initiative region, because if the weather conditions are not good, we cannot move fast and it would be better to wait for more favorable conditions. At first, it is assumed that upon the ship's arrival at each region, the weather conditions of that region and its adjacent regions are known. We then extend this assumption so that upon arriving at each region, we know the weather conditions of all forward regions. In this information age, the weather conditions are actually known for all regions, but the major defect of this extended model, as proved in Section 4, is that the complexity of the algorithm becomes exponential. Finally, we obtain the optimal strategy of movement in each region by stochastic dynamic programming.

In Section 2, we describe the framework of the proposed model. In Section 3, we derive the optimal strategy of movement in each node of the network when we know the states of the environmental variables of the node and its adjacent nodes. In Section 4, we extend the previous model such that upon arriving at each node, we know the states of the environmental variables of all nodes. In Section 5, we solve a numerical example, and finally we draw the conclusions of the paper.

Section snippets

Framework of the proposed model

Let V represent the set of nodes and A represent the set of arcs of the directed acyclic network G=(V,A). We want to find the shortest path from the source node 1 to the sink node n. The length of each arc (i,j)∈A is an exponential random variable with parameter λij(si). This parameter is a function of state si of the environmental variable of node i upon moving from node i. Assume that the random variables corresponding to the arc lengths of the network and also the environmental variables
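A minimal simulation sketch of this arc model is given below; the two environmental states and the rate values in `lam` are illustrative assumptions, not data from the paper:

```python
import random

random.seed(0)

# Length of arc (i, j) ~ Exp(lam[s_i]): the rate depends on the state of
# the environmental variable at node i at the moment of departure.
lam = {1: 0.8, 2: 0.3}   # illustrative rates for two environmental states

def arc_length(state):
    # one realization of the exponential arc length under this state
    return random.expovariate(lam[state])

# Empirical means approach the expected lengths 1/lam[s].
for s in (1, 2):
    mean = sum(arc_length(s) for _ in range(100_000)) / 100_000
    print(f"state {s}: empirical mean {mean:.2f} (expected {1 / lam[s]:.2f})")
```

The key property the model relies on is that only the expected length 1/λij(si), not the exponential form itself, changes when the environmental variable changes state.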

Optimal strategy of movement in each node, when we know the states of the environmental variables of its adjacent nodes

In this section, we extend the previous method to the directed acyclic network G=(V,A). It is assumed that upon arriving at node i, we know the state of environmental variable si and also all environmental variables of A(i). Therefore the state vector of this system upon arriving at node i is in the form (si,si1,si2,…,siQi) whose elements indicate state si and also all states of the environmental variables of A(i). In this case, we want to find the optimal strategy of movement in each node such
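The size of this state vector can be enumerated directly; the two-state environmental variables and Q_i = 3 adjacent nodes below are illustrative assumptions:

```python
from itertools import product

# State vector (s_i, s_i1, ..., s_iQi): the node's own environmental
# state plus the states of its Q_i adjacent nodes.
k = 2      # states per environmental variable (illustrative)
Q_i = 3    # number of adjacent nodes of node i (illustrative)

state_vectors = list(product(range(1, k + 1), repeat=Q_i + 1))
print(len(state_vectors))   # k ** (Q_i + 1) = 16 state vectors
```

Because the vector only covers node i and its adjacent nodes, its size is bounded by the node's out-degree rather than by the size of the whole network.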

Optimal strategy of movement in each node, when we know the states of the environmental variables of all nodes

In this section, we extend the previous assumption such that upon arriving at each node, we know the state of the environmental variable of this node and also the states of the environmental variables of its forward nodes. Therefore the state vector of this system upon arriving at node i is in the form (si,si+1,si+2,…,sn−1) whose elements indicate state si and also the states of the environmental variables of all forward nodes of node i. It is clear that upon arriving at node i∈V, we may wait
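The exponential growth of this full-information state space can be seen with a short calculation; the per-node state count k and the network size n below are illustrative assumptions:

```python
# At node i the state vector (s_i, s_{i+1}, ..., s_{n-1}) has one entry
# per remaining node, so with k states per environmental variable the
# number of system states is k ** (n - i): exponential in network size.
k, n = 2, 10   # illustrative values

for i in (1, 5, 9):
    print(f"node {i}: {k ** (n - i)} system states")
```

Even with only two weather states per region, the source node of a 10-node network already has 2^9 = 512 system states, which is the complexity defect proved in Section 4.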

Numerical example

Consider the network depicted in Fig. 1. We want to obtain the optimal strategy of movement in nodes 1, 2 and 3 such that the expected arrival time at node 4 from node 1 is minimized. We consider the weather conditions of each node as the environmental variable of that node. It is assumed that these weather conditions evolve in accordance with three independent continuous time Markov processes with two states. State 1 corresponds to the good weather conditions and state 2 is
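Each node's two-state weather process can be simulated as a continuous-time Markov chain; the transition rates `q12` and `q21` below are illustrative assumptions, not the values from Fig. 1:

```python
import random

random.seed(1)

# Two-state weather CTMC: state 1 = good weather, state 2 = bad weather.
q12, q21 = 0.5, 1.0   # illustrative transition rates 1 -> 2 and 2 -> 1

def fraction_good(t_end):
    # simulate the chain up to time t_end and return the fraction of
    # time spent in the good-weather state
    t, s, time_good = 0.0, 1, 0.0
    while t < t_end:
        rate = q12 if s == 1 else q21
        dwell = min(random.expovariate(rate), t_end - t)
        if s == 1:
            time_good += dwell
        t += dwell
        s = 2 if s == 1 else 1   # alternate between the two states
    return time_good / t_end

# Long-run fraction of good weather tends to q21 / (q12 + q21) = 2/3.
print(round(fraction_good(100_000.0), 2))
```

The stationary fraction q21/(q12 + q21) is what makes waiting worthwhile only when bad weather is expected to clear quickly relative to the gain in traversal speed.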

Conclusion

In this paper, we developed an algorithm based on semi-Markov decision process and network flows theory for obtaining the optimal strategy of movement in each node of acyclic networks such that the expected arrival time at the sink node from the source node is minimized. It was assumed that the arc lengths are independent random variables with exponential distributions. It was also assumed that in each node (except the sink node), there is an environmental variable, which evolves in accordance

References (17)


Cited by (48)

  • Optimal decisions in stochastic graphs with uncorrelated and correlated edge weights

    2023, Computers and Operations Research
    Citation Excerpt :

    Specifically, algorithmic methods including explicitly variance–covariance matrices, correlation matrices, and regression models for estimating correlation have been proposed in the literature (Chen et al., 2016; Fan et al., 2005; Nie et al., 2012; Waller and Ziliaskopoulos, 2002; Zeng et al., 2015; Zhang et al., 2017; Zockaie et al., 2014). Different stochastic shortest path problems can be analyzed using the general theory of Markov Decision Processes (MDPs) (Azaron and Kianfar, 2003; Bertsekas and Tsitsiklis, 1991; Thomas L. Dean et al., 1993; Psaraftis and Tsitsiklis, 1993; Al-Sabban et al., 2013; Ohtsubo, 2003). A recent formulation as a Markov decision process provides a practical framework to compute optimal decisions for each decision epoch whilst incorporating general distributions of random weights and dependent edge weights (Buchholz and Felko, 2015).

  • Dynamic relative robust shortest path problem

    2020, Computers and Industrial Engineering
  • Optimising cargo loading and ship scheduling in tidal areas

    2020, European Journal of Operational Research