Reinforcement based mobile robot navigation in dynamic environment

https://doi.org/10.1016/j.rcim.2010.06.019

Abstract

In this paper, a new approach is developed for solving the problem of mobile robot path planning in an unknown dynamic environment based on Q-learning. Q-learning algorithms have been used widely for solving real-world problems, especially in robotics, since they have been shown to give reliable and efficient solutions thanks to a simple and well-developed theory. However, most researchers who have applied Q-learning to the mobile robot navigation problem dealt with static environments; they avoided dynamic environments because the problem is more complex and has an infinite number of states, which makes training the intelligent agent very difficult. In this paper, the Q-learning algorithm is applied to mobile robot navigation in a dynamic environment by limiting the number of states through a new definition of the state space. This reduces the size of the Q-table and hence increases the speed of the navigation algorithm. The experimental simulation scenarios indicate the strength of the proposed approach for mobile robot navigation in a dynamic environment. The results show that the new approach achieves a high hit rate: the robot reaches its target along a collision-free path in most cases, which is the most desirable feature of any navigation algorithm.

Introduction

In recent years, research and industrial interest has focused on developing smart machines, such as robots, that are able to work under certain conditions for very long periods without any human intervention. This includes performing specific tasks in hazardous and hostile environments. Mobile robots are smart machines that can carry out such tedious tasks. They are used in areas where the robot must navigate and perform a task at the same time, such as service robotics, surveillance, and exploration [1].

Mobile robot navigation in an unknown environment poses two main problems: localization and path planning [5], [6]. Localization is the process of determining the position and orientation of the robot with respect to its surroundings. The robot needs to recognize the objects around it and to classify each one as a target or an obstacle. Many on-board and off-board techniques have been developed to deal with the localization problem, using laser range finders, sonar range finders, ultrasonic sensors, infrared sensors, vision sensors, and GPS. When a larger view of the environment is necessary, a network of cameras has been used.

The other problem is path planning, in which the robot needs to find a collision-free path from its starting point to its end point. To find that path, the robot needs to run a suitable path planning algorithm that can compute the path between any two points [7].

Many researchers have studied the problem of robot path planning with obstacle avoidance, and many solutions have been proposed [8], [9]. Since robot motion in a dynamic field has a certain amount of randomness, owing to the nature of the real world, these solutions do not give accurate results under all conditions. In recent years there has been a shift toward artificial intelligence approaches that improve the robot's autonomy based on accumulated experience. In general, artificial intelligence methods can be computationally less expensive and easier to apply than classical methods.

This research focuses on path planning for a mobile robot moving in a dynamic environment, and a new approach based on the Q-learning algorithm is proposed to solve this problem. Q-learning has several features that make it suitable for mobile robot navigation in a dynamic environment. First, a Q-learning agent is a reinforcement learning [29] agent that has no previous knowledge about its working environment; it learns about the environment by interacting with it. This type of learning agent is called an unsupervised learning agent. Since the mobile robot is assumed to have no previous knowledge about its working environment, a Q-learning agent is a good alternative for solving the navigation problem in a dynamic environment. Secondly, a Q-learning agent is an on-line learning agent: it learns the best action to take at each state by trial and error. It chooses actions randomly at first and estimates the value of taking each action in a specific state; by evaluating every state-action pair it builds a policy for acting in the environment. In the mobile robot navigation problem, in order to find a collision-free path the robot needs to find the best action to take at each state, and it needs to learn this knowledge on line while navigating its environment. Because Q-learning is very simple, it is a very appealing alternative [10].
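
As a minimal illustration of the tabular Q-learning scheme described above, the following Python sketch shows an epsilon-greedy action choice and the one-step Q-table update. The state encoding, action set, and parameter values are illustrative assumptions, not the paper's actual definitions.

```python
import random
from collections import defaultdict

ALPHA = 0.5    # learning rate (illustrative value)
GAMMA = 0.9    # discount factor (illustrative value)
EPSILON = 0.1  # exploration probability (illustrative value)

ACTIONS = ["up", "down", "left", "right"]  # assumed discrete action set

# Q-table: maps (state, action) pairs to estimated long-term value.
Q = defaultdict(float)

def choose_action(state):
    """Epsilon-greedy policy: mostly exploit the best known action,
    occasionally explore a random one."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One tabular Q-learning step: move Q(s, a) toward the
    bootstrapped target r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```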

Literature review

In the last decade many classical solutions have tried to address the robot path planning problem. The most commonly used is the potential field method [22] and its variants [12], [14], which has been studied extensively. It was introduced in its most common form by Borenstein and Koren [11]. The basic idea behind this method is to fill the robot's environment with a potential field in which the robot is attracted toward the target position and repelled away from obstacles.
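
A minimal sketch of this idea follows; the quadratic attractive and inverse repulsive potentials used here are a common textbook formulation, not necessarily the exact functions of [11], [12], [14], [22], and all gains are illustrative.

```python
import math

K_ATT = 1.0    # attractive gain (illustrative)
K_REP = 100.0  # repulsive gain (illustrative)
D0 = 2.0       # obstacle influence radius (illustrative)

def potential_force(robot, goal, obstacles):
    """Sum an attractive force toward the goal and repulsive forces
    from nearby obstacles; the robot steps along the resulting vector."""
    fx = K_ATT * (goal[0] - robot[0])
    fy = K_ATT * (goal[1] - robot[1])
    for ox, oy in obstacles:
        dx, dy = robot[0] - ox, robot[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < D0:
            # Repulsion grows sharply as the robot approaches the obstacle.
            mag = K_REP * (1.0 / d - 1.0 / D0) / d**2
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy
```

A well-known weakness of this method, and one motivation for learning-based alternatives, is that the attractive and repulsive forces can cancel at local minima, trapping the robot before it reaches the goal.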

Mobile robot path planning using reinforcement learning in the literature

In the proposed approach, robot path planning is solved using Q-learning, a method first introduced by Watkins [18] for learning from delayed rewards and punishments. In the literature there have been many attempts to solve the mobile robot path planning problem using reinforcement learning algorithms. These methods learn the optimal navigation policy by selecting, at each state, the action that produces the maximum cumulative reward.
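
For reference, Watkins' tabular update rule in its standard textbook form, with learning rate $\alpha$ and discount factor $\gamma$ (the notation follows the general reinforcement learning literature rather than this paper's own symbols), is

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]$$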

Smart and Kaelbling [23], [24] used Q-learning for mobile robot navigation.

Assumptions

The initial robot location and the goal are predefined for the robot, and the robot tries to reach the goal along a collision-free path in spite of the presence of obstacles in its surrounding environment. No assumptions are made on the velocity of the robot when it reaches its target, which means that it is a hard-landing robot.

In this paper, it is assumed that the robot is equipped with all the sensors needed to supply it with the sensory data required by the navigation algorithm.

Methodology

In order to apply the Q-learning algorithm, four major parts should be addressed: the working environment, the reward function, the value function, and the adopted policy. In the following subsections, each part is explained in detail.
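
As a hedged illustration of how one of these parts, the reward function, might look for a goal-seeking task with obstacle avoidance (the specific values and the goal/collision predicates below are assumptions for the sketch, not the paper's actual reward design):

```python
# Illustrative reward function for goal-seeking with obstacle avoidance.
R_GOAL = 100.0    # reward for reaching the target (assumed value)
R_CRASH = -100.0  # penalty for colliding with an obstacle (assumed value)
R_STEP = -1.0     # small per-step cost to encourage short paths (assumed)

def reward(reached_goal, collided):
    if reached_goal:
        return R_GOAL
    if collided:
        return R_CRASH
    return R_STEP
```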

Simulation and results

Extensive simulation studies were carried out to train the robot and to demonstrate the effectiveness of the new method. The simulation was implemented in MATLAB. Different scenarios for different situations were implemented, and their results were used to assess the performance of the proposed Q-learning solution.
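
For orientation, a training loop of the kind such studies run might look as follows in Python. This is a sketch only: `env` is a hypothetical simulator exposing `reset()` and `step()`, it reuses the `choose_action`, `update`, and `reward` sketches above, and the paper's actual experiments were implemented in MATLAB.

```python
def train(env, episodes=1000):
    """Run repeated episodes, updating the Q-table after every step."""
    for _ in range(episodes):
        state = env.reset()  # start a new scenario
        done = False
        while not done:
            action = choose_action(state)
            next_state, reached_goal, collided, done = env.step(action)
            r = reward(reached_goal, collided)
            update(state, action, r, next_state)
            state = next_state
```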

Conclusions

In this research a new approach for mobile robot navigation in a dynamic environment was presented, based on the Q-learning algorithm. Q-learning helped solve the motion planning problem without a model of the environment, since the environment is completely unknown to the robot. No prior constraints were assumed about the environment or about the movements of the target or the obstacles. In order to apply this algorithm in a dynamic environment, a new definition of the state space was introduced.

References (34)

  • K. Song et al., Reactive navigation in dynamic environment using a multisensor predictor, IEEE Transactions on Systems, Man, and Cybernetics (1999)
  • S. Russell et al., Reinforcement learning, in: Artificial Intelligence: A Modern Approach (2003)
  • J. Borenstein et al., Real-time obstacle avoidance for fast mobile robots, IEEE Transactions on Systems, Man, and Cybernetics (1989)
  • S. Ge et al., New potential functions for mobile robot path planning, IEEE Transactions on Robotics and Automation (2000)
  • S. Ge et al., Dynamic motion planning for mobile robots using potential field method, Autonomous Robots (2002)
  • N. Tsourveloudis et al., Autonomous vehicle navigation utilizing electrostatic potential fields and fuzzy logic, IEEE Transactions on Robotics and Automation (2001)
  • M. Joo et al., Obstacle avoidance of a mobile robot using hybrid learning approach, IEEE Transactions on Industrial Electronics (2005)