Reinforcement based mobile robot navigation in dynamic environment
Introduction
In recent years, research and industrial interest has focused on developing smart machines, such as robots, that are able to work under certain conditions for very long periods without any human intervention. This includes performing specific tasks in hazardous and hostile environments. Mobile robots are smart machines that can carry out such tedious tasks. These robots are used in areas where the robot must navigate and perform a task at the same time, such as service robots, surveillance, and exploration [1].
Mobile robot navigation in an unknown environment has two main problems: localization and path planning [5], [6]. Localization is the process of determining the position and orientation of the robot with respect to its surroundings. The robot needs to recognize the objects around it and classify each object as a target or as an obstacle. Many techniques have been developed to deal with this localization problem using laser range finders, sonar range finders, ultrasonic sensors, infrared sensors, vision sensors, and GPS, mounted on-board or off-board. When a larger view of the environment is necessary, a network of cameras has been used.
The other problem is path planning, in which the robot needs to find a collision-free path from its starting point to its end point. To find such a path, the robot needs to run a suitable path planning algorithm that can compute a path between any two points [7].
Many researchers have studied the problem of robot path planning with obstacle avoidance, and many solutions have been proposed [8], [9]. Since robot motion in a dynamic field has a certain amount of randomness due to the nature of the real world, these solutions did not give accurate results under all conditions. In recent years there has been a shift toward artificial intelligence approaches that improve the robot's autonomous ability based on accumulated experience. In general, artificial intelligence methods can be computationally less expensive and easier to apply than classical methods.
This research focuses on path planning for a mobile robot moving in a dynamic environment, where a new approach based on the Q-learning algorithm is proposed to solve this problem. Q-learning has several features that make it suitable for the mobile robot navigation problem in a dynamic environment. First, the Q-learning agent is a reinforcement learning [29] agent that has no previous knowledge of its working environment; it learns about the environment by interacting with it. This type of learning agent is called an unsupervised learning agent. Since it is assumed that the mobile robot has no previous knowledge of its working environment, a Q-learning agent is a good candidate for solving the navigation problem in a dynamic environment. Secondly, the Q-learning agent is an on-line learning agent. It learns the best action to take at each state by trial and error: it chooses actions, observes the outcomes, and estimates the value of taking each action in each state. By evaluating every state–action pair, it can build a policy for acting in the environment. In the mobile robot navigation problem, to find a collision-free path the robot needs to find the best action to take at each state, and it needs to learn this knowledge on-line while navigating its environment. Because Q-learning is very simple, it is a very appealing alternative [10].
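The trial-and-error learning of state–action values described above can be sketched as tabular Q-learning on a small grid. The following is a minimal illustration, not the paper's implementation: the grid size, goal position, rewards, and learning parameters are all illustrative assumptions.

```python
import random

# Tabular Q-learning sketch: an agent on a small grid learns to reach a goal
# by trial and error. All parameters below are illustrative assumptions.
ROWS, COLS = 4, 4
GOAL = (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def step(state, action):
    """Apply an action; moves off the grid leave the state unchanged."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    next_state = (r, c)
    reward = 1.0 if next_state == GOAL else -0.04  # small per-step cost
    return next_state, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = (0, 0)
        while state != GOAL:
            # epsilon-greedy selection: mostly exploit, sometimes explore
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[state][i])
            nxt, reward = step(state, ACTIONS[a])
            # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            Q[state][a] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][a])
            state = nxt
    return Q

def greedy_path(Q, start=(0, 0), limit=50):
    """Follow the learned values greedily from start toward the goal."""
    path, state = [start], start
    while state != GOAL and len(path) < limit:
        a = max(range(len(ACTIONS)), key=lambda i: Q[state][i])
        state, _ = step(state, ACTIONS[a])
        path.append(state)
    return path
```

After training, the greedy policy extracted from the Q-table corresponds to the learned collision-free path; in the paper this same mechanism operates in a dynamic environment with moving obstacles.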
Literature review
In the last decade, many classical solutions have tried to address the robot path planning problem. The most commonly used is the potential field method [22] and its variants [12], [14], which has been studied extensively. It was introduced in its most common form by Borenstein and Koren [11]. The basic idea behind this method is to fill the robot's environment with a potential field in which the robot is attracted toward the target position and repelled away from obstacles.
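The attract/repel idea behind the potential field method can be sketched as follows. This is an illustrative reconstruction of the general technique, not the formulation of [11]; the gains and obstacle influence radius are assumed values.

```python
import math

# Minimal artificial potential field sketch: the net force on the robot is an
# attractive pull toward the goal plus a repulsive push away from nearby
# obstacles. K_ATT, K_REP, and INFLUENCE are illustrative assumptions.
K_ATT, K_REP, INFLUENCE = 1.0, 0.5, 2.0

def attractive_force(pos, goal):
    """Linear attraction toward the goal position."""
    return (K_ATT * (goal[0] - pos[0]), K_ATT * (goal[1] - pos[1]))

def repulsive_force(pos, obstacles):
    """Repulsion from each obstacle inside the influence radius."""
    fx = fy = 0.0
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 1e-9 < d < INFLUENCE:
            # magnitude grows rapidly as the robot approaches the obstacle
            mag = K_REP * (1.0 / d - 1.0 / INFLUENCE) / d**2
            fx += mag * dx / d
            fy += mag * dy / d
    return (fx, fy)

def net_force(pos, goal, obstacles):
    """Total field force steering the robot at this position."""
    fa = attractive_force(pos, goal)
    fr = repulsive_force(pos, obstacles)
    return (fa[0] + fr[0], fa[1] + fr[1])
```

The robot simply follows the net force at each step; the well-known weakness of this scheme, which motivates learning-based alternatives, is that the attractive and repulsive terms can cancel and trap the robot in a local minimum.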
Mobile robot path planning using Reinforcement Learning in literature
In the proposed approach, robot path planning is solved using Q-learning. This method was first introduced by Watkins [18] for learning from delayed rewards and punishments. In the literature, there have been many attempts to solve the mobile robot path planning problem using reinforcement learning algorithms. These methods learn the optimal navigation policy by selecting the action that produces the maximum cumulative reward.
Smart and Kaelbling [23], [24] used Q-learning for mobile robot navigation
Assumptions
The initial robot location and the goal are predefined for the robot, which will try to reach the goal along a collision-free path despite the presence of obstacles in its surrounding environment. There are no predefined assumptions on the velocity of the robot when it reaches its target, which means that it is a hard-landing robot.
In this paper, it is assumed that the robot is equipped with all necessary sensors to supply the robot with all necessary sensory data required by
Methodology
In order to apply the Q-learning algorithm, four major parts should be addressed: the working environment, the reward function, the value function, and the adopted policy. In the following subsections, each part is explained in detail.
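Two of the four parts named above, the reward function and the adopted policy, can be sketched as stand-alone functions. The reward values, the epsilon parameter, and the function names here are hypothetical illustrations, not the paper's exact definitions.

```python
import random

# Hypothetical sketch of two of the four Q-learning ingredients named above.
# The reward magnitudes and epsilon are illustrative assumptions.

def reward(reached_goal, collided):
    """Reward function: a bonus at the goal, a penalty on collision, and a
    small step cost otherwise so that shorter paths accumulate more reward."""
    if reached_goal:
        return 100.0
    if collided:
        return -100.0
    return -1.0

def epsilon_greedy(q_values, epsilon, rng):
    """Adopted policy: explore a random action with probability epsilon,
    otherwise exploit the action with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

The remaining two parts, the working environment (state representation) and the value function (the Q-table itself), are what the paper redefines to handle moving targets and obstacles.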
Simulation and results
Extensive simulation studies were carried out to train the robot and to demonstrate the effectiveness of the new method. The simulation was implemented in MATLAB. Different scenarios for different situations were implemented, and their results were used to assess the performance of the proposed Q-learning solution.
Conclusions
In this research, a new approach for mobile robot navigation in a dynamic environment was presented using the Q-learning algorithm. The Q-learning algorithm helped solve the motion planning problem without a model of the environment, since the environment is completely unknown to the robot. No prior constraints were assumed about the environment or about the movements of the target or obstacles. In order to be able to apply this algorithm in a dynamic environment, a new definition for the
References (34)
- et al., Sensor-based robot motion generation in unknown, dynamic and troublesome scenarios: real-time obstacle avoidance for fast mobile robots, Robotics and Autonomous Systems (2005)
- et al., A genetic-fuzzy approach for mobile robot navigation among moving obstacles, International Journal of Approximate Reasoning (1999)
- et al., A real-time limit-cycle navigation method for fast mobile robots and its application to robot soccer, Robotics and Autonomous Systems (2003)
- Environment–robot interaction—the basis for mobility in planetary micro-rovers, Robotics and Autonomous Systems (2005)
- et al., Obstacle avoidance in a dynamic environment: a collision cone approach, IEEE Transactions on Systems, Man, and Cybernetics (1998)
- et al., Model-based localization for an autonomous mobile robot equipped with sonar sensors, IEEE International Conference on Systems, Man, and Cybernetics (1995)
- et al., Fuzzy temporal rules for mobile robot guidance in dynamic environments, IEEE Transactions on Systems, Man, and Cybernetics (2001)
- et al., Map based navigation in mobile robots: I. A review of localization strategies, Cognitive Systems Research (2003)
- et al., Map based navigation in mobile robots: II. A review of map learning and path planning strategies, Cognitive Systems Research (2003)
- et al., Vision based path planning for mobile robot using extrapolated artificial potential field and probabilistic obstacle avoidance, ASME International Mechanical Engineering Congress and Exposition (2002)
- Reactive navigation in dynamic environment using a multisensor predictor, IEEE Transactions on Systems, Man, and Cybernetics
- Reinforcement learning, in: Artificial Intelligence: A Modern Approach
- Real-time obstacle avoidance for fast mobile robots, IEEE Transactions on Systems, Man, and Cybernetics
- New potential functions for mobile robot path planning, IEEE Transactions on Robotics and Automation
- Dynamic motion planning for mobile robots using potential field method, Autonomous Robots
- Autonomous vehicle navigation utilizing electrostatic potential fields and fuzzy logic, IEEE Transactions on Robotics and Automation
- Obstacle avoidance of a mobile robot using hybrid learning approach, IEEE Transactions on Industrial Electronics