Published in: Neural Computing and Applications 7-8/2013

01.12.2013 | ISNN2012

A hierarchical reinforcement learning approach for optimal path tracking of wheeled mobile robots

Authors: Lei Zuo, Xin Xu, Chunming Liu, Zhenhua Huang


Abstract

Robust motion control is fundamental to autonomous mobile robots. In the past few years, reinforcement learning (RL) has attracted considerable attention in the feedback control of wheeled mobile robots. However, it remains difficult for RL to solve problems with large or continuous state spaces, which are common in robotics. To improve the generalization ability of RL, this paper presents a novel hierarchical RL approach for optimal path tracking of wheeled mobile robots. In the proposed approach, a graph Laplacian-based hierarchical approximate policy iteration (GHAPI) algorithm is developed, in which the basis functions are constructed automatically using the graph Laplacian operator. In GHAPI, the state space of a Markov decision process is divided into several subspaces, and approximate policy iteration is carried out on each subspace. A near-optimal path-tracking control strategy is then obtained by combining GHAPI with proportional-derivative (PD) control. The performance of the proposed approach is evaluated on a P3-AT wheeled mobile robot, and it is demonstrated that the GHAPI-based PD control obtains better near-optimal control policies than previous approaches.
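To make the abstract's two key ingredients concrete, the following is a minimal Python sketch, not the authors' implementation: it builds basis functions from the eigenvectors of a normalized graph Laplacian over sampled states (in the spirit of proto-value functions) and constructs a separate basis per state-space subspace, mirroring the hierarchical decomposition described above. The function names, the kNN graph construction, the Gaussian edge weights, and all parameter values are illustrative assumptions.

```python
# Illustrative sketch only: graph-Laplacian basis construction and
# per-subspace bases, echoing the abstract's description of GHAPI.
# Names, graph construction, and parameters are assumptions.
import numpy as np
from scipy.spatial.distance import cdist

def laplacian_basis(states, k_neighbors=10, num_basis=20):
    """Return the `num_basis` smoothest eigenvectors of the normalized
    graph Laplacian built on a k-nearest-neighbor graph of the sampled
    states; these serve as basis functions for value approximation."""
    n = len(states)
    dist = cdist(states, states)                  # pairwise distances
    w = np.zeros((n, n))
    for i in range(n):                            # symmetric kNN graph
        nbrs = np.argsort(dist[i])[1:k_neighbors + 1]   # skip self
        w[i, nbrs] = np.exp(-dist[i, nbrs] ** 2)  # Gaussian edge weights
        w[nbrs, i] = w[i, nbrs]
    d = w.sum(axis=1)
    d_is = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    lap = np.eye(n) - d_is @ w @ d_is             # L = I - D^-1/2 W D^-1/2
    _, eigvecs = np.linalg.eigh(lap)              # ascending eigenvalues
    return eigvecs[:, :num_basis]                 # smoothest eigenvectors

def subspace_bases(states, labels, **kw):
    """Hierarchical flavor: build a separate Laplacian basis on each
    state-space subspace (here identified by precomputed `labels`)."""
    return {c: laplacian_basis(states[labels == c], **kw)
            for c in np.unique(labels)}

# Example: 200 sampled robot states (x, y, heading) in two subspaces.
states = np.random.rand(200, 3)
labels = (states[:, 0] > 0.5).astype(int)
bases = subspace_bases(states, labels, k_neighbors=8, num_basis=10)
```

In the full approach, approximate policy iteration would then run with these features on each subspace, and the resulting near-optimal policy would be combined with a PD tracking controller; those stages are omitted from this sketch.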


Metadata
Title
A hierarchical reinforcement learning approach for optimal path tracking of wheeled mobile robots
Authors
Lei Zuo
Xin Xu
Chunming Liu
Zhenhua Huang
Publication date
01.12.2013
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 7-8/2013
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-012-1243-4
