
2018 | Original Paper | Book chapter

Path Planning of Robotic Fish in Unknown Environment with Improved Reinforcement Learning Algorithm

Authors: Jingbo Hu, Jie Mei, Dingfang Chen, Lijie Li, Zhengshu Cheng

Published in: Internet and Distributed Computing Systems

Publisher: Springer International Publishing


Abstract

Path planning is the primary task for robotic fish, especially when the underwater environment is unknown. Conventional reinforcement learning algorithms usually converge poorly in unknown environments. To find the optimal path and speed up convergence in such environments, an improved reinforcement learning method that uses a simulated annealing approach is proposed for robotic fish navigation. Actions are chosen by a simulated annealing policy with a novel cooling method rather than by a general ɛ-greedy policy. Convergence speed is improved by a novel reward function with a goal-oriented strategy. The stopping condition of the proposed reinforcement learning algorithm is also revised. In this work, the robotic fish is designed and a prototype is produced by 3D printing. The proposed algorithm is then examined in a 2D unpredictable environment to obtain greedy actions. Experimental results show that the proposed algorithm can generate an optimal path for robotic fish in an unknown environment, increase the convergence speed, and balance exploration and exploitation.
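To make the idea of replacing ɛ-greedy with a simulated annealing policy concrete, the following is a minimal sketch of tabular Q-learning with Metropolis-style action selection on a small grid. It is not the authors' implementation: the abstract does not give the cooling method, the reward function, or the stopping condition, so the geometric cooling schedule, the distance-based reward bonus, the fixed episode count, the 5×5 grid, the obstacle cells, and all function names here are assumptions chosen for illustration.

```python
import math
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
SIZE = 5
START, GOAL = (0, 0), (4, 4)
OBSTACLES = {(1, 1), (2, 3), (3, 1)}  # hypothetical obstacle cells


def manhattan(cell):
    """Manhattan distance from a cell to the goal."""
    return abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])


def step(state, action):
    """Return (next_state, reward, done) for one move on the grid."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    if not inside or nxt in OBSTACLES:
        return state, -1.0, False  # collision: stay in place, pay a penalty
    if nxt == GOAL:
        return nxt, 10.0, True
    # Assumed goal-oriented shaping: a small bonus for moving closer to the goal.
    return nxt, -0.1 + 0.1 * (manhattan(state) - manhattan(nxt)), False


def sa_select_action(q_row, temperature, rng):
    """Propose a random action; accept it over the greedy one with
    probability exp((Q_random - Q_greedy) / T), otherwise act greedily."""
    greedy = max(range(len(q_row)), key=q_row.__getitem__)
    candidate = rng.randrange(len(q_row))
    delta = q_row[candidate] - q_row[greedy]  # never positive
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy


def train(episodes=2000, alpha=0.5, gamma=0.9, t0=1.0, cooling=0.995, seed=0):
    """Tabular Q-learning with the annealing policy above."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    temperature = t0
    for _ in range(episodes):  # assumed stopping rule: fixed episode budget
        state = START
        for _ in range(200):  # cap on steps per episode
            action = sa_select_action(q[state], temperature, rng)
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        # Assumed geometric cooling with a floor to keep T strictly positive.
        temperature = max(temperature * cooling, 1e-3)
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy action from START until the goal or the step cap."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=q[state].__getitem__)
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the list of cells visited under the learned greedy policy. At high temperature the acceptance probability is close to 1 even for poor actions, so the agent explores; as the temperature falls, lower-valued actions are accepted less and less often and the policy approaches the greedy one. This is how a single temperature parameter can trade exploration against exploitation, the balance the abstract refers to.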

Metadata
Title
Path Planning of Robotic Fish in Unknown Environment with Improved Reinforcement Learning Algorithm
Authors
Jingbo Hu
Jie Mei
Dingfang Chen
Lijie Li
Zhengshu Cheng
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-02738-4_21