
2019 | OriginalPaper | Chapter

Multi-USVs Coordinated Detection in Marine Environment with Deep Reinforcement Learning

Authors : Ruiying Li, Rui Wang, Xiaohui Hu, Kai Li, Haichang Li

Published in: Benchmarking, Measuring, and Optimizing

Publisher: Springer International Publishing


Abstract

In recent years, with the rapid development of deep reinforcement learning, the technique has attracted increasing attention in both military and civilian fields. Compared with ship monitoring and other technical means, unmanned surface vehicles (USVs) have significant advantages in the marine environment and are gradually becoming a focus of academia and marine management departments. However, single-agent reinforcement learning does not fit the multi-USV case well because of the non-stationary environment and the complexity of multi-agent interactions. To learn cooperation models among USVs, we propose a multi-USV coordinated detection method based on DDPG, in which an LSTM stores the sequence of states and actions. In addition, to suit the algorithm, we model the marine environment with every USV treated as an agent. Experiments are conducted in simulation, and the results verify the effectiveness of the proposed method.
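The abstract pairs a DDPG-style deterministic actor-critic with an LSTM that stores each USV's sequence of states and actions. A minimal forward-pass sketch of that combination is given below. It uses untrained NumPy weights; the observation/action dimensions, hidden size, agent count, and the centralized critic over joint hidden states and actions are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step: gates computed from input x and previous hidden h."""
    z = W @ x + U @ h + b                      # stacked gate pre-activations
    H = h.size
    i = 1 / (1 + np.exp(-z[:H]))               # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))            # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))          # output gate
    g = np.tanh(z[3*H:])                       # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

class USVAgent:
    """One USV: an LSTM encodes its (state, action) history; a DDPG-style
    deterministic actor maps the hidden state to a continuous action."""
    def __init__(self, obs_dim, act_dim, hidden=16):
        in_dim = obs_dim + act_dim             # observation concatenated with last action
        self.W = rng.normal(0, 0.1, (4 * hidden, in_dim))
        self.U = rng.normal(0, 0.1, (4 * hidden, hidden))
        self.b = np.zeros(4 * hidden)
        self.Wa = rng.normal(0, 0.1, (act_dim, hidden))  # actor head
        self.h = np.zeros(hidden)
        self.c = np.zeros(hidden)

    def act(self, obs, last_action):
        x = np.concatenate([obs, last_action])
        self.h, self.c = lstm_step(x, self.h, self.c, self.W, self.U, self.b)
        return np.tanh(self.Wa @ self.h)       # deterministic, bounded action

def central_critic(hiddens, actions, Wq):
    """Centralized Q-value: scores the joint hidden states and joint actions."""
    z = np.concatenate(hiddens + actions)
    return float(Wq @ np.tanh(z))

# Three USVs searching a shared area (random observations stand in for sensing).
obs_dim, act_dim, n = 4, 2, 3
agents = [USVAgent(obs_dim, act_dim) for _ in range(n)]
last = [np.zeros(act_dim) for _ in range(n)]
for t in range(5):                              # roll the recurrent policies forward
    obs = [rng.normal(size=obs_dim) for _ in range(n)]
    last = [ag.act(o, a) for ag, o, a in zip(agents, obs, last)]
Wq = rng.normal(0, 0.1, n * (16 + act_dim))
q = central_critic([ag.h for ag in agents], last, Wq)
```

Training (replay buffer, target networks, policy-gradient updates) is omitted; the sketch only shows how a recurrent state-action memory can sit between each agent's observations and a DDPG actor, with a centralized critic observing all agents.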


Metadata
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-32813-9_17
