International Journal of Machine Learning and Cybernetics


9 July 2020 | Original Article

Multi-agent reinforcement learning for redundant robot control in task-space

Authors: Adolfo Perrusquía, Wen Yu, Xiaoou Li

Published in: International Journal of Machine Learning and Cybernetics | Issue 1/2021

Abstract

Task-space control requires either the inverse kinematics solution or the Jacobian matrix to transform from task space to joint space. However, these are not always available for redundant robots, because such robots have more joint degrees of freedom than Cartesian degrees of freedom. Intelligent learning methods, such as neural networks (NN) and reinforcement learning (RL), can learn the inverse kinematics solution. However, NNs need large amounts of data, and classical RL is not suitable for multi-link robots controlled in task space. In this paper, we propose a fully cooperative multi-agent reinforcement learning (MARL) method to solve the kinematic problem of redundant robots. Each joint of the robot is regarded as one agent. The fully cooperative MARL uses kinematic learning to avoid function approximators and a large learning space. The convergence property of the proposed MARL is analyzed. The experimental results show that our MARL performs much better than classic methods such as Jacobian-based methods and neural networks.
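To make the "one agent per joint" idea concrete, the following is a minimal sketch of fully cooperative tabular Q-learning on a planar three-joint arm. It is not the authors' algorithm: the abstract does not specify their kinematic learning scheme, so the arm geometry (`LINKS`), the target (`TARGET`), the angle grid (`N_BINS`, `STEP`), the hyperparameters, and all function names are assumptions made for illustration. Each agent keeps a small Q-table over its own joint angle only, and every agent receives the same team reward (the negative end-effector distance to the target). This simple independent-learner scheme carries no convergence guarantee.

```python
import math
import random

LINKS = (1.0, 1.0, 1.0)   # assumed link lengths of a planar 3-joint arm
TARGET = (1.5, 1.5)       # assumed target position in task space
N_BINS = 21               # per-joint angle grid; index k maps to (k - 10) * STEP
STEP = 0.1                # grid resolution in radians
N_ACTIONS = 3             # action a shifts the joint by (a - 1) grid steps


def forward_kinematics(idx):
    """End-effector (x, y) for a tuple of joint indices on the angle grid."""
    x = y = total = 0.0
    for k, length in zip(idx, LINKS):
        total += (k - N_BINS // 2) * STEP
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y


def distance(idx):
    """Euclidean distance from the end effector to the target."""
    x, y = forward_kinematics(idx)
    return math.hypot(x - TARGET[0], y - TARGET[1])


def step(idx, acts):
    """Apply one action per agent, clipping each joint to the grid."""
    return tuple(min(N_BINS - 1, max(0, k + a - 1)) for k, a in zip(idx, acts))


def greedy(q_i, k):
    """Greedy action of one agent in its own state k."""
    return max(range(N_ACTIONS), key=lambda a: q_i[k][a])


def train(episodes=2000, horizon=30, alpha=0.2, gamma=0.95, eps=0.2, seed=0):
    rng = random.Random(seed)
    n = len(LINKS)
    # One Q-table per joint (agent), indexed by [own angle index][own action].
    q = [[[0.0] * N_ACTIONS for _ in range(N_BINS)] for _ in range(n)]
    for _ in range(episodes):
        idx = (N_BINS // 2,) * n          # every episode starts at zero angles
        for _ in range(horizon):
            acts = tuple(
                rng.randrange(N_ACTIONS) if rng.random() < eps
                else greedy(q[i], idx[i])
                for i in range(n)
            )
            nxt = step(idx, acts)
            reward = -distance(nxt)       # the same reward goes to all agents
            for i in range(n):
                target = reward + gamma * max(q[i][nxt[i]])
                q[i][idx[i]][acts[i]] += alpha * (target - q[i][idx[i]][acts[i]])
            idx = nxt
    return q


def greedy_rollout(q, horizon=30):
    """Follow every agent's greedy policy from the zero configuration."""
    idx = (N_BINS // 2,) * len(LINKS)
    for _ in range(horizon):
        idx = step(idx, tuple(greedy(q[i], idx[i]) for i in range(len(LINKS))))
    return idx, distance(idx)


if __name__ == "__main__":
    tables = train()
    final_idx, final_dist = greedy_rollout(tables)
    print("final joint indices:", final_idx, "distance to target:", final_dist)
```

Because each agent indexes its table by a single joint angle, the total table size grows linearly with the number of joints rather than exponentially, which is one plausible reading of how a per-joint decomposition keeps the learning space small.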

Task-space (or Cartesian space) is defined by the position and orientation of the robot's end effector. Joint-space is defined by the angular displacement of each joint.
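The relation between the two spaces, and why redundancy makes the inverse map non-unique, can be illustrated with a small sketch. It assumes a planar arm with three revolute joints and unit link lengths (`LINKS`), and considers end-effector position only, so the task space has two coordinates while the joint space has three. The function names are illustrative and do not come from the paper. The Jacobian is then a 2x3 matrix with no ordinary inverse, and away from singular configurations it has a one-dimensional null space: a joint motion that leaves the end effector fixed to first order.

```python
import math

LINKS = (1.0, 1.0, 1.0)  # assumed link lengths of a planar 3-joint arm


def cumulative_angles(q):
    """Absolute orientation of each link given relative joint angles q."""
    total, out = 0.0, []
    for angle in q:
        total += angle
        out.append(total)
    return out


def forward_kinematics(q):
    """Joint space (3 angles) -> task space (x, y) of the end effector."""
    cum = cumulative_angles(q)
    x = sum(l * math.cos(c) for l, c in zip(LINKS, cum))
    y = sum(l * math.sin(c) for l, c in zip(LINKS, cum))
    return x, y


def jacobian(q):
    """2x3 Jacobian: rows are the gradients of x and y w.r.t. the joints."""
    cum = cumulative_angles(q)
    n = len(LINKS)
    row_x = [-sum(LINKS[k] * math.sin(cum[k]) for k in range(j, n)) for j in range(n)]
    row_y = [sum(LINKS[k] * math.cos(cum[k]) for k in range(j, n)) for j in range(n)]
    return row_x, row_y


def null_direction(q):
    """Unit joint-space direction that does not move the end effector
    to first order. For a 2x3 Jacobian it is the cross product of the rows
    (valid only away from singular configurations)."""
    a, b = jacobian(q)
    n = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
    norm = math.sqrt(sum(c * c for c in n))
    return tuple(c / norm for c in n)


def endpoint_shift(q, dq):
    """How far the end effector moves when the joints change by dq."""
    x0, y0 = forward_kinematics(q)
    x1, y1 = forward_kinematics(tuple(a + b for a, b in zip(q, dq)))
    return math.hypot(x1 - x0, y1 - y0)


if __name__ == "__main__":
    q = (0.3, 0.6, -0.4)
    h = 1e-3
    n = null_direction(q)
    print("shift along null direction:", endpoint_shift(q, tuple(h * c for c in n)))
    print("shift moving joint 1 only: ", endpoint_shift(q, (h, 0.0, 0.0)))
```

A joint step of size `h` along the null direction moves the end effector only at order `h**2`, whereas the same step applied to a single joint moves it at order `h`. That extra freedom is what leaves the inverse kinematics of a redundant arm without a unique solution.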

Title: Multi-agent reinforcement learning for redundant robot control in task-space
Authors: Adolfo Perrusquía, Wen Yu, Xiaoou Li
Publication date: 9 July 2020
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics / Issue 1/2021
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-020-01167-7
