Published in: Soft Computing 6/2020

06.07.2019 | Methodologies and Application

Tuning of reinforcement learning parameters applied to SOP using the Scott–Knott method

Authors: André L. C. Ottoni, Erivelton G. Nepomuceno, Marcos S. de Oliveira, Daniela C. R. de Oliveira


Abstract

In this paper, we present a technique to tune the reinforcement learning (RL) parameters applied to the sequential ordering problem (SOP) using the Scott–Knott method. RL has been widely recognized as a powerful tool for combinatorial optimization problems, such as the travelling salesman and multidimensional knapsack problems. Less attention, however, has been paid to solving the SOP. Here, we have developed an RL structure to solve the SOP that can partially fill that gap. Two traditional RL algorithms, Q-learning and SARSA, have been employed. Three learning specifications have been adopted to analyze the performance of the RL: algorithm type, reinforcement learning function, and the \(\epsilon \) parameter. A complete factorial experiment and the Scott–Knott method are used to find the best combination of factor levels whenever a source of variation is statistically significant in the analysis of variance. The performance of the proposed RL has been tested on benchmarks from the TSPLIB library. In general, the selected parameters indicate that SARSA outperforms Q-learning.
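The abstract rests on two standard tabular update rules (Q-learning and SARSA, both typically paired with \(\epsilon \)-greedy exploration) and on the core step of the Scott–Knott method: splitting sorted treatment means so as to maximize the between-group sum of squares. As an illustrative sketch only, assuming a generic state/action encoding (the paper's SOP-specific encoding and parameter values are not reproduced here, and all names are hypothetical), these pieces can be written as:

```python
import random

# Illustrative sketch only: generic tabular Q-learning and SARSA updates,
# plus the core Scott-Knott step. States, actions, and parameters below are
# hypothetical, not taken from the paper.

def epsilon_greedy(Q, state, actions, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily on Q."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma):
    """Off-policy update: bootstraps on the greedy value of the next state."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstraps on the action actually chosen next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

def scott_knott_split(sorted_means):
    """Return (k, B0): the split of the sorted means into a left group of
    size k that maximizes the between-group sum of squares B0."""
    n = len(sorted_means)
    grand = sum(sorted_means) / n
    best_k, best_b0 = None, -1.0
    for k in range(1, n):
        m1 = sum(sorted_means[:k]) / k
        m2 = sum(sorted_means[k:]) / (n - k)
        b0 = k * (m1 - grand) ** 2 + (n - k) * (m2 - grand) ** 2
        if b0 > best_b0:
            best_k, best_b0 = k, b0
    return best_k, best_b0
```

In the full Scott–Knott procedure this split is accepted only when a likelihood-ratio statistic exceeds a chi-square threshold, and the two resulting groups are then partitioned recursively; the sketch shows only the maximization step.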


Metadata
Title
Tuning of reinforcement learning parameters applied to SOP using the Scott–Knott method
Authors
André L. C. Ottoni
Erivelton G. Nepomuceno
Marcos S. de Oliveira
Daniela C. R. de Oliveira
Publication date
06.07.2019
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 6/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-019-04206-w
