Published in: Neural Computing and Applications 2/2019

21.06.2017 | Original Article

Recursive least-squares temporal difference learning for adaptive traffic signal control at intersection

Authors: Biao Yin, Mahjoub Dridi, Abdellah El Moudni


Abstract

This paper presents a new method for solving the scheduling problem of adaptive traffic signal control at an intersection. The method integrates recursive least-squares temporal difference (RLS-TD(λ)) learning into approximate dynamic programming. RLS-TD(λ) adapts a linear function approximation by updating its parameters based on environmental feedback. The study investigates the implementation of the method after modeling the traffic dynamics at an intersection in discrete time. The model considers different traffic control schemes with respect to signal phase sequence, in particular the defined adaptive phase sequence (APS). In simulated traffic scenarios, RLS-TD(λ) is superior to TD(λ) for updating the functional parameters of the approximation, and APS outperforms the other conventional control schemes in reducing traffic delay. Compared with other traffic signal control algorithms, the proposed algorithm yields satisfactory results in terms of traffic delay and computation time.
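
To make the learning mechanism more concrete, the following is a minimal sketch of one recursive least-squares TD(λ) update for a linear value function, written in the commonly used recursive LSTD(λ) form with no forgetting factor. It is not the authors' implementation: the function name `rls_td_lambda_step`, the argument names, the initial covariance value, and the single-state toy problem are all assumptions chosen for illustration, and the traffic model itself is not reproduced here.

```python
def rls_td_lambda_step(theta, P, z, phi, phi_next, reward, gamma=0.95, lam=0.5):
    """One RLS-TD(lambda) update (illustrative sketch, hypothetical names).

    theta: parameter vector (list), P: square matrix (list of lists),
    z: eligibility trace (list), phi / phi_next: feature vectors (lists).
    """
    n = len(theta)
    # Eligibility trace: z <- gamma * lambda * z + phi
    z = [gamma * lam * z[i] + phi[i] for i in range(n)]
    # Temporal feature difference: d = phi - gamma * phi_next
    d = [phi[i] - gamma * phi_next[i] for i in range(n)]
    # Intermediate products: Pz = P z (column), dP = d^T P (row)
    Pz = [sum(P[i][j] * z[j] for j in range(n)) for i in range(n)]
    dP = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
    denom = 1.0 + sum(d[i] * Pz[i] for i in range(n))
    # Gain vector: K = P z / (1 + d^T P z)
    K = [Pz[i] / denom for i in range(n)]
    # Prediction error under the current parameters
    err = reward - sum(d[i] * theta[i] for i in range(n))
    theta = [theta[i] + K[i] * err for i in range(n)]
    # Sherman-Morrison rank-one update: P <- P - (P z)(d^T P) / denom
    P = [[P[i][j] - K[i] * dP[j] for j in range(n)] for i in range(n)]
    return theta, P, z


# Toy check (assumed setup): one state, feature [1.0], reward 1, gamma 0.5.
# The true discounted value is 1 / (1 - 0.5) = 2, and theta[0] approaches it.
theta, P, z = [0.0], [[1000.0]], [0.0]
for _ in range(100):
    theta, P, z = rls_td_lambda_step(
        theta, P, z, [1.0], [1.0], 1.0, gamma=0.5, lam=0.0
    )
```

Unlike a plain TD(λ) step, this update needs no step-size parameter: the matrix `P` plays the role of an adaptive, per-direction learning rate, at the cost of quadratic work per step in the number of features.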

Metadata
Title
Recursive least-squares temporal difference learning for adaptive traffic signal control at intersection
Authors
Biao Yin
Mahjoub Dridi
Abdellah El Moudni
Publication date
21.06.2017
Publisher
Springer London
Published in
Neural Computing and Applications / Special Issue 2/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-017-3066-9
