Published in: Queueing Systems | Issue 3-4/2022

31.03.2022

Learning to cooperate in agent-based control of queueing networks

Author: Vivek S. Borkar


Excerpt

Control of queueing networks has been an active area of research for many years, with motivations ranging from communication networks to supply chains. Nevertheless, the problems amenable to clean and elegant analysis tend to be few and far between, and highly stylized. Real queueing networks are plagued with many difficulties, at the level of both modeling and analysis, such as:
1. Scale: One major issue is the sheer size of the network, together with its lack of structure: most of these networks are 'emergent', and such structure as exists can at best be characterized in statistical terms.

2. Distributed, asynchronous control: The control, be it routing, admission or rate control, is exercised at node or edge level, i.e., locally and with local information. The only information available to a controller is its local state and its rewards; gathering more through message passing may be unrealistic, or possible only to a limited extent. (A minimal sketch of such a purely local agent follows this list.)

3. Modeling issues: Assumptions about input processes and queueing dynamics, such as distributional assumptions or assumptions of independence or Markovianity, are often oversimplifications.

4. Closed-loop stability: Despite major successes such as the backpressure scheme, the final word on this has not been said.

5. Choice of objectives: The primary objective of each queue is always its own throughput, but there can be secondary objectives such as energy efficiency and overall fairness. Moreover, 'optimality' is usually a tall order, and one has to seek a 'satisficing' solution in the sense of Simon [13], i.e., one that meets some minimum specifications.

6. Choice of policies: Distributed control of queueing networks is often viewed as a network game, placing it firmly in the framework of stochastic games [11], with equilibrium concepts such as Markov perfect equilibrium and Bayesian Nash equilibrium. It is not ruled out a priori, however, that a non-stationary policy at one or more queues may give strictly better performance for all.
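To make the information structure in item 2 concrete, here is a minimal sketch of a per-node agent that acts on nothing but its own state and reward. It uses plain tabular Q-learning with epsilon-greedy exploration purely as an illustration; the class name and all parameters are hypothetical, and this is not the construction developed in the paper.

import random
from collections import defaultdict

class LocalNodeAgent:
    """Hypothetical per-node controller: it observes only its own
    local state (e.g., its queue length) and its own reward, and
    learns a local action (e.g., an outgoing link) by tabular
    Q-learning. No message passing, no global state."""

    def __init__(self, n_actions, eps=0.1, alpha=0.05, gamma=0.99):
        self.n_actions = n_actions
        self.eps, self.alpha, self.gamma = eps, alpha, gamma
        # Q-values indexed by local state only.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, local_state):
        # Explore occasionally; otherwise take the greedy local action.
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        row = self.q[local_state]
        return row.index(max(row))

    def update(self, s, a, reward, s_next):
        # One-step Q-learning update using purely local information.
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

Each node would run such an agent asynchronously, updating whenever it observes a local transition; nothing in the scheme requires coordination with other nodes.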
Given this, a case may be made for a data-driven approach. We take inspiration from some recent work on foundational issues in reinforcement learning by Benjamin van Roy and associates [7, 10]. (See also [14] for a related discussion.) The next section introduces this paradigm and makes a case for a community of learning automata aiming for 'satisficing' as embodied in Blackwell optimality. …
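For readers unfamiliar with the term, the following is a minimal sketch of the classical linear reward-inaction (L_{R-I}) update from the learning-automata literature, shown only to make 'learning automaton' concrete; the function names and the step size lam are hypothetical, and this standard scheme is not the specific construction pursued in the paper.

import random

def lri_update(p, action, reward, lam=0.01):
    """Linear reward-inaction (L_{R-I}) update: shift probability
    mass toward the action just played, in proportion to the reward
    (assumed normalized to [0, 1]); a zero reward leaves p unchanged.
    The update preserves sum(p) == 1."""
    return [pi + lam * reward * ((1.0 if i == action else 0.0) - pi)
            for i, pi in enumerate(p)]

def sample_action(p):
    # Draw an action index from the mixed strategy p.
    u, cum = random.random(), 0.0
    for i, pi in enumerate(p):
        cum += pi
        if u < cum:
            return i
    return len(p) - 1

In a community of such automata, each queue would maintain its own action probabilities and apply this update driven only by its own reward signal.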


References
1. Abernethy, J., Bartlett, P.L., Hazan, E.: Blackwell approachability and no-regret learning are equivalent. In: Proceedings of the 24th Annual Conference on Learning Theory, pp. 27–46. PMLR (2011)
2. Axelrod, R.: The Complexity of Cooperation. Princeton University Press, Princeton, NJ (1997)
3. Bertsekas, D.P.: Dynamic Programming and Optimal Control, vols. I and II, 4th edn. Athena Scientific (2017/2012)
4. Blackwell, D.: An analog of the minimax theorem for vector payoffs. Pac. J. Math. 6, 1–8 (1956)
5. Brunton, S.L., Kutz, J.N.: Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press, Cambridge (2019)
6. Crawford, V.P., Haller, H.: Learning how to cooperate: optimal play in repeated coordination games. Econometrica 58, 571–595 (1990)
7. Dong, S., van Roy, B., Zhou, Z.: Simple agent, complex environment: efficient reinforcement learning with agent state. arXiv preprint arXiv:2102.05261 (2021)
8. Francis, B.A., Wonham, W.M.: The internal model principle in control theory. Automatica 12, 457–465 (1976)
9. Levy, Y.J.: Discounted stochastic games with no stationary Nash equilibrium: two examples. Econometrica 81, 1973–2007 (2013). Corrigendum (with A. McLennan): Econometrica 83(3), 1237–1252 (2015)
10. Lu, X., van Roy, B., Dwaracherla, V., Ibrahimi, M., Osband, I., Wen, Z.: Reinforcement learning, bit by bit. arXiv preprint arXiv:2103.04047 (2021)
11. Menache, I., Ozdaglar, A.: Network Games: Theory, Models, and Dynamics. Synthesis Lectures on Communication Networks. Morgan & Claypool (2011)
12. Nowak, M.A., Highfield, R.: SuperCooperators. Free Press (2011)
13. Simon, H.: Rational decision making in business organizations. Am. Econ. Rev. 69, 493–513 (1979)
14.
15. Young, H.: Strategic Learning and Its Limits. Oxford University Press, Oxford (2004)
16. Yu, H., Bertsekas, D.: On near optimality of the set of finite-state controllers for average cost POMDP. Math. Oper. Res. 33, 1–11 (2008)
Metadata
Title: Learning to cooperate in agent-based control of queueing networks
Author: Vivek S. Borkar
Publication date: 31.03.2022
Publisher: Springer US
Published in: Queueing Systems | Issue 3-4/2022
Print ISSN: 0257-0130
Electronic ISSN: 1572-9443
DOI: https://doi.org/10.1007/s11134-022-09772-9
