Skip to main content
Erschienen in: Pattern Analysis and Applications 3/2017

25.01.2016 | Theoretical Advances

The design of absorbing Bayesian pursuit algorithms and the formal analyses of their ε-optimality

verfasst von: Xuan Zhang, B. John Oommen, Ole-Christoffer Granmo

Erschienen in: Pattern Analysis and Applications | Ausgabe 3/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The fundamental phenomenon that has been used to enhance the convergence speed of learning automata (LA) is that of incorporating the running maximum likelihood (ML) estimates of the action reward probabilities into the probability updating rules for selecting the actions. The frontiers of this field have been recently expanded by replacing the ML estimates with their corresponding Bayesian counterparts that incorporate the properties of the conjugate priors. These constitute the Bayesian pursuit algorithm (BPA), and the discretized Bayesian pursuit algorithm. Although these algorithms have been designed and efficiently implemented, and are, arguably, the fastest and most accurate LA reported in the literature, the proofs of their \(\epsilon\)-optimal convergence has been unsolved. This is precisely the intent of this paper. In this paper, we present a single unifying analysis by which the proofs of both the continuous and discretized schemes are proven. We emphasize that unlike the ML-based pursuit schemes, the Bayesian schemes have to not only consider the estimates themselves but also the distributional forms of their conjugate posteriors and their higher order moments—all of which render the proofs to be particularly challenging. As far as we know, apart from the results themselves, the methodologies of this proof have been unreported in the literature—they are both pioneering and novel.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
In the interest of compactness, unless otherwise stated, we will refer to the CBPA as the BPA.
 
2
The version of the BPA presented here, namely the absorbing Bayesian pursuit algorithm (ABPA) is distinct from the version presented in [1]. The reason for this is explained presently.
 
3
In the interest of compactness and to avoid repetition, the definition and explanations/statements are given for the ABPA in the main text, and described in a parenthesized manner separately for the DBPA.
 
4
We emphasize that proving the convergence in the case of utilizing the corresponding ML estimates is not merely a consequence of the weak law of large numbers. Indeed, one has to also take into consideration the specific details of the LA updating rules using which the actions are chosen for the estimation purposes. The arguments to do this are quite intricate, and they have been presented in fine detail in [29]. This proof is not repeated here, but can be included if requested by the referees.
 
5
In the interest of simplicity, at this juncture we have assumed that \(\bar{d}_j\) are independent of each other. We believe that this assumption can be easily relaxed by considering only the individual \(d_j\)’s as in Eq. (5), and not all of them together, as in Eq. (6).
 
6
In order to not burden the reader with cumbersome algebraic manipulations, we omit the straightforward steps.
 
Literatur
1.
Zurück zum Zitat Zhang X, Granmo OC, Oommen BJ (2011) The Bayesian pursuit algorithm: a new family of estimator learning automata. In: Proceedings of IEA-AIE 2011. Springer, New York, June 2011, pp 608–620 Zhang X, Granmo OC, Oommen BJ (2011) The Bayesian pursuit algorithm: a new family of estimator learning automata. In: Proceedings of IEA-AIE 2011. Springer, New York, June 2011, pp 608–620
2.
Zurück zum Zitat Zhang X, Granmo O-C, Oommen BJ (2013) On incorporating the paradigms of discretization and Bayesian estimation to create a new family of pursuit learning automata. Appl Intell 39:782–792CrossRef Zhang X, Granmo O-C, Oommen BJ (2013) On incorporating the paradigms of discretization and Bayesian estimation to create a new family of pursuit learning automata. Appl Intell 39:782–792CrossRef
3.
Zurück zum Zitat Zhang X, Granmo OC, Oommen BJ (2012) Discretized Bayesian pursuit—a new scheme for reinforcement learning. In: Proceedings of IEA-AIE 2012, Dalian, June 2012, pp 784–793 Zhang X, Granmo OC, Oommen BJ (2012) Discretized Bayesian pursuit—a new scheme for reinforcement learning. In: Proceedings of IEA-AIE 2012, Dalian, June 2012, pp 784–793
5.
Zurück zum Zitat Narendra KS, Thathachar MAL (1989) Learning automata: an introduction. Prentice Hall, New Jersey, USA Narendra KS, Thathachar MAL (1989) Learning automata: an introduction. Prentice Hall, New Jersey, USA
6.
Zurück zum Zitat Oommen BJ, Agache M (2001) Continuous and discretized pursuit learning schemes: various algorithms and their comparison. IEEE Trans Syst Man Cybern Part B Cybern 31(3):277–287CrossRef Oommen BJ, Agache M (2001) Continuous and discretized pursuit learning schemes: various algorithms and their comparison. IEEE Trans Syst Man Cybern Part B Cybern 31(3):277–287CrossRef
7.
Zurück zum Zitat Oommen BJ, Granmo OC, Pedersen A (2007) Using stochastic AI techniques to achieve unbounded resolution in finite player Goore games and its applications. In: Proceedings of IEEE symposium on computational intelligence and games, Honolulu, April 2007, pp 161–167 Oommen BJ, Granmo OC, Pedersen A (2007) Using stochastic AI techniques to achieve unbounded resolution in finite player Goore games and its applications. In: Proceedings of IEEE symposium on computational intelligence and games, Honolulu, April 2007, pp 161–167
8.
Zurück zum Zitat Beigy H, Meybodi MR (2000) Adaptation of parameters of BP algorithm using learning automata. In: Proceedings of sixth Brazilian symposium on neural networks, Brazil, November 2000, pp 24–31 Beigy H, Meybodi MR (2000) Adaptation of parameters of BP algorithm using learning automata. In: Proceedings of sixth Brazilian symposium on neural networks, Brazil, November 2000, pp 24–31
9.
Zurück zum Zitat Zhang X, Jiao L, Granmo OC, Oommen BJ (2013) Channel selection in cognitive radio networks: a switchable Bayesian learning automata approach. In: Proceedings of PIMRC, London, September 2013, pp 2362–2367 Zhang X, Jiao L, Granmo OC, Oommen BJ (2013) Channel selection in cognitive radio networks: a switchable Bayesian learning automata approach. In: Proceedings of PIMRC, London, September 2013, pp 2362–2367
10.
Zurück zum Zitat Jiao L, Zhang X, Granmo OC, Oommen BJ (2014) A Bayesian learning automata-based distributed channel selection scheme for cognitive radio networks. In: Proceedings of IEA-AIE, Kaohsiung, June 2014, pp 48–57 Jiao L, Zhang X, Granmo OC, Oommen BJ (2014) A Bayesian learning automata-based distributed channel selection scheme for cognitive radio networks. In: Proceedings of IEA-AIE, Kaohsiung, June 2014, pp 48–57
11.
Zurück zum Zitat Granmo O-C, Oommen BJ, Myrer S-A, Olsen MG (2007) Learning automata-based solutions to the nonlinear fractional knapsack problem with applications to optimal resource allocation. IEEE Trans Syst Man Cybern Part B 37(1):166–175CrossRef Granmo O-C, Oommen BJ, Myrer S-A, Olsen MG (2007) Learning automata-based solutions to the nonlinear fractional knapsack problem with applications to optimal resource allocation. IEEE Trans Syst Man Cybern Part B 37(1):166–175CrossRef
12.
Zurück zum Zitat Granmo OC, Oommen BJ, Myrer SA, Olsen MG (2006) Determining optimal polling frequency using a learning automata-based solution to the fractional knapsack problem. In: Proceedings of the 2006 IEEE international conferences on cybernetics and intelligent systems (CIS) and robotics, automation and mechatronics (RAM), Bangkok, June 2006, pp 1–7 Granmo OC, Oommen BJ, Myrer SA, Olsen MG (2006) Determining optimal polling frequency using a learning automata-based solution to the fractional knapsack problem. In: Proceedings of the 2006 IEEE international conferences on cybernetics and intelligent systems (CIS) and robotics, automation and mechatronics (RAM), Bangkok, June 2006, pp 1–7
13.
Zurück zum Zitat Granmo O-C, Oommen BJ (2011) Learning automata-based solutions to the optimal web polling problem modeled as a nonlinear fractional knapsack problem. Eng Appl Artif Intell 24(7):1238–1251CrossRef Granmo O-C, Oommen BJ (2011) Learning automata-based solutions to the optimal web polling problem modeled as a nonlinear fractional knapsack problem. Eng Appl Artif Intell 24(7):1238–1251CrossRef
14.
Zurück zum Zitat Granmo OC, Oommen BJ (2006) On allocating limited sampling resources using a learning automata-based solution to the fractional knapsack problem. In: Proceedings of the 2006 international intelligent information processing and web mining conference, advances in soft computing, vol 35, Ustron, June 2006, pp 263–272 Granmo OC, Oommen BJ (2006) On allocating limited sampling resources using a learning automata-based solution to the fractional knapsack problem. In: Proceedings of the 2006 international intelligent information processing and web mining conference, advances in soft computing, vol 35, Ustron, June 2006, pp 263–272
15.
Zurück zum Zitat Granmo O-C, Oommen BJ (2010) Optimal sampling for estimation with constrained resources using a learning automaton-based solution for the nonlinear fractional knapsack problem. Appl Intell 33(1):3–20CrossRef Granmo O-C, Oommen BJ (2010) Optimal sampling for estimation with constrained resources using a learning automaton-based solution for the nonlinear fractional knapsack problem. Appl Intell 33(1):3–20CrossRef
16.
Zurück zum Zitat Yazidi A, Granmo O-C, Oommen BJ (2012) Service selection in stochastic environments: a learning-automaton based solution. Appl Intell 36:617–637CrossRef Yazidi A, Granmo O-C, Oommen BJ (2012) Service selection in stochastic environments: a learning-automaton based solution. Appl Intell 36:617–637CrossRef
17.
Zurück zum Zitat Unsal C, Kachroo P, Bay JS (1999) Multiple stochastic learning automata for vehicle path control in an automated highway system. IEEE Trans Syst Man Cybern Part A 29:120–128CrossRef Unsal C, Kachroo P, Bay JS (1999) Multiple stochastic learning automata for vehicle path control in an automated highway system. IEEE Trans Syst Man Cybern Part A 29:120–128CrossRef
18.
Zurück zum Zitat Oommen BJ, Roberts TD (2000) Continuous learning automata solutions to the capacity assignment problem. IEEE Trans Comput 49:608–620CrossRef Oommen BJ, Roberts TD (2000) Continuous learning automata solutions to the capacity assignment problem. IEEE Trans Comput 49:608–620CrossRef
19.
Zurück zum Zitat Oommen BJ, Croix TDS (1997) String taxonomy using learning automata. IEEE Trans Syst Man Cybern 27:354–365CrossRef Oommen BJ, Croix TDS (1997) String taxonomy using learning automata. IEEE Trans Syst Man Cybern 27:354–365CrossRef
21.
Zurück zum Zitat Dean T, Angluin D, Basye K, Engelson S, Aelbling L, Maron O (1995) Inferring finite automata with stochastic output functions and an application to map learning. Mach Learn 18:81–108 Dean T, Angluin D, Basye K, Engelson S, Aelbling L, Maron O (1995) Inferring finite automata with stochastic output functions and an application to map learning. Mach Learn 18:81–108
22.
Zurück zum Zitat Thathachar MAL, Sastry PS (1986) Estimator algorithms for learning automata. In: Proceedings of the platinum jubilee conference on systems and signal processing, Bangalore, December 1986, pp 29–32 Thathachar MAL, Sastry PS (1986) Estimator algorithms for learning automata. In: Proceedings of the platinum jubilee conference on systems and signal processing, Bangalore, December 1986, pp 29–32
24.
Zurück zum Zitat Lanctôt JK, Oommen BJ (1992) Discretized estimator learning automata. IEEE Trans Syst Man Cybern Part B Cybern 22(6):1473–1483MathSciNetCrossRefMATH Lanctôt JK, Oommen BJ (1992) Discretized estimator learning automata. IEEE Trans Syst Man Cybern Part B Cybern 22(6):1473–1483MathSciNetCrossRefMATH
25.
Zurück zum Zitat Lanctôt JK, Oommen BJ (1991) On discretizing estimator-based learning algorithms. IEEE Trans Syst Man Cybern Part B Cybern 2:1417–1422 Lanctôt JK, Oommen BJ (1991) On discretizing estimator-based learning algorithms. IEEE Trans Syst Man Cybern Part B Cybern 2:1417–1422
26.
Zurück zum Zitat Rajaraman K, Sastry PS (1996) Finite time analysis of the pursuit algorithm for learning automata. IEEE Trans Syst Man Cybern Part B Cybern 26:590–598CrossRef Rajaraman K, Sastry PS (1996) Finite time analysis of the pursuit algorithm for learning automata. IEEE Trans Syst Man Cybern Part B Cybern 26:590–598CrossRef
27.
28.
Zurück zum Zitat Zhang X, Granmo OC, Oommen BJ, Jiao L (2013) On using the theory of regular functions to prove the \(\epsilon\)-optimality of the continuous pursuit learning automaton. In: Proceedings of IEA-AIE 2013. Springer, Amsterdan, June 2013, pp 262–271 Zhang X, Granmo OC, Oommen BJ, Jiao L (2013) On using the theory of regular functions to prove the \(\epsilon\)-optimality of the continuous pursuit learning automaton. In: Proceedings of IEA-AIE 2013. Springer, Amsterdan, June 2013, pp 262–271
29.
Zurück zum Zitat Zhang X, Granmo O-C, Oommen BJ, Jiao L (2014) A formal proof of the \(\epsilon\)-optimality of absorbing continuous pursuit algorithms using the theory of regular functions. Appl Intell 41:974–985CrossRef Zhang X, Granmo O-C, Oommen BJ, Jiao L (2014) A formal proof of the \(\epsilon\)-optimality of absorbing continuous pursuit algorithms using the theory of regular functions. Appl Intell 41:974–985CrossRef
30.
Zurück zum Zitat Zhang X, Oommen BJ, Granmo OC, Jiao L (2014) Using the theory of regular functions to formally prove the \(\epsilon\)-optimality of discretized pursuit learning algorithms. In: Proceedings of IEA-AIE. Kaohsiung. Springer, June 2014, pp 379–388 Zhang X, Oommen BJ, Granmo OC, Jiao L (2014) Using the theory of regular functions to formally prove the \(\epsilon\)-optimality of discretized pursuit learning algorithms. In: Proceedings of IEA-AIE. Kaohsiung. Springer, June 2014, pp 379–388
31.
Zurück zum Zitat Zhang X, Oommen BJ, Granmo OC, Jiao L (2014) A formal proof of the \(\epsilon\)-optimality of discretized pursuit algorithms. Appl Intell Zhang X, Oommen BJ, Granmo OC, Jiao L (2014) A formal proof of the \(\epsilon\)-optimality of discretized pursuit algorithms. Appl Intell
32.
33.
Metadaten
Titel
The design of absorbing Bayesian pursuit algorithms and the formal analyses of their ε-optimality
verfasst von
Xuan Zhang
B. John Oommen
Ole-Christoffer Granmo
Publikationsdatum
25.01.2016
Verlag
Springer London
Erschienen in
Pattern Analysis and Applications / Ausgabe 3/2017
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-016-0535-1

Weitere Artikel der Ausgabe 3/2017

Pattern Analysis and Applications 3/2017 Zur Ausgabe