Skip to main content

2017 | OriginalPaper | Buchkapitel

Thompson Sampling for Optimizing Stochastic Local Search

verfasst von : Tong Yu, Branislav Kveton, Ole J. Mengshoel

Erschienen in: Machine Learning and Knowledge Discovery in Databases

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Stochastic local search (SLS), like many other stochastic optimization algorithms, has several parameters that need to be optimized in order for the algorithm to find high quality solutions within a short amount of time. In this paper, we formulate a stochastic local search bandit (\(\mathtt{SLSB}\)), which is a novel learning variant of SLS based on multi-armed bandits. \(\mathtt{SLSB}\) optimizes SLS over a sequence of stochastic optimization problems and achieves high average cumulative reward. In \(\mathtt{SLSB}\), we study how SLS can be optimized via low degree polynomials in its noise and restart parameters. To determine the coefficients of the polynomials, we present polynomial Thompson Sampling (\(\mathtt{PolyTS}\)). We derive a regret bound for \(\mathtt{PolyTS}\) and validate its performance on synthetic problems of varying difficulty as well as on feature selection problems. Compared to bandits with no assumptions of the reward function and other parameter optimization approaches, our \(\mathtt{PolyTS}\) assuming polynomial structure can provide substantially better parameter optimization for SLS. In addition, due to its simple model update, \(\mathtt{PolyTS}\) has low computational cost compared to other SLS parameter optimization methods.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
As a motivation for our focus on maximizing cumulative reward, consider parameter tuning for feature selection on a sequence of HAR datasets. In this application, it is important to provide usable predictions even in the initial steps of tuning feature selection, else users may stop using the HAR system.
 
2
For example, in HAR, f(A) is the activity recognition accuracy of a particular machine learning model trained on a HAR dataset, when using SLS with parameters A for feature selection.
 
3
We did not try other \(N_{\text {budget}}\) for \(\mathtt{{GPEIMCMC}}\) and \(\mathtt{{GPUCB}}\), because their computation time is prohibitive.
 
4
Breast Cancer, Wine and Australian can be found at https://​archive.​ics.​uci.​edu/​ml/​datasets.​html.
 
5
Feature descriptions for HAR data 1 are in [2]. The features for HAR data 2 include mean, variance and FFT values of sensors signals for each 1 s sliding window.
 
6
If \(b = 0\ldots 0\) and \(\kappa = 20\), we have \(N_{\text {feature}} \le 20\). The reason is that after \(\kappa = 20\) SLS operations, the bit-string found by SLS can differ from the initial b in at most 20 bit.
 
Literatur
1.
Zurück zum Zitat Agrawal, S., Goyal, N.: Thompson sampling for contextual bandits with linear payoffs. In: ICML-2013, pp. 127–135 (2013) Agrawal, S., Goyal, N.: Thompson sampling for contextual bandits with linear payoffs. In: ICML-2013, pp. 127–135 (2013)
2.
Zurück zum Zitat Anguita, D., Ghio, A., Oneto, L., Parra Perez, X., Reyes Ortiz, J.L.: A public domain dataset for human activity recognition using smartphones. In: ESANN-2013, pp. 437–442 (2013) Anguita, D., Ghio, A., Oneto, L., Parra Perez, X., Reyes Ortiz, J.L.: A public domain dataset for human activity recognition using smartphones. In: ESANN-2013, pp. 437–442 (2013)
5.
Zurück zum Zitat Audibert, J.Y., Bubeck, S., Munos, R.: Best arm identification in multi-armed bandits. In: COLT 2010, pp. 41–53 (2010) Audibert, J.Y., Bubeck, S., Munos, R.: Best arm identification in multi-armed bandits. In: COLT 2010, pp. 41–53 (2010)
6.
Zurück zum Zitat Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47, 235–256 (2002)CrossRefMATH Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47, 235–256 (2002)CrossRefMATH
8.
Zurück zum Zitat Birattari, M., Stützle, T., Paquete, L., Varrentrapp, K.: A racing algorithm for configuring metaheuristics. In: GECCO-2002, pp. 11–18 (2002) Birattari, M., Stützle, T., Paquete, L., Varrentrapp, K.: A racing algorithm for configuring metaheuristics. In: GECCO-2002, pp. 11–18 (2002)
9.
Zurück zum Zitat Cappé, O., Garivier, A., Maillard, O.A., Munos, R., Stoltz, G.: Kullback-Leibler upper confidence bounds for optimal sequential allocation. Ann. Stat. 41(3), 1516–1541 (2013)MathSciNetCrossRefMATH Cappé, O., Garivier, A., Maillard, O.A., Munos, R., Stoltz, G.: Kullback-Leibler upper confidence bounds for optimal sequential allocation. Ann. Stat. 41(3), 1516–1541 (2013)MathSciNetCrossRefMATH
10.
Zurück zum Zitat Chapelle, O., Li, L.: An empirical evaluation of Thompson sampling. In: NIPS, vol. 24, pp. 2249–2257 (2011) Chapelle, O., Li, L.: An empirical evaluation of Thompson sampling. In: NIPS, vol. 24, pp. 2249–2257 (2011)
11.
Zurück zum Zitat DaCosta, L., Fialho, A., Schoenauer, M., Sebag, M.: Adaptive operator selection with dynamic multi-armed bandits. In: GECCO-2008, pp. 913–920 (2008) DaCosta, L., Fialho, A., Schoenauer, M., Sebag, M.: Adaptive operator selection with dynamic multi-armed bandits. In: GECCO-2008, pp. 913–920 (2008)
12.
Zurück zum Zitat Fialho, Á., Da Costa, L., Schoenauer, M., Sebag, M.: Analyzing bandit-based adaptive operator selection mechanisms. Ann. Math. Artif. Intell. 60(1–2), 25–64 (2010)MathSciNetCrossRefMATH Fialho, Á., Da Costa, L., Schoenauer, M., Sebag, M.: Analyzing bandit-based adaptive operator selection mechanisms. Ann. Math. Artif. Intell. 60(1–2), 25–64 (2010)MathSciNetCrossRefMATH
13.
Zurück zum Zitat Hoffman, M., Shahriari, B., Freitas, N.: On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. In: Artificial Intelligence and Statistics, pp. 365–374 (2014) Hoffman, M., Shahriari, B., Freitas, N.: On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. In: Artificial Intelligence and Statistics, pp. 365–374 (2014)
14.
Zurück zum Zitat Hoos, H.H.: An adaptive noise mechanism for WalkSAT. In: AAAI-2002, pp. 655–660 (2002) Hoos, H.H.: An adaptive noise mechanism for WalkSAT. In: AAAI-2002, pp. 655–660 (2002)
15.
Zurück zum Zitat Hoos, H.H., Stützle, T.: Stochastic Local Search: Foundations and Applications. Morgan Kaufmann, San Francisco (2005)MATH Hoos, H.H., Stützle, T.: Stochastic Local Search: Foundations and Applications. Morgan Kaufmann, San Francisco (2005)MATH
18.
Zurück zum Zitat Hutter, F., Hoos, H.H., Leyton-Brown, K., Stützle, T.: ParamILS: an automatic algorithm configuration framework. JAIR 36, 267–306 (2009)MATH Hutter, F., Hoos, H.H., Leyton-Brown, K., Stützle, T.: ParamILS: an automatic algorithm configuration framework. JAIR 36, 267–306 (2009)MATH
19.
Zurück zum Zitat Kleinberg, R.D.: Nearly tight bounds for the continuum-armed bandit problem. NIPS, vol. 17, pp. 697–704 (2004) Kleinberg, R.D.: Nearly tight bounds for the continuum-armed bandit problem. NIPS, vol. 17, pp. 697–704 (2004)
21.
Zurück zum Zitat Li, K., Fialho, A., Kwong, S., Zhang, Q.: Adaptive operator selection with bandits for a multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 18(1), 114–130 (2014)CrossRef Li, K., Fialho, A., Kwong, S., Zhang, Q.: Adaptive operator selection with bandits for a multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 18(1), 114–130 (2014)CrossRef
22.
Zurück zum Zitat Lockhart, J.W., Weiss, G.M.: The benefits of personalized smartphone-based activity recognition models. In: SDM-2014, pp. 614–622 (2014) Lockhart, J.W., Weiss, G.M.: The benefits of personalized smartphone-based activity recognition models. In: SDM-2014, pp. 614–622 (2014)
23.
Zurück zum Zitat López-Ibánez, M., Dubois-Lacoste, J., Stützle, T., Birattari, M.: The irace package: iterated racing for automatic algorithm configuration (2011) López-Ibánez, M., Dubois-Lacoste, J., Stützle, T., Birattari, M.: The irace package: iterated racing for automatic algorithm configuration (2011)
25.
Zurück zum Zitat Mengshoel, O.J., Ahres, Y., Yu, T.: Markov chain analysis of noise and restart in stochastic local search. In: IJCAI, pp. 639–646 (2016) Mengshoel, O.J., Ahres, Y., Yu, T.: Markov chain analysis of noise and restart in stochastic local search. In: IJCAI, pp. 639–646 (2016)
26.
Zurück zum Zitat Mengshoel, O.J.: Understanding the role of noise in stochastic local search: analysis and experiments. Artif. Intell. 172(8), 955–990 (2008)MathSciNetCrossRefMATH Mengshoel, O.J.: Understanding the role of noise in stochastic local search: analysis and experiments. Artif. Intell. 172(8), 955–990 (2008)MathSciNetCrossRefMATH
27.
Zurück zum Zitat Mengshoel, O.J., Wilkins, D.C., Roth, D.: Initialization and restart in stochastic local search: computing a most probable explanation in Bayesian networks. IEEE Trans. Knowl. Data Eng. 23(2), 235–247 (2011)CrossRef Mengshoel, O.J., Wilkins, D.C., Roth, D.: Initialization and restart in stochastic local search: computing a most probable explanation in Bayesian networks. IEEE Trans. Knowl. Data Eng. 23(2), 235–247 (2011)CrossRef
28.
Zurück zum Zitat Nie, F., Huang, H., Cai, X., Ding, C.H.: Efficient and robust feature selection via joint l2, 1-norms minimization. In: NIPS, vol. 23, pp. 1813–1821 (2010) Nie, F., Huang, H., Cai, X., Ding, C.H.: Efficient and robust feature selection via joint l2, 1-norms minimization. In: NIPS, vol. 23, pp. 1813–1821 (2010)
31.
Zurück zum Zitat Robnik-Šikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53(1–2), 23–69 (2003)CrossRefMATH Robnik-Šikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53(1–2), 23–69 (2003)CrossRefMATH
32.
Zurück zum Zitat Ryvchin, V., Strichman, O.: Local restarts in SAT. Constraint Program. Lett. (CPL) 4, 3–13 (2008)MATH Ryvchin, V., Strichman, O.: Local restarts in SAT. Constraint Program. Lett. (CPL) 4, 3–13 (2008)MATH
33.
Zurück zum Zitat Selman, B., Levesque, H., Mitchell, D.: A new method for solving hard satisfiability problems. In: AAAI, pp. 440–446 (1992) Selman, B., Levesque, H., Mitchell, D.: A new method for solving hard satisfiability problems. In: AAAI, pp. 440–446 (1992)
34.
Zurück zum Zitat Shoaib, M., Scholten, H., Havinga, P.J.: Towards physical activity recognition using smartphone sensors. In: UIC/ATC-2013, pp. 80–87. IEEE (2013) Shoaib, M., Scholten, H., Havinga, P.J.: Towards physical activity recognition using smartphone sensors. In: UIC/ATC-2013, pp. 80–87. IEEE (2013)
35.
Zurück zum Zitat Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS, vol. 25, pp. 2951–2959 (2012) Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS, vol. 25, pp. 2951–2959 (2012)
36.
Zurück zum Zitat Srinivas, N., Krause, A., Seeger, M., Kakade, S.M.: Gaussian process optimization in the bandit setting: no regret and experimental design. In: ICML-2010, pp. 1015–1022 (2010) Srinivas, N., Krause, A., Seeger, M., Kakade, S.M.: Gaussian process optimization in the bandit setting: no regret and experimental design. In: ICML-2010, pp. 1015–1022 (2010)
37.
Zurück zum Zitat Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3–4), 285–294 (1933)CrossRefMATH Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3–4), 285–294 (1933)CrossRefMATH
38.
Zurück zum Zitat Yu, M.C., Yu, T., Wang, S.C., Lin, C.J., Chang, E.Y.: Big data small footprint: the design of a low-power classifier for detecting transportation modes. Proc. VLDB Endow. 7(13), 1429–1440 (2014)CrossRef Yu, M.C., Yu, T., Wang, S.C., Lin, C.J., Chang, E.Y.: Big data small footprint: the design of a low-power classifier for detecting transportation modes. Proc. VLDB Endow. 7(13), 1429–1440 (2014)CrossRef
39.
Zurück zum Zitat Yu, T., Zhuang, Y., Mengshoel, O.J., Yagan, O.: Hybridizing personal and impersonal machine learning models for activity recognition on mobile devices. In: MobiCASE-2016, pp. 117–126 (2016) Yu, T., Zhuang, Y., Mengshoel, O.J., Yagan, O.: Hybridizing personal and impersonal machine learning models for activity recognition on mobile devices. In: MobiCASE-2016, pp. 117–126 (2016)
Metadaten
Titel
Thompson Sampling for Optimizing Stochastic Local Search
verfasst von
Tong Yu
Branislav Kveton
Ole J. Mengshoel
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-71249-9_30