nach oben

Erschienen in:

2017 | OriginalPaper | Buchkapitel

Thompson Sampling for Optimizing Stochastic Local Search

verfasst von : Tong Yu, Branislav Kveton, Ole J. Mengshoel

Erschienen in: Machine Learning and Knowledge Discovery in Databases

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Stochastic local search (SLS), like many other stochastic optimization algorithms, has several parameters that need to be optimized in order for the algorithm to find high quality solutions within a short amount of time. In this paper, we formulate a stochastic local search bandit (\(\mathtt{SLSB}\)), which is a novel learning variant of SLS based on multi-armed bandits. \(\mathtt{SLSB}\) optimizes SLS over a sequence of stochastic optimization problems and achieves high average cumulative reward. In \(\mathtt{SLSB}\), we study how SLS can be optimized via low degree polynomials in its noise and restart parameters. To determine the coefficients of the polynomials, we present polynomial Thompson Sampling (\(\mathtt{PolyTS}\)). We derive a regret bound for \(\mathtt{PolyTS}\) and validate its performance on synthetic problems of varying difficulty as well as on feature selection problems. Compared to bandits with no assumptions of the reward function and other parameter optimization approaches, our \(\mathtt{PolyTS}\) assuming polynomial structure can provide substantially better parameter optimization for SLS. In addition, due to its simple model update, \(\mathtt{PolyTS}\) has low computational cost compared to other SLS parameter optimization methods.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Speeding up Hyper-parameter Optimization by Extrapolation of Learning Curves Using Previous Builds

Nächstes Kapitel Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation

As a motivation for our focus on maximizing cumulative reward, consider parameter tuning for feature selection on a sequence of HAR datasets. In this application, it is important to provide usable predictions even in the initial steps of tuning feature selection, else users may stop using the HAR system.

For example, in HAR, f(A) is the activity recognition accuracy of a particular machine learning model trained on a HAR dataset, when using SLS with parameters A for feature selection.

We did not try other \(N_{\text {budget}}\) for \(\mathtt{{GPEIMCMC}}\) and \(\mathtt{{GPUCB}}\), because their computation time is prohibitive.

Breast Cancer, Wine and Australian can be found at https://archive.ics.uci.edu/ml/datasets.html.

Feature descriptions for HAR data 1 are in [2]. The features for HAR data 2 include mean, variance and FFT values of sensors signals for each 1 s sliding window.

If \(b = 0\ldots 0\) and \(\kappa = 20\), we have \(N_{\text {feature}} \le 20\). The reason is that after \(\kappa = 20\) SLS operations, the bit-string found by SLS can differ from the initial b in at most 20 bit.

Agrawal, S., Goyal, N.: Thompson sampling for contextual bandits with linear payoffs. In: ICML-2013, pp. 127–135 (2013)

Anguita, D., Ghio, A., Oneto, L., Parra Perez, X., Reyes Ortiz, J.L.: A public domain dataset for human activity recognition using smartphones. In: ESANN-2013, pp. 437–442 (2013)

Ansótegui, C., Sellmann, M., Tierney, K.: A gender-based genetic algorithm for the automatic configuration of algorithms. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 142–157. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04244-7_14 CrossRef

Audemard, G., Simon, L.: Refining restarts strategies for SAT and UNSAT. In: Milano, M. (ed.) CP 2012. LNCS, pp. 118–126. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33558-7_11 CrossRef

Audibert, J.Y., Bubeck, S., Munos, R.: Best arm identification in multi-armed bandits. In: COLT 2010, pp. 41–53 (2010)

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47, 235–256 (2002)CrossRefMATH

Battiti, R., Campigotto, P.: An investigation of reinforcement learning for reactive search optimization. In: Hamadi, Y., Monfroy, E., Saubion, F. (eds.) Autonomous Search, pp. 131–160. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21434-9_6 CrossRef

Birattari, M., Stützle, T., Paquete, L., Varrentrapp, K.: A racing algorithm for configuring metaheuristics. In: GECCO-2002, pp. 11–18 (2002)

Cappé, O., Garivier, A., Maillard, O.A., Munos, R., Stoltz, G.: Kullback-Leibler upper confidence bounds for optimal sequential allocation. Ann. Stat. 41(3), 1516–1541 (2013)MathSciNetCrossRefMATH

10.

Chapelle, O., Li, L.: An empirical evaluation of Thompson sampling. In: NIPS, vol. 24, pp. 2249–2257 (2011)

11.

DaCosta, L., Fialho, A., Schoenauer, M., Sebag, M.: Adaptive operator selection with dynamic multi-armed bandits. In: GECCO-2008, pp. 913–920 (2008)

12.

Fialho, Á., Da Costa, L., Schoenauer, M., Sebag, M.: Analyzing bandit-based adaptive operator selection mechanisms. Ann. Math. Artif. Intell. 60(1–2), 25–64 (2010)MathSciNetCrossRefMATH

13.

Hoffman, M., Shahriari, B., Freitas, N.: On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. In: Artificial Intelligence and Statistics, pp. 365–374 (2014)

14.

Hoos, H.H.: An adaptive noise mechanism for WalkSAT. In: AAAI-2002, pp. 655–660 (2002)

15.

Hoos, H.H., Stützle, T.: Stochastic Local Search: Foundations and Applications. Morgan Kaufmann, San Francisco (2005)MATH

16.

Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization for general algorithm configuration. In: Coello, C.A.C. (ed.) LION 2011. LNCS, vol. 6683, pp. 507–523. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25566-3_40 CrossRef

17.

Hutter, F., Hoos, H.H., Leyton-Brown, K., Murphy, K.: Time-bounded sequential parameter optimization. In: Blum, C., Battiti, R. (eds.) LION 2010. LNCS, vol. 6073, pp. 281–298. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13800-3_30 CrossRef

18.

Hutter, F., Hoos, H.H., Leyton-Brown, K., Stützle, T.: ParamILS: an automatic algorithm configuration framework. JAIR 36, 267–306 (2009)MATH

19.

Kleinberg, R.D.: Nearly tight bounds for the continuum-armed bandit problem. NIPS, vol. 17, pp. 697–704 (2004)

20.

Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1), 4–22 (1985)MathSciNetCrossRefMATH

21.

Li, K., Fialho, A., Kwong, S., Zhang, Q.: Adaptive operator selection with bandits for a multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 18(1), 114–130 (2014)CrossRef

22.

Lockhart, J.W., Weiss, G.M.: The benefits of personalized smartphone-based activity recognition models. In: SDM-2014, pp. 614–622 (2014)

23.

López-Ibánez, M., Dubois-Lacoste, J., Stützle, T., Birattari, M.: The irace package: iterated racing for automatic algorithm configuration (2011)

24.

Maillard, O.-A., Munos, R.: Online learning in adversarial Lipschitz environments. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010. LNCS (LNAI), vol. 6322, pp. 305–320. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15883-4_20 CrossRef

25.

Mengshoel, O.J., Ahres, Y., Yu, T.: Markov chain analysis of noise and restart in stochastic local search. In: IJCAI, pp. 639–646 (2016)

26.

Mengshoel, O.J.: Understanding the role of noise in stochastic local search: analysis and experiments. Artif. Intell. 172(8), 955–990 (2008)MathSciNetCrossRefMATH

27.

Mengshoel, O.J., Wilkins, D.C., Roth, D.: Initialization and restart in stochastic local search: computing a most probable explanation in Bayesian networks. IEEE Trans. Knowl. Data Eng. 23(2), 235–247 (2011)CrossRef

28.

Nie, F., Huang, H., Cai, X., Ding, C.H.: Efficient and robust feature selection via joint l2, 1-norms minimization. In: NIPS, vol. 23, pp. 1813–1821 (2010)

29.

Pal, D.K., Mengshoel, O.J.: Stochastic CoSaMP: randomizing greedy pursuit for sparse signal recovery. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds.) ECML PKDD 2016. LNCS (LNAI), vol. 9851, pp. 761–776. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46128-1_48 CrossRef

30.

Prestwich, S.: Tuning local search by average-reward reinforcement learning. In: Maniezzo, V., Battiti, R., Watson, J.-P. (eds.) LION 2007. LNCS, vol. 5313, pp. 192–205. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-92695-5_15 CrossRef

31.

Robnik-Šikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53(1–2), 23–69 (2003)CrossRefMATH

32.

Ryvchin, V., Strichman, O.: Local restarts in SAT. Constraint Program. Lett. (CPL) 4, 3–13 (2008)MATH

33.

Selman, B., Levesque, H., Mitchell, D.: A new method for solving hard satisfiability problems. In: AAAI, pp. 440–446 (1992)

34.

Shoaib, M., Scholten, H., Havinga, P.J.: Towards physical activity recognition using smartphone sensors. In: UIC/ATC-2013, pp. 80–87. IEEE (2013)

35.

Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS, vol. 25, pp. 2951–2959 (2012)

36.

Srinivas, N., Krause, A., Seeger, M., Kakade, S.M.: Gaussian process optimization in the bandit setting: no regret and experimental design. In: ICML-2010, pp. 1015–1022 (2010)

37.

Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3–4), 285–294 (1933)CrossRefMATH

38.

Yu, M.C., Yu, T., Wang, S.C., Lin, C.J., Chang, E.Y.: Big data small footprint: the design of a low-power classifier for detecting transportation modes. Proc. VLDB Endow. 7(13), 1429–1440 (2014)CrossRef

39.

Yu, T., Zhuang, Y., Mengshoel, O.J., Yagan, O.: Hybridizing personal and impersonal machine learning models for activity recognition on mobile devices. In: MobiCASE-2016, pp. 117–126 (2016)

Titel: Thompson Sampling for Optimizing Stochastic Local Search
verfasst von: Tong Yu
Branislav Kveton
Ole J. Mengshoel
Verlag: Springer International Publishing
Buch: Machine Learning and Knowledge Discovery in Databases
Print ISBN: 978-3-319-71248-2

Electronic ISBN: 978-3-319-71249-9

Copyright-Jahr: 2017
DOI: https://doi.org/10.1007/978-3-319-71249-9_30

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"