Skip to main content
Top

2023 | OriginalPaper | Chapter

Improved Regret Bounds for Online Kernel Selection Under Bandit Feedback

Authors : Junfan Li, Shizhong Liao

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer Nature Switzerland

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

In this paper, we improve the regret bound for online kernel selection under bandit feedback. Previous algorithm enjoys a \(O((\Vert f\Vert ^2_{\mathcal {H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})\) expected bound for Lipschitz loss functions. We prove two types of regret bounds improving the previous bound. For smooth loss functions, we propose an algorithm with a \(O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum ^K_{i=1}L_T(f^*_i))^{\frac{2}{3}})\) expected bound where \(L_T(f^*_i)\) is the cumulative losses of optimal hypothesis in \(\mathbb {H}_{i} =\{f\in \mathcal {H}_i:\Vert f\Vert _{\mathcal {H}_i}\le U\}\). The data-dependent bound keeps the previous worst-case bound and is smaller if most of candidate kernels match well with the data. For Lipschitz loss functions, we propose an algorithm with a \(O(U\sqrt{KT}\ln ^{\frac{2}{3}}{T})\) expected bound asymptotically improving the previous bound. We apply the two algorithms to online kernel selection with time constraint and prove new regret bounds matching or improving the previous \(O(\sqrt{T\ln {K}} +\Vert f\Vert ^2_{\mathcal {H}_i}\max \{\sqrt{T},\frac{T}{\sqrt{\mathcal {R}}}\})\) expected bound where \(\mathcal {R}\) is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
1
\(\textrm{poly}(\Vert f\Vert _{\mathcal {H}_i})=\Vert f\Vert ^2_{\mathcal {H}_i}+1\). The original paper shows a \(O((\Vert f\Vert ^2_{\mathcal {H}_i}+1)\sqrt{KT})\) expected regret bound. We will clarify the difference in Sect. 2.
 
Literature
1.
go back to reference Agarwal, A., Luo, H., Neyshabur, B., Schapire, R.E.: Corralling a band of bandit algorithms. In: Proceedings of the 30th Conference on Learning Theory, pp. 12–38 (2017) Agarwal, A., Luo, H., Neyshabur, B., Schapire, R.E.: Corralling a band of bandit algorithms. In: Proceedings of the 30th Conference on Learning Theory, pp. 12–38 (2017)
2.
go back to reference Audibert, J., Bubeck, S.: Minimax policies for adversarial and stochastic bandits. In: Proceedings of the 22nd Annual Conference on Learning Theory, pp. 217–226 (2009) Audibert, J., Bubeck, S.: Minimax policies for adversarial and stochastic bandits. In: Proceedings of the 22nd Annual Conference on Learning Theory, pp. 217–226 (2009)
3.
go back to reference Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The non-stochastic multi-armed bandit problem. SIAM J. Comput. 32(1), 48–77 (2002)MathSciNetCrossRefMATH Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The non-stochastic multi-armed bandit problem. SIAM J. Comput. 32(1), 48–77 (2002)MathSciNetCrossRefMATH
4.
go back to reference Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)CrossRefMATH Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)CrossRefMATH
5.
go back to reference Foster, D.J., Kale, S., Mohri, M., Sridharan, K.: Parameter-free online learning via model selection. Adv. Neural. Inf. Process. Syst. 30, 6022–6032 (2017) Foster, D.J., Kale, S., Mohri, M., Sridharan, K.: Parameter-free online learning via model selection. Adv. Neural. Inf. Process. Syst. 30, 6022–6032 (2017)
6.
go back to reference Ghari, P.M., Shen, Y.: Online multi-kernel learning with graph-structured feedback. In: Proceedings of the 37th International Conference on Machine Learning, pp. 3474–3483 (2020) Ghari, P.M., Shen, Y.: Online multi-kernel learning with graph-structured feedback. In: Proceedings of the 37th International Conference on Machine Learning, pp. 3474–3483 (2020)
8.
go back to reference Le, Q.V., Sarlós, T., Smola, A.J.: Fastfood-computing hilbert space expansions in loglinear time. In: Proceedings of the 30th International Conference on Machine Learning, pp. 244–252 (2013) Le, Q.V., Sarlós, T., Smola, A.J.: Fastfood-computing hilbert space expansions in loglinear time. In: Proceedings of the 30th International Conference on Machine Learning, pp. 244–252 (2013)
9.
go back to reference Li, J., Liao, S.: Worst-case regret analysis of computationally budgeted online kernel selection. Mach. Learn. 111(3), 937–976 (2022)MathSciNetCrossRefMATH Li, J., Liao, S.: Worst-case regret analysis of computationally budgeted online kernel selection. Mach. Learn. 111(3), 937–976 (2022)MathSciNetCrossRefMATH
10.
go back to reference Li, Z., Ton, J.F., Oglic, D., Sejdinovic, D.: Towards a unified analysis of random Fourier features. In: Proceedings of the 36th International Conference on Machine Learning, pp. 3905–3914 (2019) Li, Z., Ton, J.F., Oglic, D., Sejdinovic, D.: Towards a unified analysis of random Fourier features. In: Proceedings of the 36th International Conference on Machine Learning, pp. 3905–3914 (2019)
11.
go back to reference Liao, S., Li, J.: High-probability kernel alignment regret bounds for online kernel selection. In: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 67–83 (2021) Liao, S., Li, J.: High-probability kernel alignment regret bounds for online kernel selection. In: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 67–83 (2021)
12.
go back to reference Rahimi, A., Recht, B.: Random features for large-scale kernel machines. Adv. Neural. Inf. Process. Syst. 20, 1177–1184 (2007) Rahimi, A., Recht, B.: Random features for large-scale kernel machines. Adv. Neural. Inf. Process. Syst. 20, 1177–1184 (2007)
13.
go back to reference Rahimi, A., Recht, B.: Weighted sums of random kitchen sinks: replacing minimization with randomization in learning. Adv. Neural. Inf. Process. Syst. 21, 1313–1320 (2008) Rahimi, A., Recht, B.: Weighted sums of random kitchen sinks: replacing minimization with randomization in learning. Adv. Neural. Inf. Process. Syst. 21, 1313–1320 (2008)
14.
go back to reference Sahoo, D., Hoi, S.C.H., Li, B.: Online multiple kernel regression. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD, pp. 293–302 (2014) Sahoo, D., Hoi, S.C.H., Li, B.: Online multiple kernel regression. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD, pp. 293–302 (2014)
15.
go back to reference Seldin, Y., Bartlett, P.L., Crammer, K., Abbasi-Yadkori, Y.: Prediction with limited advice and multiarmed bandits with paid observations. In: Proceedings of the 31st International Conference on Machine Learning, pp. 280–287 (2014) Seldin, Y., Bartlett, P.L., Crammer, K., Abbasi-Yadkori, Y.: Prediction with limited advice and multiarmed bandits with paid observations. In: Proceedings of the 31st International Conference on Machine Learning, pp. 280–287 (2014)
16.
go back to reference Shen, Y., Chen, T., Giannakis, G.B.: Random feature-based online multi-kernel learning in environments with unknown dynamics. J. Mach. Learn. Res. 20(22), 1–36 (2019)MathSciNetMATH Shen, Y., Chen, T., Giannakis, G.B.: Random feature-based online multi-kernel learning in environments with unknown dynamics. J. Mach. Learn. Res. 20(22), 1–36 (2019)MathSciNetMATH
18.
go back to reference Wang, J., Hoi, S.C.H., Zhao, P., Zhuang, J., Liu, Z.: Large scale online kernel classification. In: Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 1750–1756 (2013) Wang, J., Hoi, S.C.H., Zhao, P., Zhuang, J., Liu, Z.: Large scale online kernel classification. In: Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 1750–1756 (2013)
19.
go back to reference Yang, T., Mahdavi, M., Jin, R., Yi, J., Hoi, S.C.H.: Online kernel selection: algorithms and evaluations. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, pp. 1197–1202 (2012) Yang, T., Mahdavi, M., Jin, R., Yi, J., Hoi, S.C.H.: Online kernel selection: algorithms and evaluations. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, pp. 1197–1202 (2012)
20.
go back to reference Yu, F.X., Suresh, A.T., Choromanski, K.M., Holtmann-Rice, D.N., Kumar, S.: Orthogonal random features. Adv. Neural. Inf. Process. Syst. 29, 1975–1983 (2016) Yu, F.X., Suresh, A.T., Choromanski, K.M., Holtmann-Rice, D.N., Kumar, S.: Orthogonal random features. Adv. Neural. Inf. Process. Syst. 29, 1975–1983 (2016)
21.
go back to reference Zhang, L., Yi, J., Jin, R., Lin, M., He, X.: Online kernel learning with a near optimal sparsity bound. In: Proceedings of the 30th International Conference on Machine Learning, pp. 621–629 (2013) Zhang, L., Yi, J., Jin, R., Lin, M., He, X.: Online kernel learning with a near optimal sparsity bound. In: Proceedings of the 30th International Conference on Machine Learning, pp. 621–629 (2013)
22.
go back to reference Zhang, X., Liao, S.: Online kernel selection via incremental sketched kernel alignment. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, pp. 3118–3124 (2018) Zhang, X., Liao, S.: Online kernel selection via incremental sketched kernel alignment. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, pp. 3118–3124 (2018)
23.
go back to reference Zimmert, J., Seldin, Y.: An optimal algorithm for stochastic and adversarial bandits. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, pp. 467–475 (2019) Zimmert, J., Seldin, Y.: An optimal algorithm for stochastic and adversarial bandits. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, pp. 467–475 (2019)
Metadata
Title
Improved Regret Bounds for Online Kernel Selection Under Bandit Feedback
Authors
Junfan Li
Shizhong Liao
Copyright Year
2023
DOI
https://doi.org/10.1007/978-3-031-26412-2_21

Premium Partner