Skip to main content
Top

2016 | OriginalPaper | Chapter

Towards Automatic Composition of Multicomponent Predictive Systems

Authors : Manuel Martin Salvador, Marcin Budka, Bogdan Gabrys

Published in: Hybrid Artificial Intelligent Systems

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Automatic composition and parametrisation of multicomponent predictive systems (MCPSs) consisting of chains of data transformation steps is a challenging task. In this paper we propose and describe an extension to the Auto-WEKA software which now allows to compose and optimise such flexible MCPSs by using a sequence of WEKA methods. In the experimental analysis we focus on examining the impact of significantly extending the search space by incorporating additional hyperparameters of the models, on the quality of the found solutions. In a range of extensive experiments three different optimisation strategies are used to automatically compose MCPSs on 21 publicly available datasets. A comparison with previous work indicates that extending the search space improves the classification accuracy in the majority of the cases. The diversity of the found MCPSs are also an indication that fully and automatically exploiting different combinations of data cleaning and preprocessing techniques is possible and highly beneficial for different predictive models. This can have a big impact on high quality predictive models development, maintenance and scalability aspects needed in modern application and deployment scenarios.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Pyle, D.: Data Preparation for Data Mining. Morgan Kaufmann, San Francisco (1999) Pyle, D.: Data Preparation for Data Mining. Morgan Kaufmann, San Francisco (1999)
2.
go back to reference Linoff, G.S., Berry, M.J.A.: Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management. Wiley (2011). ISBN: 978-0-470-65093-6 Linoff, G.S., Berry, M.J.A.: Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management. Wiley (2011). ISBN: 978-0-470-65093-6
3.
go back to reference Teichmann, E., Demir, E., Chaussalet, T.: Data preparation for clinical data mining to identify patients at risk of readmission. In: IEEE 23rd International Symposium on Computer-Based Medical Systems, pp. 184–189 (2010) Teichmann, E., Demir, E., Chaussalet, T.: Data preparation for clinical data mining to identify patients at risk of readmission. In: IEEE 23rd International Symposium on Computer-Based Medical Systems, pp. 184–189 (2010)
4.
go back to reference Zhao, J., Wang, T.: A general framework for medical data mining. In: Future Information Technology and Management Engineering, pp. 163–165 (2010) Zhao, J., Wang, T.: A general framework for medical data mining. In: Future Information Technology and Management Engineering, pp. 163–165 (2010)
5.
go back to reference Messaoud, I., El Abed, H., Märgner, V., Amiri, H.: A design of a preprocessing framework for large database of historical documents. In: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, pp. 177–183 (2011) Messaoud, I., El Abed, H., Märgner, V., Amiri, H.: A design of a preprocessing framework for large database of historical documents. In: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, pp. 177–183 (2011)
6.
go back to reference Budka, M., Eastwood, M., Gabrys, B., Kadlec, P., Martin Salvador, M., Schwan, S., Tsakonas, A., Žliobaitė, I.: From sensor readings to predictions: on the process of developing practical soft sensors. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds.) IDA 2014. LNCS, vol. 8819, pp. 49–60. Springer, Heidelberg (2014) Budka, M., Eastwood, M., Gabrys, B., Kadlec, P., Martin Salvador, M., Schwan, S., Tsakonas, A., Žliobaitė, I.: From sensor readings to predictions: on the process of developing practical soft sensors. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds.) IDA 2014. LNCS, vol. 8819, pp. 49–60. Springer, Heidelberg (2014)
7.
go back to reference Leite, R., Brazdil, P., Vanschoren, J.: Selecting classification algorithms with active testing. In: Perner, P. (ed.) MLDM 2012. LNCS, vol. 7376, pp. 117–131. Springer, Heidelberg (2012)CrossRef Leite, R., Brazdil, P., Vanschoren, J.: Selecting classification algorithms with active testing. In: Perner, P. (ed.) MLDM 2012. LNCS, vol. 7376, pp. 117–131. Springer, Heidelberg (2012)CrossRef
8.
go back to reference Lemke, C., Gabrys, B.: Meta-learning for time series forecasting and forecast combination. Neurocomputing 73(10–12), 2006–2016 (2010)CrossRef Lemke, C., Gabrys, B.: Meta-learning for time series forecasting and forecast combination. Neurocomputing 73(10–12), 2006–2016 (2010)CrossRef
9.
go back to reference MacQuarrie, A., Tsai, C.L.: Regression and Time Series Model Selection. World Scientific (1998). ISBN: 978-981-02-3242-9 MacQuarrie, A., Tsai, C.L.: Regression and Time Series Model Selection. World Scientific (1998). ISBN: 978-981-02-3242-9
10.
go back to reference Bengio, Y.: Gradient-based optimization of hyperparameters. Neural Comput. 12(8), 1889–1900 (2000)CrossRef Bengio, Y.: Gradient-based optimization of hyperparameters. Neural Comput. 12(8), 1889–1900 (2000)CrossRef
11.
go back to reference Guo, X.C., Yang, J.H., Wu, C.G., Wang, C.Y., Liang, Y.C.: A novel LS-SVMs hyper-parameter selection based on particle swarm optimization. Neurocomputing 71, 3211–3215 (2008)CrossRef Guo, X.C., Yang, J.H., Wu, C.G., Wang, C.Y., Liang, Y.C.: A novel LS-SVMs hyper-parameter selection based on particle swarm optimization. Neurocomputing 71, 3211–3215 (2008)CrossRef
12.
go back to reference Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)MathSciNetMATH Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)MathSciNetMATH
13.
go back to reference Thornton, C., Hutter, F., Hoos, H.H., Leyton-Brown, K.: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of the 19th ACM SIGKDD, pp. 847–855 (2013) Thornton, C., Hutter, F., Hoos, H.H., Leyton-Brown, K.: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of the 19th ACM SIGKDD, pp. 847–855 (2013)
14.
go back to reference Brochu, E., Cora, V.M., de Freitas, N.: A Tutorial on Bayesian Optimization of Expensive Cost Functions with Application to Active User Modeling and Hierarchical Reinforcement Learning. Technical report, University of British Columbia, Department of Computer Science (2010) Brochu, E., Cora, V.M., de Freitas, N.: A Tutorial on Bayesian Optimization of Expensive Cost Functions with Application to Active User Modeling and Hierarchical Reinforcement Learning. Technical report, University of British Columbia, Department of Computer Science (2010)
15.
go back to reference Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization for general algorithm configuration. In: Coello, C.A.C. (ed.) LION 2011. LNCS, vol. 6683, pp. 507–523. Springer, Heidelberg (2011)CrossRef Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization for general algorithm configuration. In: Coello, C.A.C. (ed.) LION 2011. LNCS, vol. 6683, pp. 507–523. Springer, Heidelberg (2011)CrossRef
16.
go back to reference Bergstra, J., Bardenet, R., Bengio, Y., Kegl, B.: Algorithms for hyper-parameter optimization. In: Advances in NIPS, vol. 24, pp. 1–9 (2011) Bergstra, J., Bardenet, R., Bengio, Y., Kegl, B.: Algorithms for hyper-parameter optimization. In: Advances in NIPS, vol. 24, pp. 1–9 (2011)
17.
go back to reference Snoek, J., Larochelle, H., Adams, R.P.: Practical bayesian optimization of machine learning algorithms. In: Advances in NIPS, vol. 25, pp. 2960–2968 (2012) Snoek, J., Larochelle, H., Adams, R.P.: Practical bayesian optimization of machine learning algorithms. In: Advances in NIPS, vol. 25, pp. 2960–2968 (2012)
18.
go back to reference Eggensperger, K., Feurer, M., Hutter, F.: Towards an empirical foundation for assessing bayesian optimization of hyperparameters. In: NIPS Workshop on Bayesian Optimization in Theory and Practice, pp. 1–5 (2013) Eggensperger, K., Feurer, M., Hutter, F.: Towards an empirical foundation for assessing bayesian optimization of hyperparameters. In: NIPS Workshop on Bayesian Optimization in Theory and Practice, pp. 1–5 (2013)
19.
go back to reference Feurer, M., Klein, A., Eggensperger, K., Springenberg, J.T., Blum, M., Hutter, F.: Methods for improving bayesian optimization for AutoML. In: ICML (2015) Feurer, M., Klein, A., Eggensperger, K., Springenberg, J.T., Blum, M., Hutter, F.: Methods for improving bayesian optimization for AutoML. In: ICML (2015)
20.
go back to reference Serban, F., Vanschoren, J., Kietz, J.U., Bernstein, A.: A survey of intelligent assistants for data analysis. ACM Comput. Surv. 45(3), 1–35 (2013)CrossRef Serban, F., Vanschoren, J., Kietz, J.U., Bernstein, A.: A survey of intelligent assistants for data analysis. ACM Comput. Surv. 45(3), 1–35 (2013)CrossRef
21.
go back to reference Feurer, M., Springenberg, J.T., Hutter, F.: Using meta-learning to initialize bayesian optimization of hyperparameters. In: Proceedings of the Meta-Learning and Algorithm Selection Workshop at ECAI, pp. 3–10 (2014) Feurer, M., Springenberg, J.T., Hutter, F.: Using meta-learning to initialize bayesian optimization of hyperparameters. In: Proceedings of the Meta-Learning and Algorithm Selection Workshop at ECAI, pp. 3–10 (2014)
22.
go back to reference Swersky, K., Snoek, J., Adams, R.P.: Multi-task bayesian optimization. In: Advances in NIPS, vol. 26, pp. 2004–2012 (2013) Swersky, K., Snoek, J., Adams, R.P.: Multi-task bayesian optimization. In: Advances in NIPS, vol. 26, pp. 2004–2012 (2013)
23.
go back to reference Eggensperger, K., Hutter, F., Hoos, H.H., Leyton-brown, K.: Efficient benchmarking of hyperparameter optimizers via surrogates background: hyperparameter optimization. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1114–1120 (2012) Eggensperger, K., Hutter, F., Hoos, H.H., Leyton-brown, K.: Efficient benchmarking of hyperparameter optimizers via surrogates background: hyperparameter optimization. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1114–1120 (2012)
24.
go back to reference Al-Jubouri, B., Gabrys, B.: Multicriteria approaches for predictive model generation: a comparative experimental study. In: IEEE Symposium on Computational Intelligence in Multi-Criteria Decision-Making, pp. 64–71 (2014) Al-Jubouri, B., Gabrys, B.: Multicriteria approaches for predictive model generation: a comparative experimental study. In: IEEE Symposium on Computational Intelligence in Multi-Criteria Decision-Making, pp. 64–71 (2014)
25.
go back to reference Budka, M., Gabrys, B.: Density-preserving sampling: robust and efficient alternative to cross-validation for error estimation. IEEE Trans. Neural Netw. Learn. Syst. 24(1), 22–34 (2013)CrossRef Budka, M., Gabrys, B.: Density-preserving sampling: robust and efficient alternative to cross-validation for error estimation. IEEE Trans. Neural Netw. Learn. Syst. 24(1), 22–34 (2013)CrossRef
Metadata
Title
Towards Automatic Composition of Multicomponent Predictive Systems
Authors
Manuel Martin Salvador
Marcin Budka
Bogdan Gabrys
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-32034-2_3

Premium Partner