Skip to main content

2021 | OriginalPaper | Buchkapitel

An Empirical Analysis of Integrating Feature Extraction to Automated Machine Learning Pipeline

verfasst von : Hassan Eldeeb, Shota Amashukeli, Radwa El Shawi

Erschienen in: Pattern Recognition. ICPR International Workshops and Challenges

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Machine learning techniques and algorithms are employed in many application domains such as financial applications, recommendation systems, medical diagnosis systems, and self-driving cars. They play a crucial role in harnessing the power of Big Data being produced every day in our digital world. In general, building a well-performing machine learning pipeline is an iterative and complex process that requires a solid understanding of various techniques that can be used in each component of the machine learning pipeline. Feature engineering (FE) is one of the most time-consuming steps in building machine learning pipelines. It requires a deep understanding of the domain and data exploration to discover relevant hand-crafted features from raw data. In this work, we empirically evaluate the impact of integrating an automated feature extraction tool (AutoFeat) into two automated machine learning frameworks, namely, Auto-Sklearn and TPOT, on their predictive performance. Besides, we discuss the limitations of AutoFeat that need to be addressed in order to improve the predictive performance of the automated machine learning frameworks on real-world datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
2.
Zurück zum Zitat Fan, W., et al.: Generalized and heuristic-free feature construction for improved accuracy. In: Proceedings of the 2010 SIAM International Conference on Data Mining, pp. 629–640. SIAM (2010) Fan, W., et al.: Generalized and heuristic-free feature construction for improved accuracy. In: Proceedings of the 2010 SIAM International Conference on Data Mining, pp. 629–640. SIAM (2010)
3.
Zurück zum Zitat Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., Hutter, F.: Auto-sklearn 2.0: The next generation (2020). arXiv preprint arXiv:2007.04074 Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., Hutter, F.: Auto-sklearn 2.0: The next generation (2020). arXiv preprint arXiv:​2007.​04074
4.
Zurück zum Zitat Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., Hutter, F.: Efficient and robust automated machine learning. In: Advances in Neural Information Processing Systems, pp. 2962–2970 (2015) Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., Hutter, F.: Efficient and robust automated machine learning. In: Advances in Neural Information Processing Systems, pp. 2962–2970 (2015)
5.
Zurück zum Zitat Gaudel, R., Sebag, M.: Feature selection as a one-player game (2010) Gaudel, R., Sebag, M.: Feature selection as a one-player game (2010)
8.
Zurück zum Zitat Kanter, J.M., Veeramachaneni, K.: Deep feature synthesis: towards automating data science endeavors. In: 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015, Paris, France, 19–21 October 2015, pp. 1–10. IEEE (2015) Kanter, J.M., Veeramachaneni, K.: Deep feature synthesis: towards automating data science endeavors. In: 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015, Paris, France, 19–21 October 2015, pp. 1–10. IEEE (2015)
9.
Zurück zum Zitat Katz, G., Shin, E.C.R., Song, D.: Explorekit: automatic feature generation and selection. In: 2016 IEEE 16th International Conference on Data Mining (ICDM), pp. 979–984. IEEE (2016) Katz, G., Shin, E.C.R., Song, D.: Explorekit: automatic feature generation and selection. In: 2016 IEEE 16th International Conference on Data Mining (ICDM), pp. 979–984. IEEE (2016)
10.
Zurück zum Zitat Kaul, A., Maheshwary, S., Pudi, V.: Autolearn–automated feature generation and selection. In: 2017 IEEE International Conference on data mining (ICDM), pp. 217–226. IEEE (2017) Kaul, A., Maheshwary, S., Pudi, V.: Autolearn–automated feature generation and selection. In: 2017 IEEE International Conference on data mining (ICDM), pp. 217–226. IEEE (2017)
11.
Zurück zum Zitat Khurana, U., Samulowitz, H., Turaga, D.: Feature engineering for predictive modeling using reinforcement learning. arXiv preprint arXiv:1709.07150 (2017) Khurana, U., Samulowitz, H., Turaga, D.: Feature engineering for predictive modeling using reinforcement learning. arXiv preprint arXiv:​1709.​07150 (2017)
12.
Zurück zum Zitat LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)CrossRef LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)CrossRef
13.
Zurück zum Zitat Markovitch, S., Rosenstein, D.: Feature generation using general constructor functions. Mach. Learn. 49(1), 59–98 (2002)CrossRef Markovitch, S., Rosenstein, D.: Feature generation using general constructor functions. Mach. Learn. 49(1), 59–98 (2002)CrossRef
14.
Zurück zum Zitat Muller, K.R., Mika, S., Ratsch, G., Tsuda, K., Scholkopf, B.: An introduction to kernel-based learning algorithms. IEEE Trans. Neural Netw. 12(2), 181–201 (2001)CrossRef Muller, K.R., Mika, S., Ratsch, G., Tsuda, K., Scholkopf, B.: An introduction to kernel-based learning algorithms. IEEE Trans. Neural Netw. 12(2), 181–201 (2001)CrossRef
15.
Zurück zum Zitat Olson, R.S., Moore, J.H.: Tpot: a tree-based pipeline optimization tool for automating machine learning. In: Proceedings of the Workshop on Automatic Machine Learning (2016) Olson, R.S., Moore, J.H.: Tpot: a tree-based pipeline optimization tool for automating machine learning. In: Proceedings of the Workshop on Automatic Machine Learning (2016)
16.
Zurück zum Zitat Piramuthu, S., Sikora, R.T.: Iterative feature construction for improving inductive learning algorithms. Exp. Syst. Appl. 36(2), 3401–3406 (2009)CrossRef Piramuthu, S., Sikora, R.T.: Iterative feature construction for improving inductive learning algorithms. Exp. Syst. Appl. 36(2), 3401–3406 (2009)CrossRef
19.
Zurück zum Zitat Tonekaboni, S., Joshi, S., McCradden, M.D., Goldenberg, A.: What clinicians want: contextualizing explainable machine learning for clinical end use (2019). arXiv preprint arXiv:1905.05134 Tonekaboni, S., Joshi, S., McCradden, M.D., Goldenberg, A.: What clinicians want: contextualizing explainable machine learning for clinical end use (2019). arXiv preprint arXiv:​1905.​05134
20.
Zurück zum Zitat Wang, R., Fu, B., Fu, G., Wang, M.: Deep & cross network for ad click predictions. In: Proceedings of the ADKDD 2017, pp. 1–7 (2017) Wang, R., Fu, B., Fu, G., Wang, M.: Deep & cross network for ad click predictions. In: Proceedings of the ADKDD 2017, pp. 1–7 (2017)
21.
Zurück zum Zitat Zhang, J., Hao, J., Fogelman-Soulié, F., Wang, Z.: Automatic feature engineering by deep reinforcement learning. In: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, pp. 2312–2314 (2019) Zhang, J., Hao, J., Fogelman-Soulié, F., Wang, Z.: Automatic feature engineering by deep reinforcement learning. In: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, pp. 2312–2314 (2019)
22.
Zurück zum Zitat Zöller, M.A., Huber, M.F.: Benchmark and survey of automated machine learning frameworks (2019) Zöller, M.A., Huber, M.F.: Benchmark and survey of automated machine learning frameworks (2019)
Metadaten
Titel
An Empirical Analysis of Integrating Feature Extraction to Automated Machine Learning Pipeline
verfasst von
Hassan Eldeeb
Shota Amashukeli
Radwa El Shawi
Copyright-Jahr
2021
DOI
https://doi.org/10.1007/978-3-030-68799-1_24

Premium Partner