nach oben

Erschienen in:

2021 | OriginalPaper | Buchkapitel

An Empirical Analysis of Integrating Feature Extraction to Automated Machine Learning Pipeline

verfasst von : Hassan Eldeeb, Shota Amashukeli, Radwa El Shawi

Erschienen in: Pattern Recognition. ICPR International Workshops and Challenges

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Machine learning techniques and algorithms are employed in many application domains such as financial applications, recommendation systems, medical diagnosis systems, and self-driving cars. They play a crucial role in harnessing the power of Big Data being produced every day in our digital world. In general, building a well-performing machine learning pipeline is an iterative and complex process that requires a solid understanding of various techniques that can be used in each component of the machine learning pipeline. Feature engineering (FE) is one of the most time-consuming steps in building machine learning pipelines. It requires a deep understanding of the domain and data exploration to discover relevant hand-crafted features from raw data. In this work, we empirically evaluate the impact of integrating an automated feature extraction tool (AutoFeat) into two automated machine learning frameworks, namely, Auto-Sklearn and TPOT, on their predictive performance. Besides, we discuss the limitations of AutoFeat that need to be addressed in order to improve the predictive performance of the automated machine learning frameworks on real-world datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel A Neural Network Model for Lead Optimization of MMP12 Inhibitors

Nächstes Kapitel Input-Aware Neural Knowledge Tracing Machine

https://medium.com/kaggle-blog/grupo-bimbo-inventory-demand-winners-interview-clustifier-alex-andrey-1e3b6cec8a20.

https://github.com/DataSystemsGroupUT/auto_feature_engineering.

https://scikit-learn.org/.

https://www.openml.org/d/1107.

Bengio, Y.: Deep learning of representations: looking forward. In: Dediu, A.-H., Martín-Vide, C., Mitkov, R., Truthe, B. (eds.) SLSP 2013. LNCS (LNAI), vol. 7978, pp. 1–37. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39593-2_1CrossRef

Fan, W., et al.: Generalized and heuristic-free feature construction for improved accuracy. In: Proceedings of the 2010 SIAM International Conference on Data Mining, pp. 629–640. SIAM (2010)

Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., Hutter, F.: Auto-sklearn 2.0: The next generation (2020). arXiv preprint arXiv:2007.04074

Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., Hutter, F.: Efficient and robust automated machine learning. In: Advances in Neural Information Processing Systems, pp. 2962–2970 (2015)

Gaudel, R., Sebag, M.: Feature selection as a one-player game (2010)

He, X., Zhao, K., Chu, X.: Automl: A survey of the state-of-the-art. arXiv preprint arXiv:1908.00709 (2019)

Horn, F., Pack, R., Rieger, M.: The autofeat python library for automated feature engineering and selection. In: Cellier, P., Driessens, K. (eds.) ECML PKDD 2019. CCIS, vol. 1167, pp. 111–120. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-43823-4_10CrossRef

Kanter, J.M., Veeramachaneni, K.: Deep feature synthesis: towards automating data science endeavors. In: 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015, Paris, France, 19–21 October 2015, pp. 1–10. IEEE (2015)

Katz, G., Shin, E.C.R., Song, D.: Explorekit: automatic feature generation and selection. In: 2016 IEEE 16th International Conference on Data Mining (ICDM), pp. 979–984. IEEE (2016)

10.

Kaul, A., Maheshwary, S., Pudi, V.: Autolearn–automated feature generation and selection. In: 2017 IEEE International Conference on data mining (ICDM), pp. 217–226. IEEE (2017)

11.

Khurana, U., Samulowitz, H., Turaga, D.: Feature engineering for predictive modeling using reinforcement learning. arXiv preprint arXiv:1709.07150 (2017)

12.

LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)CrossRef

13.

Markovitch, S., Rosenstein, D.: Feature generation using general constructor functions. Mach. Learn. 49(1), 59–98 (2002)CrossRef

14.

Muller, K.R., Mika, S., Ratsch, G., Tsuda, K., Scholkopf, B.: An introduction to kernel-based learning algorithms. IEEE Trans. Neural Netw. 12(2), 181–201 (2001)CrossRef

15.

Olson, R.S., Moore, J.H.: Tpot: a tree-based pipeline optimization tool for automating machine learning. In: Proceedings of the Workshop on Automatic Machine Learning (2016)

16.

Piramuthu, S., Sikora, R.T.: Iterative feature construction for improving inductive learning algorithms. Exp. Syst. Appl. 36(2), 3401–3406 (2009)CrossRef

17.

Shawi, R.E., Maher, M., Sakr, S.: Automated machine learning: state-of-the-art and open challenges. CoRR abs/1906.02287 (2019). http://arxiv.org/abs/1906.02287

18.

Thornton, C., Hutter, F., Hoos, H., Leyton-Brown, K.: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: KDD (2012). https://doi.org/10.1145/2487575.2487629

19.

Tonekaboni, S., Joshi, S., McCradden, M.D., Goldenberg, A.: What clinicians want: contextualizing explainable machine learning for clinical end use (2019). arXiv preprint arXiv:1905.05134

20.

Wang, R., Fu, B., Fu, G., Wang, M.: Deep & cross network for ad click predictions. In: Proceedings of the ADKDD 2017, pp. 1–7 (2017)

21.

Zhang, J., Hao, J., Fogelman-Soulié, F., Wang, Z.: Automatic feature engineering by deep reinforcement learning. In: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, pp. 2312–2314 (2019)

22.

Zöller, M.A., Huber, M.F.: Benchmark and survey of automated machine learning frameworks (2019)

Titel: An Empirical Analysis of Integrating Feature Extraction to Automated Machine Learning Pipeline
verfasst von: Hassan Eldeeb
Shota Amashukeli
Radwa El Shawi
Verlag: Springer International Publishing
Buch: Pattern Recognition. ICPR International Workshops and Challenges
Print ISBN: 978-3-030-68798-4

Electronic ISBN: 978-3-030-68799-1

Copyright-Jahr: 2021
DOI: https://doi.org/10.1007/978-3-030-68799-1_24

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Premium Partner