2024 | OriginalPaper | Book chapter

Turning Logs into Lumber: Preprocessing Tasks in Process Mining

Written by: Ying Liu, Vinicius Stein Dani, Iris Beerepoot, Xixi Lu

Published in: Process Mining Workshops

Publisher: Springer Nature Switzerland

Abstract

Event logs are invaluable for conducting process mining projects, offering insights into process improvement and data-driven decision-making. However, data quality issues affect the correctness and trustworthiness of these insights, making preprocessing tasks a necessity. Despite the recognized importance, the execution of preprocessing tasks remains ad-hoc, lacking support. This paper presents a systematic literature review that establishes a comprehensive repository of preprocessing tasks and their usage in case studies. We identify six high-level and 20 low-level preprocessing tasks in case studies. Log filtering, transformation, and abstraction are commonly used, while log enriching, integration, and reduction are less frequent. These results can be considered a first step in contributing to more structured, transparent event log preprocessing, enhancing process mining reliability.

Metadata
Title
Turning Logs into Lumber: Preprocessing Tasks in Process Mining
Written by
Ying Liu
Vinicius Stein Dani
Iris Beerepoot
Xixi Lu
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-56107-8_8
