Published in: Data Mining and Knowledge Discovery 1/2006

1 July 2006

A Rule-Based Approach for Process Discovery: Dealing with Noise and Imbalance in Process Logs

Written by: Laura Măruşter, A. J. M. M. (TON) Weijters, Wil M. P. Van Der Aalst, Antal Van Den Bosch

Abstract

Effective information systems require explicit process models. Enacting a given business process requires a completely specified process design, and developing one is time-consuming and often subjective and incomplete. We propose a method that constructs the process model from process log data by determining the relations between process tasks. To predict these relations, we employ a machine learning technique to induce rule sets. These rule sets are induced from simulated process log data generated by varying process characteristics such as noise and log size. Tests reveal that the induced rule sets have a high predictive accuracy on new data. We also discuss how noise and imbalance of execution priorities affect the discovery of the relations between process tasks. Once the causal, exclusive, and parallel relations are known, a process model expressed in the Petri net formalism can be built. We illustrate our approach with real-world data in a case study.
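
To make the idea of "determining the relations between process tasks" from a log concrete, the following is a minimal sketch. It is not the rule-induction method of the paper: it swaps in a plain frequency threshold on direct-succession counts, in the spirit of α-algorithm-style ordering relations. The function names, the `min_count` noise threshold, and the toy log are all assumptions made for illustration. Pairs that are never adjacent are labelled `never_adjacent`; truly exclusive tasks fall into this class, but so do tasks that are merely far apart in the process.

```python
from collections import Counter
from itertools import combinations


def direct_succession_counts(log):
    """Count how often task a is directly followed by task b over all traces."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def classify_relations(log, min_count=1):
    """Label task pairs from direct-succession counts (illustrative heuristic).

    min_count is a hypothetical noise threshold: a succession seen fewer
    than min_count times is treated as if it had not been observed.
    """
    counts = direct_succession_counts(log)
    tasks = sorted({task for trace in log for task in trace})
    relations = {}
    for a, b in combinations(tasks, 2):
        a_then_b = counts[(a, b)] >= min_count
        b_then_a = counts[(b, a)] >= min_count
        if a_then_b and b_then_a:
            relations[(a, b)] = "parallel"        # seen in both orders
        elif a_then_b:
            relations[(a, b)] = "causal"          # a -> b
        elif b_then_a:
            relations[(b, a)] = "causal"          # b -> a
        else:
            relations[(a, b)] = "never_adjacent"  # includes exclusive pairs
    return relations


# Toy log: B and C run in parallel, E is an alternative branch,
# and the single trace "BACD" plays the role of noise.
toy_log = ["ABCD"] * 5 + ["ACBD"] * 4 + ["AED"] * 3 + ["BACD"]

strict = classify_relations(toy_log, min_count=2)
lenient = classify_relations(toy_log, min_count=1)
```

With `min_count=2` the single noisy trace is ignored and A is reported as a cause of B; with `min_count=1` the same pair is reported as parallel. This sensitivity to noise and to low-frequency behaviour is the kind of problem a fixed threshold handles poorly and that the induced rule sets are meant to address.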

Footnotes
1
For more information see http://www.processmining.org.
 
2
\(T^{*}\) is the set of all sequences that are composed of zero or more tasks of \(T\). \(W: T^{*} \rightarrow \mathcal{N}\) is a function from the elements of \(T^{*}\) to \(\mathcal{N}\) (i.e., the number of times an element of \(T^{*}\) appears in the process log).
 
3
We use a capital letter and || when referring to the number of occurrences of some task.
 
4
The name of the organization is not given for confidentiality reasons.
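
As a small illustration of footnotes 2 and 3, a process log can be held as a multiset of traces, so that the function W is simply a lookup of how often a trace occurs, and the number of occurrences of a task is a count over all events. This is a sketch only; the variable names and the toy traces are assumptions, not data from the paper.

```python
from collections import Counter

# Hypothetical toy log: each string is one trace, each character one task.
traces = ["ABCD", "ACBD", "ABCD", "AED"]

# W maps every distinct trace to the number of times it appears in the log;
# traces that never occur map to 0.
W = Counter(traces)

# Number of occurrences of each task over the whole log.
task_occurrences = Counter(task for trace in traces for task in trace)
```

Here `W["ABCD"]` is 2, a trace absent from the log such as `W["XYZ"]` is 0, and `task_occurrences["A"]` is 4.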
 
Metadata
Title
A Rule-Based Approach for Process Discovery: Dealing with Noise and Imbalance in Process Logs
Authors
Laura Măruşter
A. J. M. M. (TON) Weijters
Wil M. P. Van Der Aalst
Antal Van Den Bosch
Publication date
1 July 2006
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 1/2006
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-005-0029-z
