2015 | Original Paper | Book Chapter

A Generic Data Warehouse Architecture for Analyzing Workflow Logs

Written by: Christian Koncilia, Horst Pichler, Robert Wrembel

Published in: Advances in Databases and Information Systems

Publisher: Springer International Publishing

Abstract

This paper proposes an approach for representing and analyzing the content of workflow logs in a data warehouse. Analyzing workflow logs raises one major problem: the underlying workflow model typically contains loops (frequently interleaving), often implemented with goto statements. These structures increase the number of possible execution paths significantly, in theory without bound. A naive data warehouse (DWH) implementation would represent all possible execution paths as elements of a dimension, which would lead to a huge or even infinite number of dimension elements. We present a novel approach for analyzing workflow logs that include loops and goto statements.
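
The path explosion described in the abstract can be illustrated with a minimal sketch. The workflow graph, the activity names, and the `count_paths` helper below are hypothetical and are not taken from the paper. The sketch only counts how many distinct execution paths a naive path dimension would need to hold when two interleaving back edges (goto-style jumps) are present and the path length is capped.

```python
from functools import lru_cache

# Hypothetical workflow model (not from the paper).
# The back edges B -> A and C -> B form two interleaving loops.
WORKFLOW = {
    "start": ["A"],
    "A": ["B"],
    "B": ["A", "C"],    # goto-style jump back to A
    "C": ["B", "end"],  # goto-style jump back to B
    "end": [],
}


def count_paths(graph, source, target, max_steps):
    """Count distinct paths from source to target that use at most max_steps transitions."""

    @lru_cache(maxsize=None)
    def walk(node, steps_left):
        if node == target:
            return 1  # reached the target: exactly one completed path
        if steps_left == 0:
            return 0  # step budget used up before reaching the target
        return sum(walk(nxt, steps_left - 1) for nxt in graph[node])

    return walk(source, max_steps)


if __name__ == "__main__":
    # Each larger bound admits more paths; without a bound the count never stops growing.
    for bound in (4, 6, 8, 10):
        print(bound, count_paths(WORKFLOW, "start", "end", bound))
```

In this toy model every two additional permitted transitions roughly double the number of paths, so a dimension that enumerates execution paths has no finite size unless the loop depth is capped artificially.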

Footnotes
1
PHP PEG was developed by Hamish Friedlander. Available at: https://github.com/hafriedlander/php-peg.

Metadata
Title
A Generic Data Warehouse Architecture for Analyzing Workflow Logs
Written by
Christian Koncilia
Horst Pichler
Robert Wrembel
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-23135-8_8