Skip to main content
Erschienen in: Knowledge and Information Systems 1/2019

15.03.2019 | Regular Paper

Discovering frequent chain episodes

verfasst von: Avinash Achar, P. S. Sastry

Erschienen in: Knowledge and Information Systems | Ausgabe 1/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Frequent episode discovery is a popular framework in temporal data mining with many applications. An episode is a partially ordered set of nodes with each node associated with an event-type. The episodes literature has seen different notions of frequency and a variety of associated discovery algorithms under these different frequencies when the associated partial order is total (serial episode) or trivial (parallel episode). Recently an apriori-based discovery algorithm for mining episodes where the associated partial order has no restriction but the node to event-type association is one–one (general injective episodes) was proposed based on the non-overlapped frequency measure. This work pointed out that frequency alone is not a sufficient indicator of interestingness in the context of episodes with general partial orders and introduced a new measure of interestingness called bidirectional evidence (BE) to address this issue. This algorithm discovers episodes by incorporating both frequency and BE thresholds in the level-wise procedure. In this paper, we extend this BE-based algorithm to a much larger class of episodes that we call chain episodes. This class encompasses all serial and parallel episodes (injective or otherwise) and also many other non-injective episodes with unrestricted partial orders. We first discuss how the BE measure can be generalized to chain episodes and prove the monotonicity property it satisfies in this general context. We then describe our candidate generation step (with correctness proofs) which nicely exploits this new monotonicity property. We further describe the frequency counting (with correctness proofs) and BE computation steps for chain episodes. The experimental results demonstrate the effectiveness of our algorithms.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
An episode is injective if the associated node to event-type map is one to one.
 
2
The class was originally introduced in [6] while the term “Chain Episodes” was explicitly used for the first time in [1].
 
3
The correctness proof for chain episode counting we present here is a minor extension of the correctness proofs for injective episodes first reported in [1]. An alternate correctness proof for chain episode counting is given in [25] which appeared after [1].
 
4
Given any set V, a relationR over V (which is a subset of \(V \times V\)) is said to be a strict partial order if it is irreflexive (i.e. for all \(v \in V\), \((v, v) \notin R\)), asymmetric (i.e. \((v_1, v_2) \in R\) implies that \((v_2, v_1) \notin R\), for all distinct \(v_1, v_2 \in V\)) and transitive (i.e. \(\forall v_1, v_2, v_3 \in V\), \((v_1, v_2) \in R\) and \((v_2, v_3) \in R\) implies that \((v_1, v_3) \in R\)).
 
5
We will elaborate later in Sect. 6 on how finite state automata can be used to track occurrences of episodes and a strategy for counting with expiry time constraints.
 
6
By frequent here, we mean episodes which satisfy both the frequency and BE thresholds.
 
7
An element in \(V_{\alpha }\) is minimal if there is no other element less than it as per \(<_{\alpha }\). Note that a poset can in general have multiple minimal elements.
 
8
In \(t_e\), subscript e denotes the end time of the window. In \(h_i^e\), superscript e refers to earliest transiting.
 
9
An episode is said to be frequency closed if every superepisode has a strictly lower frequency.
 
10
A generator is an episode whose every subepisode has a strictly greater frequency.
 
11
Recall from Definition 6, an injective episode \(\alpha \) can be viewed as a partially ordered set of event-types \((X^\alpha ,R^\alpha )\). \((X^\beta ,R^\beta )\) is a maximal subepisode of an injective episode \(\alpha \) if \(X^\beta \subseteq X^\alpha \) and \(R^\beta \) is the restriction of \(R^\alpha \) on to \(X^\beta \). The notion of a maximal subepisode of a general episode is discussed in the next section.
 
12
A serial extension of a chain episode \((V_\alpha , <_\alpha , g_\alpha )\) is a serial episode \(\beta = (V_\beta ,<_\beta ,g_\beta )\) where \(V_{\beta }=V_{\alpha }\) and \(g_\alpha = g_\beta \) such that \(<_\alpha \subseteq <_\beta \).
 
13
The source codes have all been written in C++. The experiments have been run on a 2.5GHz Pentium PC under a Linux operating system.
 
Literatur
1.
Zurück zum Zitat Achar A (2010) Discovering Frequent episodes with general partial orders. PhD thesis, Department of Electrical Engineering, Indian Institute of Science, Bangalore, India Achar A (2010) Discovering Frequent episodes with general partial orders. PhD thesis, Department of Electrical Engineering, Indian Institute of Science, Bangalore, India
3.
Zurück zum Zitat Achar A, Laxman S, Sastry PS (2012) A unified view of the apriori-based algorithms for frequent episode discovery. Knowl Inf Syst 31(2):223–250CrossRef Achar A, Laxman S, Sastry PS (2012) A unified view of the apriori-based algorithms for frequent episode discovery. Knowl Inf Syst 31(2):223–250CrossRef
4.
Zurück zum Zitat Achar A, Ibrahim A, Sastry PS (2013) Pattern-growth based frequent serial episode discovery. Data Knowl Eng 87:91–108CrossRef Achar A, Ibrahim A, Sastry PS (2013) Pattern-growth based frequent serial episode discovery. Data Knowl Eng 87:91–108CrossRef
5.
Zurück zum Zitat Achar A, Laxman S, Raajay V, Sastry PS (2012) Discovering injective episodes with general partial orders. Data Min Knowl Discov 25(1):67–108MathSciNetCrossRefMATH Achar A, Laxman S, Raajay V, Sastry PS (2012) Discovering injective episodes with general partial orders. Data Min Knowl Discov 25(1):67–108MathSciNetCrossRefMATH
6.
Zurück zum Zitat Achar A, Laxman S, Raajay V, Sastry PS (2009) Discovering general partial orders from event streams. In: Technical report arXiv: 0902.1227v2 [cs.AI] Achar A, Laxman S, Raajay V, Sastry PS (2009) Discovering general partial orders from event streams. In: Technical report arXiv:​ 0902.​1227v2 [cs.AI]
7.
Zurück zum Zitat Atallah MJ, Gwadera R, Szpankowski W (2004) Detection of significant sets of episodes in event sequences. In: Proceedings of the 4th IEEE international conference on data mining (ICDM), Brighton, UK, pp 3–10 Atallah MJ, Gwadera R, Szpankowski W (2004) Detection of significant sets of episodes in event sequences. In: Proceedings of the 4th IEEE international conference on data mining (ICDM), Brighton, UK, pp 3–10
8.
Zurück zum Zitat Bouqata B, Caraothers CD, Szymanski BK, Zaki MJ (2006) Vogue: a novel variable order-gap state machine for modeling sequences. In: Proceedings of the 10th European conference on principles and practice of knowledge discovery in databases, vol 4213. Springer, Berlin, pp 42–54 Bouqata B, Caraothers CD, Szymanski BK, Zaki MJ (2006) Vogue: a novel variable order-gap state machine for modeling sequences. In: Proceedings of the 10th European conference on principles and practice of knowledge discovery in databases, vol 4213. Springer, Berlin, pp 42–54
9.
Zurück zum Zitat Brown EN, Kass RE, Mitra PP (2004) Multiple neuronal spike train data analysis: state of art and future challenges. Nat Neurosci 7(5):456–461CrossRef Brown EN, Kass RE, Mitra PP (2004) Multiple neuronal spike train data analysis: state of art and future challenges. Nat Neurosci 7(5):456–461CrossRef
10.
Zurück zum Zitat Gan M, Dai H (2012) An efficient one-pass method for discovering bases of recently frequent episodes over online data streams. Int J Innov Comput Inf Control 8(7(A)):4675–4690 Gan M, Dai H (2012) An efficient one-pass method for discovering bases of recently frequent episodes over online data streams. Int J Innov Comput Inf Control 8(7(A)):4675–4690
11.
Zurück zum Zitat Gan M, Dai H (2014) Detecting and monitoring abrupt emergences and submergences of episodes over data streams. Inf Syst 39:277–289CrossRef Gan M, Dai H (2014) Detecting and monitoring abrupt emergences and submergences of episodes over data streams. Inf Syst 39:277–289CrossRef
12.
Zurück zum Zitat Huang K, Chang C (2008) Efficient mining of frequent episodes from complex sequences. Inf Syst 33(1):96–114CrossRef Huang K, Chang C (2008) Efficient mining of frequent episodes from complex sequences. Inf Syst 33(1):96–114CrossRef
13.
Zurück zum Zitat Ibrahim A, Sastry S, Sastry PS (2016) Discovering compressing serial episodes from event sequences. Knowl Inf Syst 47(2):405–432CrossRef Ibrahim A, Sastry S, Sastry PS (2016) Discovering compressing serial episodes from event sequences. Knowl Inf Syst 47(2):405–432CrossRef
14.
Zurück zum Zitat Iwanuma K, Takano Y, Nabeshima H (2004) On anti-monotone frequency measures for extracting sequential patterns from a single very-long sequence. In: Proceedings of the 2004 IEEE conference on cybernetics and intelligent systems, vol 1, pp 213–217 Iwanuma K, Takano Y, Nabeshima H (2004) On anti-monotone frequency measures for extracting sequential patterns from a single very-long sequence. In: Proceedings of the 2004 IEEE conference on cybernetics and intelligent systems, vol 1, pp 213–217
15.
Zurück zum Zitat Laxman S, Sastry PS, Unnikrishnan KP (2005) Discovering frequent episodes and learning hidden Markov models: a formal connection. IEEE Trans Knowl Data Eng 17:1505–1517CrossRef Laxman S, Sastry PS, Unnikrishnan KP (2005) Discovering frequent episodes and learning hidden Markov models: a formal connection. IEEE Trans Knowl Data Eng 17:1505–1517CrossRef
16.
Zurück zum Zitat Laxman S, Tankasali V, White RW (2008) Stream prediction using a generative model based on frequent episodes in event sequences. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’08), pp 453–461 Laxman S, Tankasali V, White RW (2008) Stream prediction using a generative model based on frequent episodes in event sequences. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’08), pp 453–461
17.
Zurück zum Zitat Luo J, Bridges SM (2000) Mining fuzzy association rules and fuzzy frequent episodes for intrusion detection. Int J Intell Syst 15:687–703CrossRefMATH Luo J, Bridges SM (2000) Mining fuzzy association rules and fuzzy frequent episodes for intrusion detection. Int J Intell Syst 15:687–703CrossRefMATH
18.
Zurück zum Zitat Mannila H, Toivonen H, Verkamo AI (1997) Discovery of frequent episodes in event sequences. Data Min Knowl Discov 1(3):259–289CrossRef Mannila H, Toivonen H, Verkamo AI (1997) Discovery of frequent episodes in event sequences. Data Min Knowl Discov 1(3):259–289CrossRef
19.
Zurück zum Zitat Nag A, Fu AW (2003) Mining frequent episodes for relating financial events and stock trends. In: Proceedings of 7th Pacific-Asia conference on knowledge discovery and data mining (PAKDD 2003). Springer, Berlin, pp 27–39 Nag A, Fu AW (2003) Mining frequent episodes for relating financial events and stock trends. In: Proceedings of 7th Pacific-Asia conference on knowledge discovery and data mining (PAKDD 2003). Springer, Berlin, pp 27–39
20.
Zurück zum Zitat Patnaik D, Sastry PS, Unnikrishnan KP (2008) Inferring neuronal network connectivity from spike data: a temporal data mining approach. Sci Program 16:49–77 Patnaik D, Sastry PS, Unnikrishnan KP (2008) Inferring neuronal network connectivity from spike data: a temporal data mining approach. Sci Program 16:49–77
21.
Zurück zum Zitat Patnaik D, Laxman S, Chandramouli B, Ramakrishnan N (2012) Efficient episode mining of dynamic event streams. In: IEEE international conference on data mining, pp 605–614 Patnaik D, Laxman S, Chandramouli B, Ramakrishnan N (2012) Efficient episode mining of dynamic event streams. In: IEEE international conference on data mining, pp 605–614
22.
Zurück zum Zitat Sastry PS, Unnikrishnan KP (2010) Conditional probability based significance tests for sequential patterns in multi-neuronal spike trains. Neural Comput 22(4):1025–1059CrossRefMATH Sastry PS, Unnikrishnan KP (2010) Conditional probability based significance tests for sequential patterns in multi-neuronal spike trains. Neural Comput 22(4):1025–1059CrossRefMATH
23.
Zurück zum Zitat Tatti N (2009) Significance of episodes based on minimal windows. In: Proceedings of 2009 IEEE international conference on data mining Tatti N (2009) Significance of episodes based on minimal windows. In: Proceedings of 2009 IEEE international conference on data mining
26.
Zurück zum Zitat Tatti N, Cule B (2011) Mining closed episodes with simultaneous events. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1172–1180 Tatti N, Cule B (2011) Mining closed episodes with simultaneous events. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1172–1180
27.
Zurück zum Zitat Tatti N, Cule B (2010) Mining closed strict episodes. In: Proceedings of 2010 IEEE international conference on data mining, pp 501–510 Tatti N, Cule B (2010) Mining closed strict episodes. In: Proceedings of 2010 IEEE international conference on data mining, pp 501–510
28.
Zurück zum Zitat Unnikrishnan KP, Shadid BQ, Sastry PS, Laxman S (2009) Root cause diagnostics using temporal datamining, U.S.Patent no. 7509234, 24 Mar Unnikrishnan KP, Shadid BQ, Sastry PS, Laxman S (2009) Root cause diagnostics using temporal datamining, U.S.Patent no. 7509234, 24 Mar
29.
Zurück zum Zitat Wang MF, Wu YC, Tsai MF (2008) Exploiting frequent episodes in weighted suffix tree to improve intrusion detection system. In: Proceedings of the 22nd international conference on advanced information networking and applications-workshops. IEEE Computer Society, Washington, pp 1246–1252 Wang MF, Wu YC, Tsai MF (2008) Exploiting frequent episodes in weighted suffix tree to improve intrusion detection system. In: Proceedings of the 22nd international conference on advanced information networking and applications-workshops. IEEE Computer Society, Washington, pp 1246–1252
Metadaten
Titel
Discovering frequent chain episodes
verfasst von
Avinash Achar
P. S. Sastry
Publikationsdatum
15.03.2019
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 1/2019
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-019-01349-y

Weitere Artikel der Ausgabe 1/2019

Knowledge and Information Systems 1/2019 Zur Ausgabe