Published in: Knowledge and Information Systems, Issue 3/2017

30.07.2016 | Regular Paper

Sequential pattern mining in databases with temporal uncertainty

Authors: Jiaqi Ge, Yuni Xia, Jian Wang, Chandima Hewa Nadungodage, Sunil Prabhakar


Abstract

Temporally uncertain data are common in many real-world applications. Temporal uncertainty can arise from conflicting or missing event timestamps, network latency, granularity mismatch, synchronization problems, device precision limitations, and data aggregation. In this paper, we propose an efficient algorithm to mine sequential patterns from data with temporal uncertainty. We propose an uncertainty model in which timestamps are modeled as random variables, and then design a new approach to manage temporal uncertainty. We integrate this approach into the pattern-growth sequential pattern mining algorithm to discover probabilistic frequent sequential patterns. Extensive experiments on both synthetic and real datasets show that the proposed algorithm is both efficient and scalable.
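To make the idea of timestamps as random variables concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes discrete, independent timestamp distributions and the definition of probabilistic frequentness commonly used in the uncertain-data mining literature: a pattern is probabilistically frequent if the probability that its support reaches `min_sup` is at least a threshold `tau`. The toy database and all names (`prob_before`, `prob_support_at_least`, `is_probabilistically_frequent`, `min_sup`, `tau`) are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only; assumptions (not from the paper):
#  - each event timestamp is a discrete random variable given as {time: probability}
#  - timestamps are independent within and across sequences

def prob_before(pmf_a, pmf_b):
    """P(T_a < T_b) for two independent discrete timestamps."""
    return sum(pa * pb
               for ta, pa in pmf_a.items()
               for tb, pb in pmf_b.items()
               if ta < tb)

def prob_support_at_least(probs, min_sup):
    """P(at least min_sup sequences support the pattern).

    probs[i] is the probability that sequence i supports the pattern.
    The support count follows a Poisson-binomial distribution, which is
    built up one sequence at a time by dynamic programming.
    """
    dist = [1.0]  # dist[k] = P(exactly k supporting sequences so far)
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)   # this sequence does not support the pattern
            nxt[k + 1] += q * p       # this sequence supports the pattern
        dist = nxt
    return sum(dist[min_sup:])

def is_probabilistically_frequent(probs, min_sup, tau):
    """True if P(support >= min_sup) >= tau."""
    return prob_support_at_least(probs, min_sup) >= tau

# Hypothetical toy database: each sequence holds one event 'a' and one event 'b'.
database = [
    {"a": {1: 0.5, 3: 0.5}, "b": {2: 1.0}},
    {"a": {1: 1.0},         "b": {2: 0.5, 4: 0.5}},
    {"a": {5: 1.0},         "b": {4: 0.5, 6: 0.5}},
]

# Probability that each sequence supports the pattern <a, b> (a occurs before b).
probs = [prob_before(seq["a"], seq["b"]) for seq in database]
```

In this toy database the per-sequence support probabilities for the pattern <a, b> are 0.5, 1.0 and 0.5, so the probability that at least two sequences support it is 0.75. The pattern would count as probabilistically frequent for `min_sup = 2` with `tau = 0.7`, but not with `tau = 0.8`.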

Metadata
Title
Sequential pattern mining in databases with temporal uncertainty
Authors
Jiaqi Ge
Yuni Xia
Jian Wang
Chandima Hewa Nadungodage
Sunil Prabhakar
Publication date
30.07.2016
Publisher
Springer London
Published in
Knowledge and Information Systems, Issue 3/2017
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-016-0977-1
