Published in: Data Mining and Knowledge Discovery 5/2023

30.06.2023

TIRPClo: efficient and complete mining of time intervals-related patterns

Authors: Omer Harel, Robert Moskovitch

Abstract

Mining frequent Time Intervals-Related Patterns (TIRPs) from series of symbolic time intervals offers a comprehensive framework for analyzing heterogeneous, multivariate temporal data in various application domains. Although interest in the problem has grown in recent decades, efficiently mining frequent TIRPs remains a major computational challenge that has not yet been investigated in its full complexity. Most previous methods discover only the first instance of each TIRP within each series of symbolic time intervals and ignore its recurring instances. This results in an incomplete discovery of frequent TIRPs, a problem that also affects the mining of only the frequent closed TIRPs, which was investigated for the first time only recently. In this paper, we introduce TIRPClo, an efficient algorithm for the complete mining of either the entire set of frequent TIRPs or only the frequent closed TIRPs. The algorithm proposes a non-ambiguous sequential representation of series of symbolic time intervals through the intervals’ end-points, as well as a memory-efficient index and a novel method for data projection, due to which it is the first algorithm to guarantee a complete discovery of frequent closed TIRPs. An experimental evaluation on eleven real-world and four synthetic datasets shows that, compared to four state-of-the-art methods, TIRPClo is up to 10 times faster when mining the entire set of frequent TIRPs and up to more than 100 times faster when mining only the frequent closed TIRPs, while also reporting lower memory measurements.
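
To make the idea of representing a series of symbolic time intervals through its end-points more concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the representation defined in the paper: the `Interval` type, the `to_endpoint_sequence` function, and the `+`/`-` suffixes for start and end points are all hypothetical names chosen here. The sketch groups end-points that share a timestamp and sorts them within each group, so the output does not depend on input order. It does not attempt to disambiguate multiple intervals that carry the same symbol.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """A symbolic time interval (hypothetical type for this sketch)."""
    start: int
    end: int
    symbol: str


def to_endpoint_sequence(intervals):
    """Group interval end-points by timestamp, in ascending time order.

    Each start point becomes "<symbol>+" and each end point "<symbol>-".
    End-points sharing a timestamp are placed in one sorted tuple, so the
    output does not depend on the order of the input intervals.
    """
    by_time = defaultdict(list)
    for iv in intervals:
        if iv.start >= iv.end:
            raise ValueError(f"interval must have start < end: {iv}")
        by_time[iv.start].append(iv.symbol + "+")
        by_time[iv.end].append(iv.symbol + "-")
    return [(t, tuple(sorted(by_time[t]))) for t in sorted(by_time)]


# Three overlapping intervals; A ends at the same time C starts.
series = [Interval(1, 5, "A"), Interval(3, 8, "B"), Interval(5, 9, "C")]
sequence = to_endpoint_sequence(series)
for timestamp, endpoints in sequence:
    print(timestamp, endpoints)
```

In this toy series, the end of `A` and the start of `C` fall on the same timestamp and therefore land in the same group, which is the kind of coincidence an end-point representation has to encode explicitly.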

Metadata
Title
TIRPClo: efficient and complete mining of time intervals-related patterns
Authors
Omer Harel
Robert Moskovitch
Publication date
30.06.2023
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 5/2023
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-023-00944-6
