Skip to main content

2017 | OriginalPaper | Buchkapitel

Top-k Pattern Matching Using an Information-Theoretic Criterion over Probabilistic Data Streams

verfasst von : Kento Sugiura, Yoshiharu Ishikawa

Erschienen in: Web and Big Data

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

As the development of data mining technologies for sensor data streams, more sophisticated methods for complex event processing are demanded. In the case of event recognition, since event recognition results may contain errors, we need to deal with the uncertainty of events. We therefore consider probabilistic event data streams with occurrence probabilities of events, and develop a pattern matching method based on regular expressions. In this paper, we first analyze the semantics of pattern matching over non-probabilistic data streams, and then propose the problem of top-k pattern matching over probabilistic data streams. We introduce the use of an information-theoretic criterion to select appropriate matches as the result of pattern matching. Then, we present an efficient algorithm to detect top-k matches, and evaluate the effectiveness of our approach using real and synthetic datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Aggarwal, C.C., Yu, P.S.: A framework for clustering uncertain data streams. In: 2008 IEEE 24th ICDE, pp. 150–159 (2008) Aggarwal, C.C., Yu, P.S.: A framework for clustering uncertain data streams. In: 2008 IEEE 24th ICDE, pp. 150–159 (2008)
2.
3.
Zurück zum Zitat Akdere, M., Çetintemel, U., Tatbul, N.: Plan-based complex event detection across distributed sources. Proc. VLDB Endow. 1(1), 66–77 (2008)CrossRef Akdere, M., Çetintemel, U., Tatbul, N.: Plan-based complex event detection across distributed sources. Proc. VLDB Endow. 1(1), 66–77 (2008)CrossRef
4.
Zurück zum Zitat Chandramouli, B., Goldstein, J., Maier, D.: High-performance dynamic pattern matching over disordered streams. Proc. VLDB Endow. 3(1–2), 220–231 (2010)CrossRef Chandramouli, B., Goldstein, J., Maier, D.: High-performance dynamic pattern matching over disordered streams. Proc. VLDB Endow. 3(1–2), 220–231 (2010)CrossRef
5.
Zurück zum Zitat Chen, L., Nugent, C., Wang, H.: A knowledge-driven approach to activity recognition in smart homes. IEEE TKDE 24(6), 961–974 (2012) Chen, L., Nugent, C., Wang, H.: A knowledge-driven approach to activity recognition in smart homes. IEEE TKDE 24(6), 961–974 (2012)
6.
Zurück zum Zitat Cormode, G., Garofalakis, M.: Sketching probabilistic data streams. In: Proceedings of 2007 ACM SIGMOD, pp. 281–292 (2007) Cormode, G., Garofalakis, M.: Sketching probabilistic data streams. In: Proceedings of 2007 ACM SIGMOD, pp. 281–292 (2007)
7.
Zurück zum Zitat Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Hoboken (2012)MATH Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Hoboken (2012)MATH
8.
Zurück zum Zitat Cugola, G., Margara, A.: Processing flows of information: from data stream to complex event processing. ACM Comput. Surv. 44(3), 15:1–15:62 (2012)CrossRef Cugola, G., Margara, A.: Processing flows of information: from data stream to complex event processing. ACM Comput. Surv. 44(3), 15:1–15:62 (2012)CrossRef
9.
Zurück zum Zitat Diao, Y., Fischer, P., Franklin, M.J., To, R.: YFilter: efficient and scalable filtering of XML documents. In: Proceedings of 18th ICDE, pp. 341–342 (2002) Diao, Y., Fischer, P., Franklin, M.J., To, R.: YFilter: efficient and scalable filtering of XML documents. In: Proceedings of 18th ICDE, pp. 341–342 (2002)
11.
Zurück zum Zitat Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison Wesley, Boston (2000)MATH Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison Wesley, Boston (2000)MATH
12.
Zurück zum Zitat Jin, C., Yi, K., Chen, L., Yu, J.X., Lin, X.: Sliding-window top-k queries on uncertain streams. Proc. VLDB Endow. 1(1), 301–312 (2008)CrossRef Jin, C., Yi, K., Chen, L., Yu, J.X., Lin, X.: Sliding-window top-k queries on uncertain streams. Proc. VLDB Endow. 1(1), 301–312 (2008)CrossRef
13.
14.
Zurück zum Zitat Lara, O.D., Labrador, M.A.: A survey on human activity recognition using wearable sensors. IEEE Commun. Surv. Tutor. 15(3), 1192–1209 (2013)CrossRef Lara, O.D., Labrador, M.A.: A survey on human activity recognition using wearable sensors. IEEE Commun. Surv. Tutor. 15(3), 1192–1209 (2013)CrossRef
15.
Zurück zum Zitat Li, Z., Ge, T., Chen, C.X.: \(\varepsilon \)-matching: event processing over noisy sequences in real time. In: Proceedings of 2013 ACM SIGMOD, pp. 601–612 (2013) Li, Z., Ge, T., Chen, C.X.: \(\varepsilon \)-matching: event processing over noisy sequences in real time. In: Proceedings of 2013 ACM SIGMOD, pp. 601–612 (2013)
16.
Zurück zum Zitat Liu, M., Golovnya, D., Rundensteiner, E.A., Claypool, K.T.: Sequence pattern query processing over out-of-order event streams. In: 2009 IEEE 25th ICDE, pp. 784–795 (2009) Liu, M., Golovnya, D., Rundensteiner, E.A., Claypool, K.T.: Sequence pattern query processing over out-of-order event streams. In: 2009 IEEE 25th ICDE, pp. 784–795 (2009)
17.
Zurück zum Zitat Mei, Y., Madden, S.: ZStream: a cost-based query processor for adaptively detecting composite events. In: Proceedings of 2009 ACM SIGMOD, pp. 193–206 (2009) Mei, Y., Madden, S.: ZStream: a cost-based query processor for adaptively detecting composite events. In: Proceedings of 2009 ACM SIGMOD, pp. 193–206 (2009)
18.
Zurück zum Zitat Nakata, I.: Generation of pattern-matching algorithms by extended regular expressions. Japan Soc. Softw. Sci. Tech. 5, 1–9 (1993)CrossRef Nakata, I.: Generation of pattern-matching algorithms by extended regular expressions. Japan Soc. Softw. Sci. Tech. 5, 1–9 (1993)CrossRef
19.
Zurück zum Zitat Ré, C., Letchner, J., Balazinska, M., Suciu, D.: Event queries on correlated probabilistic streams. In: Proceedings of 2008 ACM SIGMOD, pp. 715–728 (2008) Ré, C., Letchner, J., Balazinska, M., Suciu, D.: Event queries on correlated probabilistic streams. In: Proceedings of 2008 ACM SIGMOD, pp. 715–728 (2008)
20.
Zurück zum Zitat Santini, S.: Querying streams using regular expressions: some semantics, decidability, and efficiency issues. VLDB J. 24(6), 801–821 (2015)MathSciNetCrossRef Santini, S.: Querying streams using regular expressions: some semantics, decidability, and efficiency issues. VLDB J. 24(6), 801–821 (2015)MathSciNetCrossRef
21.
Zurück zum Zitat Thompson, K.: Programming techniques: regular expression search algorithm. Commun. ACM 11(6), 419–422 (1968)CrossRefMATH Thompson, K.: Programming techniques: regular expression search algorithm. Commun. ACM 11(6), 419–422 (1968)CrossRefMATH
22.
Zurück zum Zitat Tran, T.T.L., Peng, L., Diao, Y., McGregor, A., Liu, A.: CLARO: modeling and processing uncertain data streams. VLDB J. 21(5), 651–676 (2012)CrossRef Tran, T.T.L., Peng, L., Diao, Y., McGregor, A., Liu, A.: CLARO: modeling and processing uncertain data streams. VLDB J. 21(5), 651–676 (2012)CrossRef
23.
Zurück zum Zitat Woods, L., Teubner, J., Alonso, G.: Complex event detection at wire speed with FPGAs. Proc. VLDB Endow. 3(1–2), 660–669 (2010)CrossRef Woods, L., Teubner, J., Alonso, G.: Complex event detection at wire speed with FPGAs. Proc. VLDB Endow. 3(1–2), 660–669 (2010)CrossRef
24.
Zurück zum Zitat Wu, E., Diao, Y., Rizvi, S.: High-performance complex event processing over streams. In: Proceedings of 2006 ACM SIGMOD, pp. 407–418 (2006) Wu, E., Diao, Y., Rizvi, S.: High-performance complex event processing over streams. In: Proceedings of 2006 ACM SIGMOD, pp. 407–418 (2006)
25.
Zurück zum Zitat Yin, J., Yang, Q., Pan, J.J.: Sensor-based abnormal human-activity detection. IEEE TKDE 20(8), 1082–1090 (2008) Yin, J., Yang, Q., Pan, J.J.: Sensor-based abnormal human-activity detection. IEEE TKDE 20(8), 1082–1090 (2008)
26.
Zurück zum Zitat Zhang, H., Diao, Y., Immerman, N.: On complexity and optimization of expensive queries in complex event processing. In: Proceedings of 2014 ACM SIGMOD, pp. 217–228 (2014) Zhang, H., Diao, Y., Immerman, N.: On complexity and optimization of expensive queries in complex event processing. In: Proceedings of 2014 ACM SIGMOD, pp. 217–228 (2014)
27.
Zurück zum Zitat Zhang, Q., Li, F., Yi, K.: Finding frequent items in probabilistic data. In: Proceedings of 2008 ACM SIGMOD, pp. 819–832 (2008) Zhang, Q., Li, F., Yi, K.: Finding frequent items in probabilistic data. In: Proceedings of 2008 ACM SIGMOD, pp. 819–832 (2008)
Metadaten
Titel
Top-k Pattern Matching Using an Information-Theoretic Criterion over Probabilistic Data Streams
verfasst von
Kento Sugiura
Yoshiharu Ishikawa
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-63579-8_39