Skip to main content
Erschienen in: Pattern Analysis and Applications 2/2017

01.09.2016 | Industrial and Commercial Application

RLC: ranking lag correlations with flexible sliding windows in data streams

verfasst von: Shanshan Wu, Huaizhong Lin, Wenxiang Wang, Dongming Lu, Leong Hou U, Yunjun Gao

Erschienen in: Pattern Analysis and Applications | Ausgabe 2/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Lag correlation between two time series is the correlation shifted in time relative to one another. Existing work focuses on two computation models, landmark (where the lag correlation is computed over the entire stream) and sliding window (where the lag correlation is computed over the current window). However, these models may suffer from problems like result freshness (e.g., perished items in the landmark model) and parametric tuning (e.g., setting a proper length in the sliding window model). In this work, we attempt to analyze the lag correlation which is computed based on flexible sliding windows. In view of that, a new query called RLC (ranking lag correlations with flexible sliding windows in data streams) is proposed. The key challenge in answering the RLC query is that the number of windows to be analyzed will grow quadratically with the length of the stream, resulting a quadratic computation cost. To boost the computation, we employ the running sum and the geometric probing techniques to facilitate the query processing. We also present an approximate solution that further reduces the computation cost with an acceptable error rate in practice. The extensive experiments verify the efficiency of our proposed methods. We also demonstrate some lag correlations discovered from real datasets to show the practicality of this work.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
Regarding the extreme cases, one case is that there is no good match throughout the entire sequence. In our query, there is no minimum correlation constraint; we still return top-k subsequences with the highest lag correlations (even though all correlations are very low). The other case is that all object sequences are entirely identical to the query sequences. In our problem definition, we return the longer subsequence when multiple correlations are identical. Hence, in this case, we will return the entire sequence.
 
Literatur
1.
Zurück zum Zitat Athitsos V, Papapetrou P, Potamias M, Kollios G, Gunopulos D (2008) Approximate embedding-based subsequence matching of time series. In: SIGMOD, pp 365–378 Athitsos V, Papapetrou P, Potamias M, Kollios G, Gunopulos D (2008) Approximate embedding-based subsequence matching of time series. In: SIGMOD, pp 365–378
2.
Zurück zum Zitat Box GEP, Jenkins GM, Reinsel GC (1994) Time series analysis: forecasting and control. Prentice Hall, Upper Saddle RiverMATH Box GEP, Jenkins GM, Reinsel GC (1994) Time series analysis: forecasting and control. Prentice Hall, Upper Saddle RiverMATH
3.
Zurück zum Zitat Cai Y, Tong H, Fan W, Ji P (2015a) Fast mining of a network of coevolving time series. In: SDM Cai Y, Tong H, Fan W, Ji P (2015a) Fast mining of a network of coevolving time series. In: SDM
4.
Zurück zum Zitat Cai Y, Tong H, Fan W, Ji P, He Q (2015b) Facets: fast comprehensive mining of coevolving high-order time series. In: SIGKDD, pp 79–88 Cai Y, Tong H, Fan W, Ji P, He Q (2015b) Facets: fast comprehensive mining of coevolving high-order time series. In: SIGKDD, pp 79–88
5.
Zurück zum Zitat Cao J, Zhou Y, Wu M (2015) Adaptive grid-based k-median clustering of streaming data with accuracy guarantee. In: DASFAA, pp 75–91 Cao J, Zhou Y, Wu M (2015) Adaptive grid-based k-median clustering of streaming data with accuracy guarantee. In: DASFAA, pp 75–91
6.
Zurück zum Zitat Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. In: SIGMOD, pp 419–429 Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. In: SIGMOD, pp 419–429
7.
Zurück zum Zitat Gong X, Xiong Y, Huang W, Chen L, Lu Q, Hu Y (2015) Fast similarity search of multi-dimensional time series via segment rotation. In: DASFAA, pp 108–124 Gong X, Xiong Y, Huang W, Chen L, Lu Q, Hu Y (2015) Fast similarity search of multi-dimensional time series via segment rotation. In: DASFAA, pp 108–124
8.
Zurück zum Zitat Kahveci T, Singh AK (2004) Optimizing similarity search for arbitrary length time series queries. IEEE Trans Knowl Data Eng 16:418–433CrossRef Kahveci T, Singh AK (2004) Optimizing similarity search for arbitrary length time series queries. IEEE Trans Knowl Data Eng 16:418–433CrossRef
9.
Zurück zum Zitat Koper KD, Wallace TC, Taylor SR, Hartse HE (2001) Forensic seismology and the sinking of the kursk. EOS Trans Am Geophys Union 82:37–46CrossRef Koper KD, Wallace TC, Taylor SR, Hartse HE (2001) Forensic seismology and the sinking of the kursk. EOS Trans Am Geophys Union 82:37–46CrossRef
10.
Zurück zum Zitat Kusmierczyk T, Nørvåg K (2015) Mining correlations on massive bursty time series collections. In: DASFAA, pp 55–71 Kusmierczyk T, Nørvåg K (2015) Mining correlations on massive bursty time series collections. In: DASFAA, pp 55–71
11.
Zurück zum Zitat Lee ML, Hsu W, Li L, Tok WH (2009) Consistent top-k queries over time. In: DASFAA, pp 51–65 Lee ML, Hsu W, Li L, Tok WH (2009) Consistent top-k queries over time. In: DASFAA, pp 51–65
12.
Zurück zum Zitat Li Y, Yiu ML, Gong Z (2013) Discovering longest-lasting correlation in sequence databases. Proc VLDB Endow 6:1666–1677CrossRef Li Y, Yiu ML, Gong Z (2013) Discovering longest-lasting correlation in sequence databases. Proc VLDB Endow 6:1666–1677CrossRef
13.
Zurück zum Zitat Li Y, Leong Hou U, Yiu ML, Gong Z (2015) Quick-motif: an efficient and scalable framework for exact motif discovery. In: ICDE Li Y, Leong Hou U, Yiu ML, Gong Z (2015) Quick-motif: an efficient and scalable framework for exact motif discovery. In: ICDE
14.
Zurück zum Zitat Mueen A (2013) Enumeration of time series motifs of all lengths. In: ICDM, pp 547–556 Mueen A (2013) Enumeration of time series motifs of all lengths. In: ICDM, pp 547–556
15.
Zurück zum Zitat Mueen A, Keogh EJ, Zhu Q, Cash S, Westover MB (2009) Exact discovery of time series motifs. In: SDM, pp 473–484 Mueen A, Keogh EJ, Zhu Q, Cash S, Westover MB (2009) Exact discovery of time series motifs. In: SDM, pp 473–484
16.
Zurück zum Zitat Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: SIGKDD, pp 1154–1162 Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: SIGKDD, pp 1154–1162
17.
Zurück zum Zitat Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: SIGKDD, pp 262–270 Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: SIGKDD, pp 262–270
18.
Zurück zum Zitat Sakurai Y, Papadimitriou S, Faloutsos C (2005) Braid: stream mining through group lag correlations. In: SIGKDD, pp 599–610 Sakurai Y, Papadimitriou S, Faloutsos C (2005) Braid: stream mining through group lag correlations. In: SIGKDD, pp 599–610
19.
Zurück zum Zitat Traina A, Traina C Jr, Faloutsos C (2001) Similarity search without tears: the omni-family of all-purpose access methods. In: ICDE, pp 623–630 Traina A, Traina C Jr, Faloutsos C (2001) Similarity search without tears: the omni-family of all-purpose access methods. In: ICDE, pp 623–630
20.
Zurück zum Zitat Wu D, Ke Y, Yu JX, Philip SY, Chen L (2010) Detecting leaders from correlated time series. In: DASFAA, pp 352–367 Wu D, Ke Y, Yu JX, Philip SY, Chen L (2010) Detecting leaders from correlated time series. In: DASFAA, pp 352–367
21.
Zurück zum Zitat Wu D, Ke Y, Yu JX, Philip SY, Chen L (2011) Leadership discovery when data correlatively evolve. World Wide Web 14:1–25CrossRef Wu D, Ke Y, Yu JX, Philip SY, Chen L (2011) Leadership discovery when data correlatively evolve. World Wide Web 14:1–25CrossRef
22.
Zurück zum Zitat Xu E, Hsu W, Lee ML, Patel D (2015) k-Consistent influencers in network data. In: DASFAA, pp 452–468 Xu E, Hsu W, Lee ML, Patel D (2015) k-Consistent influencers in network data. In: DASFAA, pp 452–468
23.
Zurück zum Zitat Zhou X, Hong H, Xing X, Huang W, Bian K, Xie K (2015) Mining dependencies considering time lag in spatio-temporal traffic data. Web-age information management. Springer, Berlin Zhou X, Hong H, Xing X, Huang W, Bian K, Xie K (2015) Mining dependencies considering time lag in spatio-temporal traffic data. Web-age information management. Springer, Berlin
Metadaten
Titel
RLC: ranking lag correlations with flexible sliding windows in data streams
verfasst von
Shanshan Wu
Huaizhong Lin
Wenxiang Wang
Dongming Lu
Leong Hou U
Yunjun Gao
Publikationsdatum
01.09.2016
Verlag
Springer London
Erschienen in
Pattern Analysis and Applications / Ausgabe 2/2017
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-016-0577-4

Weitere Artikel der Ausgabe 2/2017

Pattern Analysis and Applications 2/2017 Zur Ausgabe