Skip to main content
Top
Published in: Pattern Analysis and Applications 2/2017

01-09-2016 | Industrial and Commercial Application

RLC: ranking lag correlations with flexible sliding windows in data streams

Authors: Shanshan Wu, Huaizhong Lin, Wenxiang Wang, Dongming Lu, Leong Hou U, Yunjun Gao

Published in: Pattern Analysis and Applications | Issue 2/2017

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Lag correlation between two time series is the correlation shifted in time relative to one another. Existing work focuses on two computation models, landmark (where the lag correlation is computed over the entire stream) and sliding window (where the lag correlation is computed over the current window). However, these models may suffer from problems like result freshness (e.g., perished items in the landmark model) and parametric tuning (e.g., setting a proper length in the sliding window model). In this work, we attempt to analyze the lag correlation which is computed based on flexible sliding windows. In view of that, a new query called RLC (ranking lag correlations with flexible sliding windows in data streams) is proposed. The key challenge in answering the RLC query is that the number of windows to be analyzed will grow quadratically with the length of the stream, resulting a quadratic computation cost. To boost the computation, we employ the running sum and the geometric probing techniques to facilitate the query processing. We also present an approximate solution that further reduces the computation cost with an acceptable error rate in practice. The extensive experiments verify the efficiency of our proposed methods. We also demonstrate some lag correlations discovered from real datasets to show the practicality of this work.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
Regarding the extreme cases, one case is that there is no good match throughout the entire sequence. In our query, there is no minimum correlation constraint; we still return top-k subsequences with the highest lag correlations (even though all correlations are very low). The other case is that all object sequences are entirely identical to the query sequences. In our problem definition, we return the longer subsequence when multiple correlations are identical. Hence, in this case, we will return the entire sequence.
 
Literature
1.
go back to reference Athitsos V, Papapetrou P, Potamias M, Kollios G, Gunopulos D (2008) Approximate embedding-based subsequence matching of time series. In: SIGMOD, pp 365–378 Athitsos V, Papapetrou P, Potamias M, Kollios G, Gunopulos D (2008) Approximate embedding-based subsequence matching of time series. In: SIGMOD, pp 365–378
2.
go back to reference Box GEP, Jenkins GM, Reinsel GC (1994) Time series analysis: forecasting and control. Prentice Hall, Upper Saddle RiverMATH Box GEP, Jenkins GM, Reinsel GC (1994) Time series analysis: forecasting and control. Prentice Hall, Upper Saddle RiverMATH
3.
go back to reference Cai Y, Tong H, Fan W, Ji P (2015a) Fast mining of a network of coevolving time series. In: SDM Cai Y, Tong H, Fan W, Ji P (2015a) Fast mining of a network of coevolving time series. In: SDM
4.
go back to reference Cai Y, Tong H, Fan W, Ji P, He Q (2015b) Facets: fast comprehensive mining of coevolving high-order time series. In: SIGKDD, pp 79–88 Cai Y, Tong H, Fan W, Ji P, He Q (2015b) Facets: fast comprehensive mining of coevolving high-order time series. In: SIGKDD, pp 79–88
5.
go back to reference Cao J, Zhou Y, Wu M (2015) Adaptive grid-based k-median clustering of streaming data with accuracy guarantee. In: DASFAA, pp 75–91 Cao J, Zhou Y, Wu M (2015) Adaptive grid-based k-median clustering of streaming data with accuracy guarantee. In: DASFAA, pp 75–91
6.
go back to reference Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. In: SIGMOD, pp 419–429 Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. In: SIGMOD, pp 419–429
7.
go back to reference Gong X, Xiong Y, Huang W, Chen L, Lu Q, Hu Y (2015) Fast similarity search of multi-dimensional time series via segment rotation. In: DASFAA, pp 108–124 Gong X, Xiong Y, Huang W, Chen L, Lu Q, Hu Y (2015) Fast similarity search of multi-dimensional time series via segment rotation. In: DASFAA, pp 108–124
8.
go back to reference Kahveci T, Singh AK (2004) Optimizing similarity search for arbitrary length time series queries. IEEE Trans Knowl Data Eng 16:418–433CrossRef Kahveci T, Singh AK (2004) Optimizing similarity search for arbitrary length time series queries. IEEE Trans Knowl Data Eng 16:418–433CrossRef
9.
go back to reference Koper KD, Wallace TC, Taylor SR, Hartse HE (2001) Forensic seismology and the sinking of the kursk. EOS Trans Am Geophys Union 82:37–46CrossRef Koper KD, Wallace TC, Taylor SR, Hartse HE (2001) Forensic seismology and the sinking of the kursk. EOS Trans Am Geophys Union 82:37–46CrossRef
10.
go back to reference Kusmierczyk T, Nørvåg K (2015) Mining correlations on massive bursty time series collections. In: DASFAA, pp 55–71 Kusmierczyk T, Nørvåg K (2015) Mining correlations on massive bursty time series collections. In: DASFAA, pp 55–71
11.
go back to reference Lee ML, Hsu W, Li L, Tok WH (2009) Consistent top-k queries over time. In: DASFAA, pp 51–65 Lee ML, Hsu W, Li L, Tok WH (2009) Consistent top-k queries over time. In: DASFAA, pp 51–65
12.
go back to reference Li Y, Yiu ML, Gong Z (2013) Discovering longest-lasting correlation in sequence databases. Proc VLDB Endow 6:1666–1677CrossRef Li Y, Yiu ML, Gong Z (2013) Discovering longest-lasting correlation in sequence databases. Proc VLDB Endow 6:1666–1677CrossRef
13.
go back to reference Li Y, Leong Hou U, Yiu ML, Gong Z (2015) Quick-motif: an efficient and scalable framework for exact motif discovery. In: ICDE Li Y, Leong Hou U, Yiu ML, Gong Z (2015) Quick-motif: an efficient and scalable framework for exact motif discovery. In: ICDE
14.
go back to reference Mueen A (2013) Enumeration of time series motifs of all lengths. In: ICDM, pp 547–556 Mueen A (2013) Enumeration of time series motifs of all lengths. In: ICDM, pp 547–556
15.
go back to reference Mueen A, Keogh EJ, Zhu Q, Cash S, Westover MB (2009) Exact discovery of time series motifs. In: SDM, pp 473–484 Mueen A, Keogh EJ, Zhu Q, Cash S, Westover MB (2009) Exact discovery of time series motifs. In: SDM, pp 473–484
16.
go back to reference Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: SIGKDD, pp 1154–1162 Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: SIGKDD, pp 1154–1162
17.
go back to reference Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: SIGKDD, pp 262–270 Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: SIGKDD, pp 262–270
18.
go back to reference Sakurai Y, Papadimitriou S, Faloutsos C (2005) Braid: stream mining through group lag correlations. In: SIGKDD, pp 599–610 Sakurai Y, Papadimitriou S, Faloutsos C (2005) Braid: stream mining through group lag correlations. In: SIGKDD, pp 599–610
19.
go back to reference Traina A, Traina C Jr, Faloutsos C (2001) Similarity search without tears: the omni-family of all-purpose access methods. In: ICDE, pp 623–630 Traina A, Traina C Jr, Faloutsos C (2001) Similarity search without tears: the omni-family of all-purpose access methods. In: ICDE, pp 623–630
20.
go back to reference Wu D, Ke Y, Yu JX, Philip SY, Chen L (2010) Detecting leaders from correlated time series. In: DASFAA, pp 352–367 Wu D, Ke Y, Yu JX, Philip SY, Chen L (2010) Detecting leaders from correlated time series. In: DASFAA, pp 352–367
21.
go back to reference Wu D, Ke Y, Yu JX, Philip SY, Chen L (2011) Leadership discovery when data correlatively evolve. World Wide Web 14:1–25CrossRef Wu D, Ke Y, Yu JX, Philip SY, Chen L (2011) Leadership discovery when data correlatively evolve. World Wide Web 14:1–25CrossRef
22.
go back to reference Xu E, Hsu W, Lee ML, Patel D (2015) k-Consistent influencers in network data. In: DASFAA, pp 452–468 Xu E, Hsu W, Lee ML, Patel D (2015) k-Consistent influencers in network data. In: DASFAA, pp 452–468
23.
go back to reference Zhou X, Hong H, Xing X, Huang W, Bian K, Xie K (2015) Mining dependencies considering time lag in spatio-temporal traffic data. Web-age information management. Springer, Berlin Zhou X, Hong H, Xing X, Huang W, Bian K, Xie K (2015) Mining dependencies considering time lag in spatio-temporal traffic data. Web-age information management. Springer, Berlin
Metadata
Title
RLC: ranking lag correlations with flexible sliding windows in data streams
Authors
Shanshan Wu
Huaizhong Lin
Wenxiang Wang
Dongming Lu
Leong Hou U
Yunjun Gao
Publication date
01-09-2016
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 2/2017
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-016-0577-4

Other articles of this Issue 2/2017

Pattern Analysis and Applications 2/2017 Go to the issue

Premium Partner