Published in: Knowledge and Information Systems 1/2017

12 April 2016 | Regular Paper

Correlation analysis techniques for uncertain time series

Authors: Mahsa Orang, Nematollaah Shiri



Abstract

Many applications, such as location-based services and wireless sensor networks, generate and deal with uncertain time series (UTS), where the “exact” value at each timestamp is unknown. Traditional correlation analysis and search techniques developed for standard time series are inadequate for the UTS data analysis required in such applications. Motivated by this need, we propose suitable concepts and techniques for UTS correlation analysis. We formalize the notions of normalization and correlation for UTS in two general settings, based on the information available at each timestamp: (1) PDF-based UTS (each timestamp has a probability density function) and (2) multiset-based UTS (each timestamp has a multiset of observed values). For each setting, we formulate correlation as a random variable and develop techniques to determine its underlying probability density function. For setting (2), we also present probabilistic pruning and sampling techniques to speed up the search process. We conducted numerous experiments to evaluate the performance of the proposed techniques under different configurations using the UCR benchmark datasets. Our results indicate the effectiveness of the proposed techniques. For setting (2), in particular, our results show significant improvement in space utilization and computation time. We believe the proposed ideas and solutions lend themselves to powerful tools for UTS analysis and search tasks.
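
To make the idea of "correlation as a random variable" concrete for the multiset-based setting, the following is a minimal sketch and not the authors' algorithm. It assumes that each timestamp holds a list of observed values, that one value per timestamp is drawn uniformly and independently, and that the Pearson coefficient is the correlation measure. The function names (`pearson`, `sample_correlations`, `prob_correlation_at_least`) and the zero-variance convention are hypothetical choices made for illustration only.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of floats."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption for this sketch: treat a constant series as uncorrelated.
        return 0.0
    return sxy / sqrt(sxx * syy)


def sample_correlations(uts_x, uts_y, num_samples=1000, seed=0):
    """Draw one possible exact series from each UTS and correlate them.

    uts_x and uts_y are lists of multisets (here: lists of observations),
    one multiset per timestamp. The result is an empirical sample of the
    correlation random variable.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        x = [rng.choice(observations) for observations in uts_x]
        y = [rng.choice(observations) for observations in uts_y]
        samples.append(pearson(x, y))
    return samples


def prob_correlation_at_least(samples, threshold):
    """Empirical estimate of P(correlation >= threshold)."""
    return sum(1 for c in samples if c >= threshold) / len(samples)


# Two toy multiset-based UTS with four timestamps each (made-up values).
uts_x = [[1.0, 1.2], [2.0, 2.1], [3.0, 2.9], [4.1, 4.0]]
uts_y = [[2.0, 2.2], [4.1, 3.9], [6.0, 6.2], [8.0, 7.9]]

samples = sample_correlations(uts_x, uts_y, num_samples=500, seed=42)
estimate = prob_correlation_at_least(samples, 0.5)
```

A threshold query such as "is the correlation at least 0.5 with high probability?" can then be answered from `estimate`. The number of samples trades accuracy for running time, and this sketch makes no claim about how the paper chooses that number or how it prunes candidates.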

Metadata
Title
Correlation analysis techniques for uncertain time series
Authors
Mahsa Orang
Nematollaah Shiri
Publication date
12 April 2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 1/2017
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-016-0939-7
