2014 | OriginalPaper | Book chapter

2. Geodata Stream Summarization

Written by: Annalisa Appice, Anna Ciampi, Fabio Fumarola, Donato Malerba

Published in: Data Mining Techniques in Sensor Networks

Publisher: Springer London

Abstract

Managing the massive amounts of geodata collected by sensor networks raises several challenges, including applying summarization techniques in real time so that an unbounded volume of georeferenced, timestamped data can be stored on a server with limited memory and remain available for future queries. SUMATRA is a summarization technique that accounts for the spatial and temporal information of sensor data to achieve an appropriate trade-off between the size and the accuracy of the geodata summary. It uses the count-based model to process the stream: it segments the stream into windows, computes summaries window by window, and stores these summaries in a database. Trend clusters are discovered as the summary of each window. They are clusters of georeferenced data that vary according to a similar trend along the time horizon of the window. Signal compression techniques are also considered to derive a compact representation of these trends for storage in the database. An empirical analysis of trend clusters assesses the summarization capability, accuracy, and efficiency of the trend cluster-based summarization schema in real applications. Finally, a stream cube, called the geo-trend stream cube, is defined. It uses trends to aggregate a numeric measure that is streamed by a sensor network and organized around space and time dimensions. Space-time roll-up and drill-down operators allow trends to be explored from coarse-grained and fine-grained hierarchical views.
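
The window-by-window processing described in the abstract can be illustrated with a minimal sketch. It is not SUMATRA's actual algorithm: the greedy grouping rule, the `delta` and `radius` thresholds, the use of the per-snapshot mean as the trend, and all function and variable names are assumptions made for illustration only.

```python
from math import dist
from statistics import mean

def count_based_windows(snapshots, w):
    """Yield consecutive, non-overlapping windows of w snapshots each."""
    window = []
    for snap in snapshots:
        window.append(snap)
        if len(window) == w:
            yield window
            window = []
    # An incomplete trailing window is left unprocessed in this sketch.

def trend_clusters(window, positions, delta, radius):
    """Greedily group sensors that are close in space and similar over time.

    window:    list of dicts {sensor_id: value}, one dict per snapshot
    positions: dict {sensor_id: (x, y)}
    delta:     hypothetical bound on the per-snapshot value difference
    radius:    hypothetical bound on the distance to some cluster member
    """
    series = {s: [snap[s] for snap in window] for s in positions}
    clusters = []  # each cluster is a list of sensor ids; the first is the seed
    for s in sorted(positions):
        for members in clusters:
            seed = members[0]
            near = any(dist(positions[s], positions[m]) <= radius for m in members)
            similar = all(abs(a - b) <= delta
                          for a, b in zip(series[s], series[seed]))
            if near and similar:
                members.append(s)
                break
        else:
            clusters.append([s])
    # Summarize each cluster by its members and its per-snapshot mean (the trend).
    return [
        (members, [mean(series[m][t] for m in members) for t in range(len(window))])
        for members in clusters
    ]

# Hypothetical sensors and readings, used only to exercise the sketch.
positions = {"a": (0, 0), "b": (0, 1), "c": (5, 5)}
stream = [
    {"a": 10.0, "b": 10.5, "c": 20.0},
    {"a": 11.0, "b": 11.5, "c": 19.0},
    {"a": 12.0, "b": 12.5, "c": 18.0},
    {"a": 13.0, "b": 13.5, "c": 17.0},
]
summaries = [trend_clusters(win, positions, delta=1.0, radius=1.5)
             for win in count_based_windows(stream, w=2)]
```

In this toy run each window of two snapshots is reduced to two trend clusters: sensors `a` and `b` share one trend, while the distant sensor `c` keeps its own. Storing one trend per cluster instead of one series per sensor is where the size reduction comes from, at the cost of the within-cluster deviation bounded by `delta`.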

Footnotes
1
The investigation of the in-network modality for this anomaly detection service is postponed to future developments of this study.
 
2
\(Z_{h}\) and \(Z_{w-h}\) are complex conjugates [23].
 
3
This identity expresses, in a sense, the law of conservation of energy.
 
8
PVGIS is a map-based inventory of the electricity production of PV plants.
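
Footnotes 2 and 3 can be checked numerically under two assumptions that the page itself does not confirm: that \(Z_{h}\) denotes the \(h\)-th discrete Fourier coefficient of a real-valued trend of window length \(w\), and that the identity in footnote 3 is Parseval's, \(\sum_{t} |z_{t}|^{2} = \frac{1}{w}\sum_{h} |Z_{h}|^{2}\). The sketch below uses a naive transform and a hypothetical trend; the names `dft` and `trend` are illustrative.

```python
import cmath

def dft(z):
    """Naive O(w^2) discrete Fourier transform of a sequence z."""
    w = len(z)
    return [
        sum(z[t] * cmath.exp(-2j * cmath.pi * h * t / w) for t in range(w))
        for h in range(w)
    ]

# Hypothetical real-valued trend over a window of w = 6 snapshots.
trend = [10.25, 11.25, 12.25, 13.25, 12.0, 11.0]
Z = dft(trend)
w = len(trend)

# Conjugate symmetry: Z[w - h] equals the conjugate of Z[h] for h = 1 .. w-1.
symmetric = all(abs(Z[w - h] - Z[h].conjugate()) < 1e-9 for h in range(1, w))

# Parseval: energy in time equals energy in frequency divided by w.
energy_time = sum(x * x for x in trend)
energy_freq = sum(abs(c) ** 2 for c in Z) / w
```

For a real-valued input, the symmetry means that roughly half of the coefficients determine the rest, and the energy identity gives a way to measure how much of the signal is retained when only some coefficients are kept.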
 
References
1.
R. Motwani, P. Raghavan, Randomized Algorithms (Cambridge University Press, New York, 1995)
2.
S. Acharya, P.B. Gibbons, V. Poosala, Congressional samples for approximate answering of group-by queries, in Proceedings of the International Conference on Management of Data (SIGMOD 2000) (ACM, 2000), pp. 487–498
3.
Y. Zhu, D. Shasha, StatStream: statistical monitoring of thousands of data streams in real time, in Proceedings of the 28th International Conference on Very Large Data Bases (VLDB 2002) (VLDB Endowment, 2002), pp. 358–369
4.
H.V. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K.C. Sevcik, T. Suel, Optimal histograms with quality guarantees, in Proceedings of the 24th International Conference on Very Large Data Bases (VLDB 1998) (Morgan Kaufmann, 1998), pp. 275–286
5.
A.C. Gilbert, S. Guha, P. Indyk, Y. Kotidis, S. Muthukrishnan, M.J. Strauss, Fast, small-space algorithms for approximate histogram maintenance, in Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC 2002) (ACM, 2002), pp. 389–398
6.
M. Greenwald, S. Khanna, Space-efficient online computation of quantile summaries. ACM SIGMOD Rec. 30(2), 58–66 (2001)
7.
Y.E. Ioannidis, V. Poosala, Balancing histogram optimality and practicality for query result size estimation, in Proceedings of the International Conference on Management of Data (SIGMOD 1995) (ACM, 1995), pp. 233–244
8.
N. Thaper, S. Guha, P. Indyk, N. Koudas, Dynamic multidimensional histograms, in Proceedings of the International Conference on Management of Data (SIGMOD 2002) (ACM, 2002), pp. 428–439
9.
F. Furfaro, G.M. Mazzeo, D. Saccà, C. Sirangelo, Compressed hierarchical binary histograms for summarizing multi-dimensional data. Knowl. Inf. Syst. 15(3), 335–380 (2008)
10.
N. Alon, Y. Matias, M. Szegedy, The space complexity of approximating the frequency moments, in Proceedings of the 28th Annual ACM Symposium on Theory of Computing (STOC 1996) (ACM, 1996), pp. 20–29
11.
F. Rusu, A. Dobra, Sketching sampled data streams, in Proceedings of the 25th International Conference on Data Engineering (ICDE 2009) (IEEE Computer Society, 2009), pp. 381–392
12.
J. Hershberger, N. Shrivastava, S. Suri, C.D. Toth, Adaptive spatial partitioning for multidimensional data streams. Algorithmica 46(1), 97–117 (2006)
13.
Y. Matias, J.S. Vitter, M. Wang, Dynamic maintenance of wavelet-based histograms, in Proceedings of the 26th International Conference on Very Large Data Bases (VLDB 2000) (Morgan Kaufmann, 2000), pp. 101–110
14.
J. Lin, E.J. Keogh, L. Wei, S. Lonardi, Experiencing SAX: a novel symbolic representation of time series. Data Min. Knowl. Disc. 15(2), 107–144 (2007)
15.
C.C. Aggarwal, J. Han, J. Wang, P.S. Yu, On clustering massive data streams: a summarization paradigm, in Advances in Database Systems: Data Streams Models and Algorithms, vol. 31, ed. by C.C. Aggarwal (Springer, New York, 2007), pp. 9–38
16.
S. Nassar, J. Sander, Effective summarization of multi-dimensional data streams for historical stream mining, in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM 2007) (IEEE Computer Society, 2007), p. 30
17.
X. Ma, S. Li, Q. Luo, D. Yang, S. Tang, Distributed, hierarchical clustering and summarization in sensor networks, in Proceedings of the Joint 9th Asia-Pacific Web and 8th International Conference on Web-age Information Management and Advances in Data and Web Management (APWeb/WAIM 2007) (Springer, 2007), pp. 168–175
18.
P.P. Rodrigues, J. Gama, L.M.B. Lopes, Clustering distributed sensor data streams, in Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, vol. 5212 of LNCS (Springer, 2008), pp. 282–297
19.
M. Kontaki, A.N. Papadopoulos, Y. Manolopoulos, Continuous trend-based clustering in data streams, in Proceedings of the 10th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2008), vol. 5182 of LNCS (Springer, 2008), pp. 251–262
20.
A. Ciampi, A. Appice, D. Malerba, Summarization for geographically distributed data streams, in Proceedings of the 14th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES 2010), vol. 6278 of LNCS (Springer, 2010), pp. 339–348
21.
D. Malerba, A. Appice, A. Varlaro, A. Lanza, Spatial clustering of structured objects, in Proceedings of the 15th International Conference on Inductive Logic Programming (ILP 2005), vol. 3625 of LNCS (Springer, 2005), pp. 227–245
22.
A. Ciampi, A. Appice, D. Malerba, P. Guccione, Trend cluster based compression of geographically distributed data streams, in Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2011), part of the IEEE Symposium Series on Computational Intelligence 2011 (IEEE, 2011), pp. 168–175
23.
J.G. Proakis, D.G. Manolakis, Digital Signal Processing: Principles, Algorithms, and Applications (Prentice-Hall, Upper Saddle River, 1996)
24.
S. Mallat, A Wavelet Tour of Signal Processing (Academic, New York, 1998)
25.
M. Garofalakis, A. Kumar, Deterministic wavelet thresholding for maximum-error metrics, in Proceedings of the 23rd Symposium on Principles of Database Systems (PODS 2004) (ACM, 2004), pp. 166–176
26.
S. Al Wadi, M.T. Ismail, S.A.A. Karim, A comparison between Haar wavelet transform and fast Fourier transform in analyzing financial time series data. Res. J. Appl. Sci. 5(5), 352–360 (2010)
27.
S. Chaudhuri, U. Dayal, An overview of data warehousing and OLAP technology. ACM SIGMOD Rec. 26(1), 65–74 (1997)
28.
A. Ciampi, A. Appice, D. Malerba, A. Muolo, Space-time roll-up and drill-down into geo-trend stream cubes, in Proceedings of the 19th International Symposium on Foundations of Intelligent Systems (ISMIS 2011), vol. 6804 of LNCS, ed. by M. Kryszkiewicz, H. Rybinski, A. Skowron, Z.W. Ras (Springer, 2011), pp. 365–375
Metadata
Title
Geodata Stream Summarization
Written by
Annalisa Appice
Anna Ciampi
Fabio Fumarola
Donato Malerba
Copyright year
2014
Publisher
Springer London
DOI
https://doi.org/10.1007/978-1-4471-5454-9_2