Published in: Data Mining and Knowledge Discovery, Issue 5-6/2014

1 September 2014

Detecting localized homogeneous anomalies over spatio-temporal data

Authors: Aditya Telang, P. Deepak, Salil Joshi, Prasad Deshpande, Ranjana Rajendran


Abstract

The last decade has witnessed unprecedented growth in the availability of data with spatio-temporal characteristics. Given the scale and richness of such data, finding spatio-temporal patterns that behave significantly differently from their neighbors could be of interest in application scenarios such as weather modeling, analyzing the spread of disease outbreaks, and monitoring traffic congestion. In this paper, we propose an automated approach for exploring and discovering such anomalous patterns, irrespective of the underlying domain from which the data is drawn. Our approach differs significantly from traditional methods of spatial outlier detection and employs two phases: (i) discovering homogeneous regions, and (ii) evaluating these regions as anomalies based on their statistical difference from a generalized neighborhood. We evaluate the quality of our approach and distinguish it from existing techniques via an extensive experimental evaluation.
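The two phases named in the abstract can be illustrated with a small sketch on a spatial grid. This is not the authors' algorithm: it assumes a simple seeded region growing with a fixed tolerance for phase (i), and a z-score-like gap between a region's mean and the mean of the surrounding cells for phase (ii). The function names (`grow_regions`, `neighborhood`, `score_regions`), the parameters (`tol`, `radius`, `eps`), and the toy grid are all hypothetical and chosen only for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def grow_regions(grid, tol):
    """Phase (i), assumed variant: group 4-connected cells within `tol` of a seed cell."""
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # cell already belongs to a region
            seed = grid[r][c]
            label = len(regions)
            labels[r][c] = label
            cells, queue = [], deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and abs(grid[ny][nx] - seed) <= tol):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            regions.append(cells)
    return regions


def neighborhood(grid, cells, radius):
    """Cells within `radius` (Chebyshev distance) of the region, excluding the region."""
    rows, cols = len(grid), len(grid[0])
    inside = set(cells)
    ring = set()
    for y, x in cells:
        for ny in range(max(0, y - radius), min(rows, y + radius + 1)):
            for nx in range(max(0, x - radius), min(cols, x + radius + 1)):
                if (ny, nx) not in inside:
                    ring.add((ny, nx))
    return ring


def score_regions(grid, regions, radius=1, eps=1e-9):
    """Phase (ii), assumed variant: standardized gap between region and neighborhood means."""
    scored = []
    for cells in regions:
        ring = neighborhood(grid, cells, radius)
        if not ring:
            continue  # region covers the whole grid; nothing to compare against
        inner = mean(grid[y][x] for y, x in cells)
        outer_vals = [grid[y][x] for y, x in ring]
        score = abs(inner - mean(outer_vals)) / (pstdev(outer_vals) + eps)
        scored.append((score, cells))
    # Highest-scoring (most divergent) regions first.
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Toy grid (made-up values): a 2x2 block near 5 inside a background near 1.
grid = [
    [1.0, 1.1, 0.9, 1.0, 1.0],
    [1.0, 5.0, 5.1, 1.1, 0.9],
    [0.9, 5.2, 5.0, 1.0, 1.0],
    [1.0, 1.0, 1.1, 0.9, 1.0],
]
regions = grow_regions(grid, tol=0.3)
ranked = score_regions(grid, regions)
top_score, top_cells = ranked[0]
```

On this toy grid the sketch finds two regions and ranks the 2x2 block first. The background also scores highly, because the measure is symmetric between a region and its surroundings; a practical method would need a more careful definition of the neighborhood and of statistical significance, which the abstract refers to only as a "generalized neighborhood".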

Footnotes
2
In this paper, we extensively use color-based figures to illustrate the concepts of anomalies. Hence, we request the reader to refer to the electronic version or a colored printout of the paper for better readability.
 
6
For the sake of clarity, we illustrate a spatial grid; however, the formulation can be extended to the temporal dimension.
 
11
We do not include outlier detection techniques in our comparative analysis, since it is not clear how techniques that estimate divergent behavior for each individual data object can be fairly compared with techniques that discover groups of objects exhibiting divergent behavior.
 
12
It must be noted that conducting user surveys is a difficult task. Hence, we conducted the user survey on \(Dataset_1\) only and not on \(Dataset_2\).
 
Metadata
Title
Detecting localized homogeneous anomalies over spatio-temporal data
Authors
Aditya Telang
P. Deepak
Salil Joshi
Prasad Deshpande
Ranjana Rajendran
Publication date
1 September 2014
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 5-6/2014
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-014-0366-x
