research-article

Mining statistically sound co-location patterns at multiple distances

Authors:
Sajib Barua, University of Alberta, Edmonton, Canada
Jörg Sander, University of Alberta, Edmonton, Canada

SSDBM '14: Proceedings of the 26th International Conference on Scientific and Statistical Database Management, June 2014, Article No.: 7, Pages 1–12, https://doi.org/10.1145/2618243.2618261

Published: 30 June 2014

ABSTRACT

Existing co-location mining algorithms require a user-provided distance threshold at which prevalent patterns are searched. Since spatial interactions may, in reality, happen at different distances, finding the right distance threshold to mine all true patterns is not easy, and a single appropriate threshold may not even exist. A standard co-location mining algorithm also requires a prevalence measure threshold to find prevalent patterns. The prevalence measure values of true co-location patterns occurring at different distances may vary, so finding a prevalence measure threshold that mines all true patterns without reporting random patterns is not easy and sometimes not even possible. In this paper, we propose an algorithm to mine true co-location patterns at multiple distances. Our approach is based on a statistical test and does not require thresholds for the prevalence measure or the interaction distance. We evaluate the efficacy of our algorithm using synthetic and real data sets, comparing it with the state-of-the-art co-location mining approach.
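
To make the idea of a threshold-free statistical test concrete, the following is a minimal sketch of a generic Monte Carlo significance test evaluated at several distances. It is not the algorithm proposed in the paper. It assumes two features observed in the unit square, uses the participation index as the prevalence measure, and takes as null model that instances of the second feature are placed uniformly at random while the first feature stays fixed. The function names `participation_index` and `monte_carlo_p_values` and all parameters are hypothetical.

```python
import math
import random


def participation_index(a_pts, b_pts, d):
    """Minimum, over both features, of the fraction of instances that
    have at least one instance of the other feature within distance d."""
    def frac(src, dst):
        hits = sum(1 for p in src if any(math.dist(p, q) <= d for q in dst))
        return hits / len(src)
    return min(frac(a_pts, b_pts), frac(b_pts, a_pts))


def monte_carlo_p_values(a_pts, b_pts, distances, n_sim=99, seed=0):
    """Estimate one p-value per distance for the pair (A, B).

    Assumed null model (an illustration only): B is redistributed
    uniformly in the unit square, A is kept fixed.
    """
    rng = random.Random(seed)
    observed = {d: participation_index(a_pts, b_pts, d) for d in distances}
    at_least = {d: 0 for d in distances}
    for _ in range(n_sim):
        sim_b = [(rng.random(), rng.random()) for _ in b_pts]
        for d in distances:
            # Count simulations whose statistic is at least as extreme.
            if participation_index(a_pts, sim_b, d) >= observed[d]:
                at_least[d] += 1
    # Standard Monte Carlo p-value with the observed data counted once.
    return {d: (1 + at_least[d]) / (1 + n_sim) for d in distances}


if __name__ == "__main__":
    rng = random.Random(1)
    a = [(rng.random(), rng.random()) for _ in range(30)]
    # Each B instance is placed very close to one A instance.
    b = [(x + rng.uniform(-0.01, 0.01), y + rng.uniform(-0.01, 0.01))
         for x, y in a]
    print(monte_carlo_p_values(a, b, [0.02, 0.05, 0.1]))
```

In this sketch a pattern is judged by how unusual its prevalence is under the null model at each distance, so no fixed prevalence threshold is supplied. Testing many distances and patterns raises a multiple-comparison issue that the sketch does not address.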


Index Terms

Mining statistically sound co-location patterns at multiple distances
1. Applied computing
  1. Physical sciences and engineering
    1. Mathematics and statistics
2. Information systems
  1. Data management systems
    1. Database management system engines
  2. Information systems applications
    1. Data mining
    2. Decision support systems
      1. Data analytics


Published in
SSDBM '14: Proceedings of the 26th International Conference on Scientific and Statistical Database Management
June 2014
417 pages
ISBN: 9781450327220
DOI: 10.1145/2618243
Editors: Christian S. Jensen, Hua Lu, Torben Bach Pedersen, Christian Thomsen, Kristian Torp (all Aalborg University)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
co-location pattern
spatial data mining
statistical test
Acceptance Rates
SSDBM '14 Paper Acceptance Rate: 26 of 71 submissions, 37%
Overall Acceptance Rate: 56 of 146 submissions, 38%