ABSTRACT
Existing co-location mining algorithms require a user-provided distance threshold at which prevalent patterns are searched. Because real spatial interactions may occur at different distances, finding a distance threshold that captures all true patterns is difficult, and a single appropriate threshold may not exist. A standard co-location mining algorithm also requires a prevalence measure threshold to identify prevalent patterns. The prevalence measure values of true co-location patterns occurring at different distances may vary, so finding a prevalence threshold that captures all true patterns without also reporting random patterns is difficult and sometimes impossible. In this paper, we propose an algorithm to mine true co-location patterns at multiple distances. Our approach is based on a statistical test and requires no thresholds for either the prevalence measure or the interaction distance. We evaluate the efficacy of our algorithm on synthetic and real data sets, comparing it with the state-of-the-art co-location mining approach.
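As a rough illustration of why a single distance or prevalence threshold is hard to choose, the sketch below computes the participation index for one feature pair and estimates a Monte Carlo p-value for it. The participation index is a prevalence measure commonly used in co-location mining; it is assumed here, since the abstract does not name the measure. The function names, the uniform-randomization null model, and the rectangular study window are illustrative assumptions, not the algorithm proposed in the paper.

```python
import math
import random


def participation_index(points_a, points_b, distance):
    """Participation index of the feature pair {A, B} at a given distance.

    The participation ratio of a feature is the fraction of its instances
    that have at least one instance of the other feature within `distance`.
    The index is the minimum of the two ratios.
    """
    if not points_a or not points_b:
        return 0.0

    def ratio(src, dst):
        hits = sum(
            1 for p in src
            if any(math.dist(p, q) <= distance for q in dst)
        )
        return hits / len(src)

    return min(ratio(points_a, points_b), ratio(points_b, points_a))


def monte_carlo_p_value(points_a, points_b, distance, width, height,
                        simulations=199, seed=0):
    """Estimate P(PI_null >= PI_observed) under a hypothetical null model.

    Assumed null model (for illustration only): instances of B are placed
    uniformly at random in a width x height window while A stays fixed.
    """
    rng = random.Random(seed)
    observed = participation_index(points_a, points_b, distance)
    at_least = 0
    for _ in range(simulations):
        simulated_b = [(rng.uniform(0, width), rng.uniform(0, height))
                       for _ in points_b]
        if participation_index(points_a, simulated_b, distance) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (simulations + 1)


if __name__ == "__main__":
    # Toy data: the same pair looks weak or strong depending on the distance.
    feature_a = [(0, 0), (10, 0), (20, 0)]
    feature_b = [(0, 1), (10, 3), (50, 50)]
    for d in (0.5, 1, 3):
        pi = participation_index(feature_a, feature_b, d)
        p = monte_carlo_p_value(feature_a, feature_b, d, width=60, height=60)
        print(f"distance={d}: PI={pi:.3f}, p-value={p:.3f}")
```

In this toy setting the index changes with the distance, so a fixed prevalence threshold accepts or rejects the same pair depending on the distance chosen. A significance test compares the observed value against what a null model produces at that distance, which is the general idea behind replacing thresholds with a statistical test.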
Index Terms
- Mining statistically sound co-location patterns at multiple distances