Published in: GeoInformatica 4/2016

01-10-2016

On discovering co-location patterns in datasets: a case study of pollutants and child cancers

Authors: Jundong Li, Aibek Adilmagambetov, Mohomed Shazan Mohomed Jabbar, Osmar R. Zaïane, Alvaro Osornio-Vargas, Osnat Wine


Abstract

We aim to identify relationships between cancer cases and pollutant emissions by proposing a novel co-location mining algorithm. Specifically, we attempt to understand whether there is a relationship between the location of a child diagnosed with cancer and any combination of chemicals emitted from facilities in that location. Co-location pattern mining detects sets of spatial features that are frequently located in close proximity to each other. Most previous work in this domain is based on transaction-free Apriori-like algorithms, which depend on user-defined thresholds and are designed for boolean data points. Because there is no clear notion of a transaction in spatial data, it is nontrivial to apply association rule mining techniques to the co-location mining problem. Our proposed approach is based on a grid-based "transactionization" of the geographic space and is designed to mine datasets with extended spatial objects. It can also incorporate uncertainty about the existence of features in order to model real-world scenarios more accurately. We eliminate the need for a global threshold by introducing a statistical test that validates the significance of candidate co-location patterns and rules. Experiments on both synthetic and real datasets show that our algorithm can detect a considerable number of statistically significant co-location patterns. In addition, we explain the data modelling framework used on real datasets of pollutants (PRTR/NPRI) and childhood cancer cases.
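The grid-based transactionization mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the algorithm's details, so the circular buffers, the grid spacing, the per-object existence probability, and the names SpatialObject, transactionize and expected_support are all assumptions made for illustration. Each grid point becomes one transaction holding the features whose buffer covers it, and the support of a pattern is computed as an expected value over those transactions, as is common in frequent pattern mining over uncertain data. The statistical significance test described in the abstract is not covered here.

```python
from dataclasses import dataclass
from itertools import product
from math import hypot


@dataclass(frozen=True)
class SpatialObject:
    feature: str        # e.g. a pollutant name or "cancer" (illustrative labels)
    x: float
    y: float
    radius: float       # assumed circular buffer around the object
    prob: float = 1.0   # assumed existence probability in [0, 1]


def transactionize(objects, width, height, step):
    """Turn a width x height area into transactions, one per grid point.

    Each transaction maps feature -> probability for every object whose
    buffer covers the grid point. Empty grid points are skipped.
    """
    transactions = []
    nx = int(width / step) + 1
    ny = int(height / step) + 1
    for i, j in product(range(nx), range(ny)):
        gx, gy = i * step, j * step
        items = {}
        for o in objects:
            if hypot(o.x - gx, o.y - gy) <= o.radius:
                # If several objects share a feature, keep the highest probability.
                items[o.feature] = max(items.get(o.feature, 0.0), o.prob)
        if items:
            transactions.append(items)
    return transactions


def expected_support(pattern, transactions):
    """Sum, over all transactions, of the product of the pattern's feature probabilities."""
    total = 0.0
    for t in transactions:
        p = 1.0
        for f in pattern:
            p *= t.get(f, 0.0)  # a missing feature contributes probability 0
        total += p
    return total


# Hypothetical toy data: one uncertain emission source and one cancer case.
objs = [
    SpatialObject("benzene", 1.0, 1.0, 1.0, prob=0.5),
    SpatialObject("cancer", 2.0, 1.0, 1.0, prob=1.0),
]
txs = transactionize(objs, width=3, height=2, step=1.0)
support = expected_support({"benzene", "cancer"}, txs)
```

In this toy setup two grid points fall inside both buffers, so the expected support of the pair is 2 × 0.5 = 1.0. A finer grid step would approximate the overlapping area more closely at the cost of more transactions.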


Metadata
Title
On discovering co-location patterns in datasets: a case study of pollutants and child cancers
Authors
Jundong Li
Aibek Adilmagambetov
Mohomed Shazan Mohomed Jabbar
Osmar R. Zaïane
Alvaro Osornio-Vargas
Osnat Wine
Publication date
01-10-2016
Publisher
Springer US
Published in
GeoInformatica / Issue 4/2016
Print ISSN: 1384-6175
Electronic ISSN: 1573-7624
DOI
https://doi.org/10.1007/s10707-016-0254-1
