Published in: Journal of Geographical Systems 3/2017

15-05-2017 | Original Article

Comparison of approaches of geographic partitioning for data anonymization

Authors: William Lee Croft, Wei Shi, Jörg-Rüdiger Sack, Jean-Pierre Corriveau

Abstract

Given the large volumes of detailed data now being collected, there is high demand for the release of this data for research purposes. Organizations therefore face two conflicting goals: (a) releasing this data and (b) protecting the privacy of the individuals to whom the data pertains. In particular, the need to release precise geographic information (which is essential to many healthcare research fields such as spatial epidemiology) conflicts with the requirement to censor or generalize that same information for the sake of privacy protection. Ultimately, the challenge is to anonymize data in order to comply with government privacy policies while minimizing the loss of geographic information. In this paper, we present novel component approaches used to configure the Voronoi-Based Aggregation System (VBAS), together with an in-depth comparison of their effectiveness. VBAS protects privacy by enforcing k-anonymity through the aggregation of fine-grained regions into larger regions. We also discuss heuristics rooted in linear programming that we have integrated into the system. Based on extensive comparisons, we highlight the strengths and weaknesses of the approaches we tested, which enables us to recommend specific combinations of approaches for satisfying user requirements.
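
To make the idea of enforcing k-anonymity by aggregating fine-grained regions more concrete, the following is a minimal sketch in Python. It is not the VBAS algorithm: it uses no Voronoi diagrams and no linear programming, and simply merges the smallest undersized region into its least populous neighbour until every group holds at least k individuals. The function name `aggregate_regions`, the region labels, and the population figures are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def aggregate_regions(population: Dict[str, int],
                      adjacency: Dict[str, Set[str]],
                      k: int) -> List[Set[str]]:
    """Greedily merge adjacent regions until each group has population >= k.

    Illustrative only: a simple greedy heuristic, not the VBAS method.
    """
    members = {r: {r} for r in population}   # group id -> regions in the group
    pop = dict(population)                   # group id -> total population
    # Build a symmetric neighbour map so that merges can be tracked both ways.
    neighbours: Dict[str, Set[str]] = {r: set() for r in population}
    for r, adjacent in adjacency.items():
        for s in adjacent:
            neighbours[r].add(s)
            neighbours[s].add(r)

    while True:
        undersized = [g for g in members if pop[g] < k]
        if not undersized:
            break
        # Take the smallest undersized group (ties broken by id for determinism).
        g = min(undersized, key=lambda x: (pop[x], x))
        if not neighbours[g]:
            raise ValueError(f"group {g!r} cannot reach population {k}")
        # Merge it into its least populous neighbouring group.
        h = min(neighbours[g], key=lambda x: (pop[x], x))
        members[h] |= members.pop(g)
        pop[h] += pop.pop(g)
        old = neighbours.pop(g)
        neighbours[h] |= old
        neighbours[h] -= {g, h}
        for n in old:
            if n != h:
                neighbours[n].discard(g)
                neighbours[n].add(h)

    return sorted(members.values(), key=sorted)


# Hypothetical example: four regions in a chain A - B - C - D.
example_population = {"A": 3, "B": 4, "C": 10, "D": 2}
example_adjacency = {"A": {"B"}, "B": {"C"}, "C": {"D"}}
groups = aggregate_regions(example_population, example_adjacency, k=5)
print([sorted(g) for g in groups])  # [['A', 'B'], ['C', 'D']]
```

With k = 5, region D (population 2) is merged into C and region A (population 3) into B, so both released groups meet the threshold. The geographic cost is that locations are now reported at the coarser group level, which is the trade-off the abstract describes.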

Metadata
Title
Comparison of approaches of geographic partitioning for data anonymization
Authors
William Lee Croft
Wei Shi
Jörg-Rüdiger Sack
Jean-Pierre Corriveau
Publication date
15-05-2017
Publisher
Springer Berlin Heidelberg
Published in
Journal of Geographical Systems / Issue 3/2017
Print ISSN: 1435-5930
Electronic ISSN: 1435-5949
DOI
https://doi.org/10.1007/s10109-017-0251-4
