Published in: Journal of Geographical Systems 3/2017

15.05.2017 | Original Article

Comparison of approaches of geographic partitioning for data anonymization

Authors: William Lee Croft, Wei Shi, Jörg-Rüdiger Sack, Jean-Pierre Corriveau

Abstract

Given the large volumes of detailed data now being collected, there is high demand to release this data for research purposes. Organizations face two conflicting goals: (a) releasing the data and (b) protecting the privacy of the individuals to whom the data pertains. The conflict is especially sharp for geographic information: precise locations are essential to many healthcare research fields, such as spatial epidemiology, yet the same information must be censored or generalized to protect privacy. The challenge is therefore to anonymize data so that it complies with government privacy policies while losing as little geographic information as possible. In this paper, we present novel component approaches used to configure the Voronoi-Based Aggregation System (VBAS), together with an in-depth comparison of their effectiveness. VBAS protects privacy by enforcing k-anonymity through the aggregation of fine-grained regions into larger regions. We also discuss heuristics rooted in linear programming that we have integrated into the system. Based on extensive comparisons, we highlight the strengths and weaknesses of the approaches we tested, which enables us to recommend specific combinations of approaches for satisfying user requirements.
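
To make the idea of enforcing k-anonymity by aggregating regions concrete, the following is a minimal sketch and not the VBAS algorithm itself: it uses a simple greedy merge rather than Voronoi diagrams or linear programming. The function name `aggregate_to_k`, the region labels, the population counts and the adjacency lists are all hypothetical and chosen only for illustration.

```python
def aggregate_to_k(populations, adjacency, k):
    """Greedily merge adjacent regions until every group holds at least k people.

    Illustrative only: a plain greedy merge, not the VBAS method.
    populations: dict mapping region label -> number of individuals
    adjacency:   dict mapping region label -> iterable of neighbouring labels
    """
    # Each group starts as a single fine-grained region, keyed by its label.
    groups = {r: {r} for r in populations}
    pop = dict(populations)

    # Build a symmetric adjacency map so merges can be rewired safely.
    adj = {r: set() for r in populations}
    for a, neighbours in adjacency.items():
        for b in neighbours:
            adj[a].add(b)
            adj[b].add(a)

    while True:
        small = [g for g in groups if pop[g] < k]
        if not small:
            break
        # Treat the least-populated deficient group first (label breaks ties).
        g = min(small, key=lambda x: (pop[x], x))
        if not adj[g]:
            raise ValueError(f"group {g!r} has no neighbour and cannot reach k={k}")
        # Merge into the least-populated neighbour to keep merged regions small.
        n = min(adj[g], key=lambda x: (pop[x], x))
        groups[n] |= groups.pop(g)
        pop[n] += pop.pop(g)
        # Former neighbours of g now border the merged group n.
        for m in adj.pop(g):
            adj[m].discard(g)
            if m != n:
                adj[m].add(n)
                adj[n].add(m)

    return sorted(sorted(members) for members in groups.values())


# Hypothetical example: four regions in a row, A - B - C - D.
example_populations = {"A": 3, "B": 4, "C": 10, "D": 2}
example_adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(aggregate_to_k(example_populations, example_adjacency, k=5))
```

With k = 5, the example merges D into C and A into B, so every released region contains at least five individuals. Merging into the least-populated neighbour is one simple way to limit the loss of geographic detail; other selection rules trade off region size and shape differently.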

Metadata
Title
Comparison of approaches of geographic partitioning for data anonymization
Authors
William Lee Croft
Wei Shi
Jörg-Rüdiger Sack
Jean-Pierre Corriveau
Publication date
15.05.2017
Publisher
Springer Berlin Heidelberg
Published in
Journal of Geographical Systems / Issue 3/2017
Print ISSN: 1435-5930
Electronic ISSN: 1435-5949
DOI
https://doi.org/10.1007/s10109-017-0251-4