Published in: Population and Environment 4/2021

24-03-2021 | Original paper

Working toward effective anonymization for surveillance data: innovation at South Africa’s Agincourt Health and Socio-Demographic Surveillance Site

Authors: Lori M. Hunter, Catherine Talbot, Wayne Twine, Joe McGlinchy, Chodziwadziwa W. Kabudula, Daniel Ohene-Kwofie


Abstract

Linking people and places is essential for population-health-environment research. Yet this data integration requires geographic coding, so that information on individuals or households can be connected appropriately with characteristics of their proximate environments. Offering access to such geocoding, however, greatly increases the risk of respondent identification and therefore holds the potential to breach confidentiality. In response, a variety of “geographic masking” techniques have been developed to introduce error into geographic coding and thereby reduce the likelihood of identification. We report findings from analyses of the error introduced by several masking techniques applied to data from the Agincourt Health and Socio-Demographic Surveillance System in rural South Africa. Using a vegetation index, the Normalized Difference Vegetation Index (NDVI), at the household scale, we compare the “true” NDVI values with those calculated after masking. We also examine the tradeoffs between accuracy and protecting respondent privacy. The exploration suggests that, in this study setting and for NDVI, geomasking approaches that use buffers and account for population density produce the most accurate results. However, the exploration also clearly demonstrates the tradeoff between accuracy and privacy, with greater accuracy bringing a higher risk of respondent identification. It is important to note that these analyses illustrate a process that should characterize spatially informed research, but within which particular decisions must be shaped by the research setting and objectives. In the long run, we aim to provide insight into masking’s potential and perils to facilitate population-environment-health research.
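
As a rough illustration of the kind of comparison the abstract describes, the following Python sketch displaces household points with a simple ring-shaped ("donut") random mask and measures how far the NDVI value at the masked location drifts from the value at the true location. It is not the authors' implementation: the function names, the displacement radii, the household coordinates, and the synthetic greenness surface are all assumptions made for illustration, and the sketch samples NDVI at a point rather than averaging over a buffer. Only the NDVI formula itself, (NIR - Red) / (NIR + Red), is standard.

```python
import math
import random


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), bounded between -1 and 1."""
    return (nir - red) / (nir + red)


def donut_mask(x, y, r_min, r_max, rng):
    """Displace a point by a random bearing and a distance in [r_min, r_max].

    Sampling the squared distance uniformly spreads masked points evenly
    over the area of the ring instead of clustering them near r_min.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dist = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return x + dist * math.cos(theta), y + dist * math.sin(theta)


def synthetic_ndvi(x, y):
    """Hypothetical smooth greenness surface (coordinates in metres)."""
    nir = 0.45 + 0.10 * math.sin(x / 900.0)
    red = 0.15 + 0.05 * math.cos(y / 700.0)
    return ndvi(nir, red)


def mean_abs_error(points, r_min, r_max, seed=0):
    """Mean |true NDVI - masked NDVI| over all points for one masking run."""
    rng = random.Random(seed)
    errors = []
    for x, y in points:
        mx, my = donut_mask(x, y, r_min, r_max, rng)
        errors.append(abs(synthetic_ndvi(x, y) - synthetic_ndvi(mx, my)))
    return sum(errors) / len(errors)


# Hypothetical households scattered over a 5 km x 5 km area.
rng = random.Random(42)
households = [(rng.uniform(0, 5000), rng.uniform(0, 5000)) for _ in range(500)]

small = mean_abs_error(households, r_min=50, r_max=250)
large = mean_abs_error(households, r_min=500, r_max=2500)
print(f"mean abs NDVI error, 50-250 m mask:   {small:.4f}")
print(f"mean abs NDVI error, 500-2500 m mask: {large:.4f}")
```

On a smooth surface like this one, the wider mask yields the larger error, which mirrors the accuracy-versus-privacy tradeoff noted in the abstract. How steep that tradeoff is in practice would depend on the real imagery and settlement pattern.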


Footnotes
1
The NDVI Values and Quality Assessment (QA) files were obtained from the Land Satellites Data System (LSDS) Science Research and Development (LSRD) repository provided by the US Geological Survey (USGS) Earth Resources Observation and Science (EROS) center (LSRD, 2018).
 
2
Calculations were also made with 1 km buffers with no substantial differences in overarching conclusions.
 
3
We also examined the impact of removing boundary constraints, which increases the displacement distance for households by an average of 14%. However, the displacement distances varied substantially across villages. For example, villages with lower household density did not see large gains in displacement as compared to when masked with methods that account for household density. Also, household displacement distances were, on average, lower when using k-anonymity methods. In all, this suggests that household density is more limiting than the village boundary in constraining displacement distances.
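
To make the link between household density and displacement distance concrete, here is a minimal Python sketch. It assumes one simple density-adaptive rule (displace each household by at most the distance to its k-th nearest neighbour) and one common way of scoring spatial k-anonymity (count the households at least as close to the masked point as the true household is). Both rules, the function names, the value k = 10, and the two synthetic villages are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def spatial_k(true_pt, masked_pt, all_households):
    """Count households at least as close to the masked point as the true one.

    A larger count means the true household hides among more candidates.
    The true household itself is always counted, so the minimum is 1.
    """
    radius = math.dist(true_pt, masked_pt)
    return sum(1 for h in all_households if math.dist(h, masked_pt) <= radius)


def kth_neighbor_distance(pt, all_households, k):
    """Distance from pt to its k-th nearest other household."""
    dists = sorted(math.dist(pt, h) for h in all_households if h != pt)
    return dists[k - 1]


def adaptive_mask(pt, all_households, k, rng):
    """Displace pt by a random bearing, at most its k-th neighbour distance."""
    r_max = kth_neighbor_distance(pt, all_households, k)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dist = r_max * math.sqrt(rng.random())  # uniform over the disc area
    return pt[0] + dist * math.cos(theta), pt[1] + dist * math.sin(theta)


def make_village(n, side, seed):
    """Hypothetical village: n households uniform in a side x side square (m)."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


dense = make_village(200, 1000.0, seed=1)   # 200 households per km^2
sparse = make_village(200, 4000.0, seed=2)  # 12.5 households per km^2

rng = random.Random(0)
for name, village in (("dense", dense), ("sparse", sparse)):
    shifts, ks = [], []
    for pt in village:
        masked = adaptive_mask(pt, village, k=10, rng=rng)
        shifts.append(math.dist(pt, masked))
        ks.append(spatial_k(pt, masked, village))
    mean_shift = sum(shifts) / len(shifts)
    median_k = sorted(ks)[len(ks) // 2]
    print(f"{name}: mean displacement {mean_shift:.0f} m, median spatial k {median_k}")
```

Under this rule the same k translates into much larger displacements in the sparse village than in the dense one, so density rather than any outer boundary governs how far points move. A real application would also need to handle village boundaries and households that share coordinates, which this sketch ignores.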
 
Literature
Abowd, J. M., & Schmutte, I. M. (2019). An economic analysis of privacy protection and statistical accuracy as social choices. American Economic Review, 109(1), 171–202.
Allshouse, W. B., Fitch, M. K., Hampton, K. H., Gesink, D. C., Doherty, I. A., Leone, P. A., Serre, M. L., & Miller, W. C. (2010). Geomasking sensitive health data and privacy protection: An evaluation using an E911 database. Geocarto International, 25(6), 443–452.
Anane-Sarpong, E. (2016). Application of ethical principles to research using public health data in the Global South: Perspectives from Africa. Developing World Bioethics.
Armstrong, M. P., Rushton, G., & Zimmerman, D. L. (1999). Geographically masking health data to preserve confidentiality. Statistics in Medicine, 18(5), 497–525.
Byers, E., Gidden, M., Leclère, D., Balkovic, J., Burek, P., Ebi, K., & Johnson, N. (2018). Global exposure and vulnerability to multi-sector development and climate change hotspots. Environmental Research Letters, 13(5), 055012.
Cassa, C. A., Wieland, S. C., & Mandl, K. D. (2008). Re-identification of home addresses from spatial locations anonymized by Gaussian skew. International Journal of Health Geographics, 7(1), 1–9.
Collinson, M. A. (2010). Striving against adversity: The dynamics of migration, health and poverty in rural South Africa. Global Health Action, 3(1), 5080.
Elkies, N., Fink, G., & Bärnighausen, T. (2015). “Scrambling” geo-referenced data to protect privacy induces bias in distance estimation. Population and Environment, 37(1), 83–98.
Foody, G. M., Cutler, M. E., McMorrow, J., Pelz, D., Tangki, H., Boyd, D. S., & Douglas, I. (2001). Mapping the biomass of Bornean tropical rain forest from remotely sensed data. Global Ecology & Biogeography, 10(4), 379–387. http://www.Jstor.Org/Stable/2665383
Giannecchini, M., Twine, W., & Vogel, C. (2007). Land-cover change and human–environment interactions in a rural cultural landscape in South Africa. Geographical Journal, 173(1), 26–42.
Grace, K., Nagle, N. N., Burgert-Brucker, C. R., Rutzick, S., Van Riper, D. C., Dontamsetti, T., & Croft, T. (2019). Integrating environmental context into DHS analysis while protecting participant confidentiality: A new remote sensing method. Population and Development Review, 45(1), 197.
Hunter, L. M., Twine, W., & Patterson, L. (2007). “Locusts are now our beef”: Adult mortality and household dietary use of local environmental resources in rural South Africa. Scandinavian Journal of Public Health, 35(69_suppl), 165–174.
Leyk, S., Maclaurin, G. J., Hunter, L. M., Nawrotzki, R., Twine, W., Collinson, M., & Erasmus, B. (2012). Spatially and temporally varying associations between temporary outmigration and natural resource availability in resource-dependent rural communities in South Africa: A modeling framework. Applied Geography, 34(2012), 559–568.
Lu, Y., Yorke, C., & Zhan, F. B. (2012). Considering risk locations when defining perturbation zones for geomasking. Cartographica: The International Journal for Geographic Information and Geovisualization, 47(3), 168–178.
LSRD. (2018). Land Satellites Data System (LSDS) Science Research and Development (LSRD) Repository. Sioux Falls, SD: U.S. Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center. https://espa.cr.usgs.gov.
Matsika, R., Erasmus, B. F. N., & Twine, W. C. (2013). Double jeopardy: The dichotomy of fuelwood use in rural South Africa. Energy Policy, 52, 716–725.
Mutanga, O., & Skidmore, A. K. (2004). Narrow band vegetation indices overcome the saturation problem in biomass estimation. International Journal of Remote Sensing, 25(19), 3999–4014.
NASA. (2000). Measuring Vegetation (NDVI & EVI).
Olsson, L., Opondo, M., Tschakert, P., Agrawal, A., & Eriksen, S. E. (2014). Livelihoods and poverty. In: Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part A: Global and Sectoral Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Field, C.B., V.R. Barros, D.J. Dokken, K.J. Mach, M.D. Mastrandrea, T.E. Bilir, M. Chatterjee, K.L. Ebi, Y.O. Estrada, R.C. Genova, B. Girma, E.S. Kissel, A.N. Levy, S. MacCracken, P.R. Mastrandrea, and L.L. White (Eds.), Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 793–832.
Paumgarten, F., & Shackleton, C. M. (2011). The role of non-timber forest products in household coping strategies in South Africa: The influence of household wealth and gender. Population and Environment, 33(1), 108.
Roerink, G. J., Menenti, M., Soepboer, W., & Su, Z. (2003). Assessment of climate impact on vegetation dynamics by using remote sensing. Physics and Chemistry of the Earth, 28(1–3), 103–109.
Ruggles, S., Fitch, C., Magnuson, D., & Schroeder, J. (2019). Differential privacy and census data: Implications for social and economic research. AEA Papers and Proceedings, 109, 403–408.
Sumner, D., Christie, M. E., & Boulakia, S. (2017). Conservation agriculture and gendered livelihoods in Northwestern Cambodia: Decision-making, space and access. Agriculture and Human Values, 34(2), 347–362.
Sweeney, L. (2002). k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5), 557–570.
Tlou, B., Sartorius, B., & Tanser, F. (2017). Space-time patterns in maternal and mother mortality in a rural South African population with high HIV prevalence (2000–2014): Results from a population-based cohort. BMC Public Health, 17(1), 543.
Tucker, C. J. (1979). Red and photographic infrared linear combinations for monitoring vegetation. Vol. 8.
Wang, J., & Rich, P. M. (2008). Relations between NDVI, grassland production, and crop yield in the Central Great Plains. Geocarto International.
Wang, H., & Reiter, J. P. (2012). Multiple imputation for sharing precise geographies in public use data. The Annals of Applied Statistics, 6(1), 229–252.
Warren, J. L., Perez-Heydrich, C., Burgert, C. R., & Emch, M. E. (2016). Influence of demographic and health survey point displacements on distance-based analyses. Spatial Demography, 4(2), 155–173.
Wessels, K. J., Prince, S. D., Frost, P. E., & Van Zyl, D. (2004). Assessing the effects of human-induced land degradation in the former homelands of Northern South Africa with a 1 km AVHRR NDVI time-series. Remote Sensing of Environment, 91(1), 47–67.
Wisely, S. M., Alexander, K., & Cassidy, L. (2018). Linking ecosystem services to livelihoods in southern Africa. Ecosystem Services, 30, 339–341.
Zandbergen, P. A. (2014). Ensuring confidentiality of geocoded health data: Assessing geographic masking strategies for individual-level data. Advances in Medicine, 1–14.
Zhou, F. D., & Louis, T. A. (2010). A smoothing approach for masking spatial data. The Annals of Applied Statistics, 4(3), 1451–1475.
Metadata
Publisher: Springer Netherlands
Print ISSN: 0199-0039
Electronic ISSN: 1573-7810
DOI: https://doi.org/10.1007/s11111-020-00372-4
