Published in: Data Mining and Knowledge Discovery 2/2005

01.09.2005

Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation

Written by: Josep Domingo-Ferrer, Vicenç Torra


Abstract

k-Anonymity is a useful concept for resolving the tension between data utility and respondent privacy in the protection of individual data (microdata). However, the generalization and suppression approach proposed in the literature to achieve k-anonymity is not equally suited to all types of attributes: (i) generalization/suppression is one of the few possibilities for nominal categorical attributes; (ii) it is just one possibility for ordinal categorical attributes, and one that does not always preserve ordinality; and (iii) it is completely unsuitable for continuous attributes, as it causes them to lose their numerical meaning. Since attributes leading to disclosure (and thus needing k-anonymization) may be nominal, ordinal or continuous, it is important to devise k-anonymization procedures that preserve the semantics of each attribute type as much as possible. In this paper we propose categorical microaggregation as an alternative to generalization/suppression for nominal and ordinal k-anonymization; we also propose continuous microaggregation as the method for continuous k-anonymization.
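
To make the idea of continuous microaggregation concrete, the following is a minimal sketch and not the algorithm from the paper, which is not reproduced on this page. It assumes a single continuous key attribute and uses a simple fixed-size heuristic: sort the records, cut them into groups of at least k, and replace each value by its group mean. The function names `microaggregate` and `is_k_anonymous` and the sample `ages` data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def microaggregate(values, k):
    """Replace each value by the mean of its group; every group has >= k members.

    Illustrative univariate heuristic only (an assumption, not the paper's method).
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k records")
    # Sort record indices by value so that neighbouring records are similar.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so it holds between k and 2k-1 records.
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            masked[i] = centroid
    return masked


def is_k_anonymous(masked, k):
    """Check that every released value is shared by at least k records."""
    return all(count >= k for count in Counter(masked).values())


# Hypothetical sample data: one continuous key attribute.
ages = [23, 25, 31, 38, 41, 44, 52, 58]
masked_ages = microaggregate(ages, k=3)
print(masked_ages)
print(is_k_anonymous(masked_ages, 3))
```

Unlike generalization, which would turn each age into an interval, the masked values here remain numbers, so arithmetic on the released attribute is still meaningful. With several key attributes, the grouping would have to be done on whole records rather than on one column at a time.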


Metadata
Title
Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation
Authors
Josep Domingo-Ferrer
Vicenç Torra
Publication date
01.09.2005
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 2/2005
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-005-0007-5
