2014 | OriginalPaper | Book chapter

Beyond Multivariate Microaggregation for Large Record Anonymization

Written by: Jordi Nin

Published in: Citizen in Sensor Networks

Publisher: Springer International Publishing

Abstract

Microaggregation is one of the most commonly employed microdata protection methods. Its basic idea is to anonymize data by partitioning the original records into small groups of at least \(k\) elements and replacing each record with the centroid of its group, thereby preserving \(k\)-anonymity. When records are large, i.e., when the data set has many attributes, the data set is usually split into smaller blocks of attributes to limit information loss, and microaggregation is applied to each block successively and independently. This is called multivariate microaggregation. This technique reduces the information loss incurred when several values are collapsed to the centroid of their group. Unfortunately, with multivariate microaggregation the \(k\)-anonymity property is lost when the intruder knows at least two attributes from different blocks, which may well be the usual case.
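The loss of \(k\)-anonymity across blocks can be illustrated with a minimal sketch. It assumes the simplest possible setting, which the abstract does not prescribe: each block holds a single attribute, groups are formed by sorting and cutting into fixed-size runs of \(k\) values, and the toy data and the names `microaggregate_1d` and `smallest_class` are hypothetical.

```python
from collections import Counter


def microaggregate_1d(values, k):
    """Replace each value by the mean of its group of at least k sorted neighbours.

    Assumption: a simple fixed-size heuristic; the last group absorbs the
    remainder so that every group has at least k members.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        end = len(values) if g == n_groups - 1 else start + k
        group = order[start:end]
        centroid = sum(values[i] for i in group) / len(group)
        for i in group:
            masked[i] = centroid
    return masked


def smallest_class(records):
    """Size of the smallest set of identical masked records."""
    return min(Counter(records).values())


# Hypothetical toy data: six records, two attributes, one attribute per block.
age = [21, 23, 35, 37, 52, 54]
income = [30, 90, 32, 60, 62, 92]
k = 2

masked_age = microaggregate_1d(age, k)
masked_income = microaggregate_1d(income, k)

print(masked_age)                      # [22.0, 22.0, 36.0, 36.0, 53.0, 53.0]
print(smallest_class(masked_age))      # 2
print(smallest_class(masked_income))   # 2
# Combining the two independently masked blocks makes every record unique.
print(smallest_class(list(zip(masked_age, masked_income))))  # 1
```

Each attribute on its own satisfies 2-anonymity, but an intruder who knows both attributes sees six distinct masked records, so the guarantee no longer holds for the combined file.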
In this work, we present a new microaggregation method called one-dimension microaggregation (\(Mic1D-k\)). With \(Mic1D-k\), the problem of \(k\)-anonymity loss is mitigated by mixing all the values of the original microdata file into a single non-attributed data set, using a set of simple pre-processing steps, and then microaggregating all the mixed values together. Our experiments on real data show that our proposal achieves lower disclosure risk than previous approaches while keeping information loss at a comparable level.
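The pooling idea can be sketched as follows. This is not the author's implementation: the abstract does not say which pre-processing steps are used, so the sketch assumes per-attribute standardization (z-scores) to make values comparable, a fixed-size grouping of the sorted pooled values, and de-standardization when writing centroids back. The function name `mic1d_sketch` and the toy table are hypothetical.

```python
from statistics import mean, pstdev


def mic1d_sketch(table, k):
    """Pool all values of a numeric table and microaggregate them together.

    Assumed pre-processing (not taken from the abstract): standardize each
    attribute, then treat every cell as an item of one unattributed list.
    """
    n_rows, n_cols = len(table), len(table[0])
    columns = [[row[j] for row in table] for j in range(n_cols)]
    mus = [mean(col) for col in columns]
    # Guard against constant attributes, whose standard deviation is zero.
    sigmas = [pstdev(col) or 1.0 for col in columns]

    # One pooled list of (standardized value, row index, column index).
    pooled = [
        ((table[i][j] - mus[j]) / sigmas[j], i, j)
        for i in range(n_rows)
        for j in range(n_cols)
    ]
    if k < 1 or len(pooled) < k:
        raise ValueError("need k >= 1 and at least k values in the table")
    pooled.sort(key=lambda item: item[0])

    masked = [[0.0] * n_cols for _ in range(n_rows)]
    n_groups = len(pooled) // k
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so each group has >= k values.
        end = len(pooled) if g == n_groups - 1 else start + k
        group = pooled[start:end]
        centroid = mean(z for z, _, _ in group)
        for _, i, j in group:
            # Map the shared centroid back to the scale of each attribute.
            masked[i][j] = centroid * sigmas[j] + mus[j]
    return masked


# Hypothetical toy table: six records with two attributes (age, income).
table = [[21, 30], [23, 90], [35, 32], [37, 60], [52, 62], [54, 92]]
masked = mic1d_sketch(table, k=3)
for row in masked:
    print(row)
```

Because groups are formed over the pooled values rather than per block, each standardized centroid is shared by at least \(k\) cells regardless of the attribute they came from; whether this matches the published method in detail depends on pre-processing choices that the abstract leaves open.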

Metadata
Title
Beyond Multivariate Microaggregation for Large Record Anonymization
Written by
Jordi Nin
Copyright year
2014
DOI
https://doi.org/10.1007/978-3-319-04178-0_8