Skip to main content
Erschienen in: International Journal of Information Security 6/2018

23.10.2017 | Regular Contribution

Optimization algorithm for k-anonymization of datasets with low information loss

verfasst von: Keisuke Murakami, Takeaki Uno

Erschienen in: International Journal of Information Security | Ausgabe 6/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Anonymization is the modification of data to mask the correspondence between a person and sensitive information in the data. Several anonymization models such as k-anonymity have been intensively studied. Recently, a new model with less information loss than existing models was proposed; this is a type of non-homogeneous generalization. In this paper, we present an alternative anonymization algorithm that further reduces the information loss using optimization techniques. We also prove that a modified dataset is checked whether it satisfies the k-anonymity by a polynomial-time algorithm. Computational experiments were conducted and demonstrated the efficiency of our algorithm even on large datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Literatur
1.
Zurück zum Zitat Sacharidis, D., Mouratidis, K., Papadias, D.: k-Anonymity in the presence of external databases. IEEE Trans. Knowl. Data Eng. 22(3), 392–403 (2010)CrossRef Sacharidis, D., Mouratidis, K., Papadias, D.: k-Anonymity in the presence of external databases. IEEE Trans. Knowl. Data Eng. 22(3), 392–403 (2010)CrossRef
2.
Zurück zum Zitat Dalenius, T.: Finding a needle in a haystack or identifying anonymous census record. J. Off. Stat. 2(3), 329–336 (1986) Dalenius, T.: Finding a needle in a haystack or identifying anonymous census record. J. Off. Stat. 2(3), 329–336 (1986)
3.
Zurück zum Zitat Wang, K., Yu, P.S., Chakraborty, S.: Bottom-up generalization: a data mining solution to privacy protection. In: Fourth IEEE International Conference on Data Mining, 2004. ICDM’04, pp. 249–256 (2004) Wang, K., Yu, P.S., Chakraborty, S.: Bottom-up generalization: a data mining solution to privacy protection. In: Fourth IEEE International Conference on Data Mining, 2004. ICDM’04, pp. 249–256 (2004)
4.
Zurück zum Zitat Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information. In: Proceedings of the 17th ACM SIGMOD-SIGACT-SIGART Symposium on the Principles of Database Systems, p. 188 (1998) Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information. In: Proceedings of the 17th ACM SIGMOD-SIGACT-SIGART Symposium on the Principles of Database Systems, p. 188 (1998)
5.
Zurück zum Zitat Fung, B., Wang, K., YU, P.: Top-down specialization for information and privacy preservation. In: Proceedings. 21st International Conference on Data Engineering, 2005. ICDE 2005. IEEE, pp. 205–216 (2005) Fung, B., Wang, K., YU, P.: Top-down specialization for information and privacy preservation. In: Proceedings. 21st International Conference on Data Engineering, 2005. ICDE 2005. IEEE, pp. 205–216 (2005)
6.
Zurück zum Zitat Samarati, P.: Protecting respondants identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)CrossRef Samarati, P.: Protecting respondants identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)CrossRef
7.
Zurück zum Zitat Sun, X., Li, M., Wang, H., Plank, A.: An efficient hash-based algorithm for minimal k-anonymity. In: Proceedings of the Thirty-First Australasian Conference on Computer Science, vol. 74, pp. 101–107 (2008) Sun, X., Li, M., Wang, H., Plank, A.: An efficient hash-based algorithm for minimal k-anonymity. In: Proceedings of the Thirty-First Australasian Conference on Computer Science, vol. 74, pp. 101–107 (2008)
8.
Zurück zum Zitat Samarati, P., Sweeney, L.: Protecting privacy when disclosing information: \({k}\)-anonymity and its enforcement through generalization and suppression. Technical Report SRI-CSL-98–04, SRI Computer Science Laboratory (1998) Samarati, P., Sweeney, L.: Protecting privacy when disclosing information: \({k}\)-anonymity and its enforcement through generalization and suppression. Technical Report SRI-CSL-98–04, SRI Computer Science Laboratory (1998)
9.
Zurück zum Zitat LeFevre, K., DeWitt, D.J., Ramakrishnan, R., Incognito: Efficient full-domain \({k}\)-anonymity. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, vol. 21, pp. 49–60 (2005) LeFevre, K., DeWitt, D.J., Ramakrishnan, R., Incognito: Efficient full-domain \({k}\)-anonymity. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, vol. 21, pp. 49–60 (2005)
10.
Zurück zum Zitat Machanavajjhala, A., Gehrke, J., Kifer, D.: l-Diversity: privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data 1(3), 1–52 (2007) Machanavajjhala, A., Gehrke, J., Kifer, D.: l-Diversity: privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data 1(3), 1–52 (2007)
11.
Zurück zum Zitat Domingo-Ferrer, J.: Microaggregation for database and location privacy. In: Etzion, O., Kuflik, T., Motro, A. (eds.) Next Generation Information Technologies and Systems. Lecture Notes in Computer Science, vol. 4032. Springer, Berlin, Heidelberg, pp. 106–116 (2006)CrossRef Domingo-Ferrer, J.: Microaggregation for database and location privacy. In: Etzion, O., Kuflik, T., Motro, A. (eds.) Next Generation Information Technologies and Systems. Lecture Notes in Computer Science, vol. 4032. Springer, Berlin, Heidelberg, pp. 106–116 (2006)CrossRef
12.
Zurück zum Zitat Campan, A., Truta, T.M., Miller, J., Sinca, R.A.: A clustering approach for achieving data privacy. In: Proceedings of the International Data Mining Conference, pp. 321–330 (2007) Campan, A., Truta, T.M., Miller, J., Sinca, R.A.: A clustering approach for achieving data privacy. In: Proceedings of the International Data Mining Conference, pp. 321–330 (2007)
13.
Zurück zum Zitat Goldberg, A.V., Tarjan, R.E.: Efficient maximum flow algorithms. Commun. ACM 57(8), 82–89 (2014)CrossRef Goldberg, A.V., Tarjan, R.E.: Efficient maximum flow algorithms. Commun. ACM 57(8), 82–89 (2014)CrossRef
14.
Zurück zum Zitat Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Anonymizing tables. In: Proceedings of the 10th International Conference on Database Theory, LNCS, 3363, pp. 246–258 (2005) Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Anonymizing tables. In: Proceedings of the 10th International Conference on Database Theory, LNCS, 3363, pp. 246–258 (2005)
15.
Zurück zum Zitat Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 571–588 (2002)MathSciNetCrossRef Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 571–588 (2002)MathSciNetCrossRef
16.
Zurück zum Zitat Wong, W.K., Mamoulis, N., Cheung, D.W.-L.: Non-homogeneous generalization in privacy preserving data publishing. In: The ACM SIGMOD International Conference on Data Management (SIGMOD), pp. 747–758 (2010) Wong, W.K., Mamoulis, N., Cheung, D.W.-L.: Non-homogeneous generalization in privacy preserving data publishing. In: The ACM SIGMOD International Conference on Data Management (SIGMOD), pp. 747–758 (2010)
17.
Zurück zum Zitat Murakami, K., Uno, T.: A matching model and an algorithm for k-anonymity of large-scale data. In: Proceedings of the 15th Korea-Japan Joint Workshop on Algorithms and Computation, pp. 154–160 (2012) Murakami, K., Uno, T.: A matching model and an algorithm for k-anonymity of large-scale data. In: Proceedings of the 15th Korea-Japan Joint Workshop on Algorithms and Computation, pp. 154–160 (2012)
18.
Zurück zum Zitat Shmueli, E., Tassa, T., Wasserstein, R., Shapira, B., Rokach, L.: Limiting disclosure of sensitive data in sequential releases of databases. Inf. Sci. 191, 98–127 (2012)CrossRef Shmueli, E., Tassa, T., Wasserstein, R., Shapira, B., Rokach, L.: Limiting disclosure of sensitive data in sequential releases of databases. Inf. Sci. 191, 98–127 (2012)CrossRef
19.
Zurück zum Zitat Shmueli, E., Tassa, T.: Privacy by diversity in sequential releases of databases. Inf. Sci. 298, 344–372 (2015)CrossRef Shmueli, E., Tassa, T.: Privacy by diversity in sequential releases of databases. Inf. Sci. 298, 344–372 (2015)CrossRef
20.
Zurück zum Zitat Goldberg, A.V.: An Efficient implementation of a scaling minimum-cost flow algorithm. J. Algorithms 22(1), 1–29 (1997)MathSciNetCrossRef Goldberg, A.V.: An Efficient implementation of a scaling minimum-cost flow algorithm. J. Algorithms 22(1), 1–29 (1997)MathSciNetCrossRef
21.
Zurück zum Zitat Hall, P.: On representatives of subsets. J. Lond. Math. Soc. 10(1), 26–30 (1935)CrossRef Hall, P.: On representatives of subsets. J. Lond. Math. Soc. 10(1), 26–30 (1935)CrossRef
22.
Zurück zum Zitat Goldberg, A.V., Tarjan, R.E.: Finding minimum-cost circulations by canceling negative cycles. J. ACM 36(4), 873–886 (1989)MathSciNetCrossRef Goldberg, A.V., Tarjan, R.E.: Finding minimum-cost circulations by canceling negative cycles. J. ACM 36(4), 873–886 (1989)MathSciNetCrossRef
23.
Zurück zum Zitat Sokkalingam, P.T.: New polynomial-time cycle-canceling algorithms for minimum cost flows. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA 02139, USA (1997) Sokkalingam, P.T.: New polynomial-time cycle-canceling algorithms for minimum cost flows. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA 02139, USA (1997)
24.
Zurück zum Zitat Uno, T.: Multi-sorting algorithm for finding pairs of similar short substrings from large-scale string data. Knowl. Inf. Syst. 25(2), 229–251 (2010)CrossRef Uno, T.: Multi-sorting algorithm for finding pairs of similar short substrings from large-scale string data. Knowl. Inf. Syst. 25(2), 229–251 (2010)CrossRef
Metadaten
Titel
Optimization algorithm for k-anonymization of datasets with low information loss
verfasst von
Keisuke Murakami
Takeaki Uno
Publikationsdatum
23.10.2017
Verlag
Springer Berlin Heidelberg
Erschienen in
International Journal of Information Security / Ausgabe 6/2018
Print ISSN: 1615-5262
Elektronische ISSN: 1615-5270
DOI
https://doi.org/10.1007/s10207-017-0392-y

Weitere Artikel der Ausgabe 6/2018

International Journal of Information Security 6/2018 Zur Ausgabe