2017 | Original Paper | Book Chapter

An Anonymization Method to Improve Data Utility for Classification

Written by: Jianmin Han, Juan Yu, Jianfeng Lu, Hao Peng, Jiandang Wu

Published in: Cyberspace Safety and Security

Publisher: Springer International Publishing

Abstract

k-anonymity is a popular method for preserving privacy in microdata, but it sacrifices data utility to protect individuals’ privacy. How to preserve privacy while retaining high data utility has therefore become a hot topic in k-anonymity research. Existing anonymization methods seldom consider data utility for a specific data mining task. To address this problem, we define a novel attribute weight measurement for determining the generalization order, and propose a new anonymization algorithm based on this measurement that uses global generalization, called the Weighted Full-Domain Anonymization (WFDA) algorithm. The main idea of the algorithm is to generalize attributes with large weights to low levels and attributes with small weights to high levels. The proposed algorithm preserves data utility for classification to a large extent. Experiments show that anonymous data produced by the proposed method retains higher utility, i.e., yields better classification accuracy, than data generated by other anonymization methods.
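
The main idea can be illustrated with a small sketch. This is not the authors' WFDA algorithm: the abstract does not define the weight measurement, so the weights here are simply assumed to be given, and the function names, hierarchies, and sample records are all hypothetical. The sketch greedily raises the generalization level of the lowest-weight attribute first, applying each level to the whole column (full-domain generalization), until every quasi-identifier combination occurs at least k times.

```python
from collections import Counter


def is_k_anonymous(rows, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(r) for r in rows)
    return all(c >= k for c in counts.values())


def generalize(rows, hierarchies, levels):
    """Full-domain generalization: one level per attribute, applied to all rows."""
    return [
        tuple(hierarchies[i][levels[i]](value) for i, value in enumerate(row))
        for row in rows
    ]


def weighted_full_domain(rows, hierarchies, weights, k):
    """Greedy sketch: generalize low-weight attributes before high-weight ones.

    `weights` is assumed to be supplied by the caller (e.g. some measure of how
    useful each attribute is for classification); it is NOT the paper's measure.
    """
    levels = [0] * len(weights)
    # Attribute indices in ascending weight order: least important first.
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    while not is_k_anonymous(generalize(rows, hierarchies, levels), k):
        for i in order:
            if levels[i] < len(hierarchies[i]) - 1:
                levels[i] += 1  # raise the lowest-weight attribute that still can be
                break
        else:
            raise ValueError("k-anonymity cannot be reached with these hierarchies")
    return levels, generalize(rows, hierarchies, levels)


# Hypothetical hierarchies: level 0 = original value, last level = fully suppressed.
age_hierarchy = [
    lambda a: a,
    lambda a: f"{a // 10 * 10}-{a // 10 * 10 + 9}",
    lambda a: "*",
]
zip_hierarchy = [
    lambda z: z,
    lambda z: z[:3] + "**",
    lambda z: "*",
]

# Hypothetical records with quasi-identifiers (age, zip code).
records = [(25, "47677"), (25, "47602"), (34, "47678"), (34, "47605")]
# Assumed weights: age matters more for the classification task than zip code.
levels, anonymized = weighted_full_domain(
    records, [age_hierarchy, zip_hierarchy], weights=[0.8, 0.2], k=2
)
print(levels)
print(anonymized)
```

In this toy run the low-weight zip code is generalized one level while the high-weight age keeps its original values, which mirrors the stated goal of keeping the attributes that matter most for classification as specific as possible.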

Metadata
Title
An Anonymization Method to Improve Data Utility for Classification
Authors
Jianmin Han
Juan Yu
Jianfeng Lu
Hao Peng
Jiandang Wu
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-69471-9_5