Published in: Journal of Intelligent Information Systems 2/2009

01-10-2009

(α, k)-anonymous data publishing

Authors: Raymond Wong, Jiuyong Li, Ada Fu, Ke Wang



Abstract

Privacy preservation is an important issue in the release of data for mining purposes. The k-anonymity model was introduced to protect individuals from being identified. Recent studies show that a more sophisticated model is needed to protect the association between individuals and sensitive information. In this paper, we propose an (α, k)-anonymity model that protects both identities and relationships to sensitive information in data. We discuss the properties of the (α, k)-anonymity model and prove that the optimal (α, k)-anonymity problem is NP-hard. We first present an optimal global-recoding method for the (α, k)-anonymity problem. We then propose two local-recoding algorithms that are more scalable and cause less data distortion. Experiments demonstrate their effectiveness and efficiency. We also describe how the model can be extended to more general cases.
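
To make the two requirements in the abstract concrete, the following minimal sketch checks a table against one common reading of (α, k)-anonymity: every group of rows sharing the same quasi-identifier values contains at least k rows, and no sensitive value accounts for more than a fraction α of any group. The abstract does not spell out the formal definition, so this reading, the function name `is_alpha_k_anonymous`, the column names and the sample rows are all illustrative assumptions rather than the authors' formulation.

```python
from collections import Counter, defaultdict


def is_alpha_k_anonymous(rows, quasi_identifiers, sensitive, alpha, k):
    """Check a table (list of dicts) against an assumed (alpha, k) condition.

    Assumption: the frequency bound alpha is applied to every sensitive
    value in every equivalence class.
    """
    # Group the sensitive values by their quasi-identifier combination.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[attr] for attr in quasi_identifiers)
        classes[key].append(row[sensitive])

    for values in classes.values():
        # k-anonymity part: each equivalence class needs at least k rows.
        if len(values) < k:
            return False
        # alpha part: the most frequent sensitive value must not exceed
        # a fraction alpha of the class.
        top_count = Counter(values).most_common(1)[0][1]
        if top_count > alpha * len(values):
            return False
    return True


# Hypothetical, already generalized sample data.
rows = [
    {"postcode": "435*", "age": "30-39", "illness": "flu"},
    {"postcode": "435*", "age": "30-39", "illness": "HIV"},
    {"postcode": "436*", "age": "40-49", "illness": "flu"},
    {"postcode": "436*", "age": "40-49", "illness": "flu"},
]

strict = is_alpha_k_anonymous(rows, ["postcode", "age"], "illness", 0.5, 2)
loose = is_alpha_k_anonymous(rows, ["postcode", "age"], "illness", 1.0, 2)
```

In this sample both groups have two rows, so the table is 2-anonymous, yet the second group consists only of "flu" records. The check with α = 0.5 therefore fails while the check with α = 1.0 passes, which illustrates why a size bound alone does not protect the link to sensitive information.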


Footnotes
1
We use a simplified postcode scheme in this paper. A postcode consists of four single digits that represent, in order, state, region, city and suburb. For example, postcode 4350 reads as state 4, region 3, city 5, suburb 0.
 
2
After sorting, a set of contiguous tuples forms an equivalence class.
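
The four-level postcode scheme in footnote 1 can be generalized by hiding digits from the right, so that each step moves one level up from suburb towards state. The sketch below is illustrative only: the function name `generalize_postcode` and the use of `*` as the masking character are assumptions, not notation taken from the text above.

```python
def generalize_postcode(postcode, level):
    """Mask the last `level` digits of a postcode with '*'.

    level 0 keeps the full postcode (suburb level); level 4 hides
    everything. The '*' mask is an assumed notation.
    """
    if not 0 <= level <= len(postcode):
        raise ValueError("level must be between 0 and the postcode length")
    kept = len(postcode) - level
    return postcode[:kept] + "*" * level


# Walk postcode 4350 up the hierarchy: suburb -> city -> region -> state -> all.
ladder = [generalize_postcode("4350", level) for level in range(5)]
```

Two postcodes that agree on their leading digits become identical once enough trailing digits are masked, which is how records end up in the same equivalence class.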
 
Metadata
Title: (α, k)-anonymous data publishing
Authors: Raymond Wong, Jiuyong Li, Ada Fu, Ke Wang
Publication date: 01-10-2009
Publisher: Springer US
Published in: Journal of Intelligent Information Systems, Issue 2/2009
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI: https://doi.org/10.1007/s10844-008-0075-2
