Published in: Journal of Intelligent Information Systems 2/2009

01.10.2009

(α, k)-anonymous data publishing

Written by: Raymond Wong, Jiuyong Li, Ada Fu, Ke Wang

Abstract

Privacy preservation is an important issue in the release of data for mining purposes. The k-anonymity model was introduced to protect individuals from identification. Recent studies show that a more sophisticated model is necessary to protect the association of individuals with sensitive information. In this paper, we propose an (α, k)-anonymity model that protects both identities and relationships to sensitive information in data. We discuss the properties of the (α, k)-anonymity model and prove that the optimal (α, k)-anonymity problem is NP-hard. We first present an optimal global-recoding method for the (α, k)-anonymity problem. Next, we propose two local-recoding algorithms that are both more scalable and result in less data distortion. Experiments demonstrate their effectiveness and efficiency. We also describe how the model can be extended to more general cases.
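
To make the model concrete, the following is a minimal sketch of how one might check a table against an (α, k)-style requirement. It assumes a common reading of the model: every equivalence class (rows sharing the same quasi-identifier values) has at least k rows, and no sensitive value makes up more than a fraction α of any class. The function name `is_alpha_k_anonymous`, the column names, and the toy rows are hypothetical illustrations, not taken from the paper, and the paper's exact formulation may differ (for example, it may bound only one designated sensitive value).

```python
from collections import Counter, defaultdict


def is_alpha_k_anonymous(rows, quasi_identifiers, sensitive, alpha, k):
    """Check an (alpha, k)-style condition on a list of dict rows.

    Assumed reading: each equivalence class has at least k rows, and the
    relative frequency of every sensitive value in a class is at most alpha.
    """
    # Group the sensitive values by the tuple of quasi-identifier values.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[attr] for attr in quasi_identifiers)
        classes[key].append(row[sensitive])

    for values in classes.values():
        # k-anonymity part: the class must contain at least k rows.
        if len(values) < k:
            return False
        # alpha part: the most frequent sensitive value must not dominate.
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) > alpha:
            return False
    return True


# Toy, made-up data with already generalized quasi-identifiers.
rows = [
    {"postcode": "435*", "age": "30-39", "illness": "flu"},
    {"postcode": "435*", "age": "30-39", "illness": "asthma"},
    {"postcode": "435*", "age": "30-39", "illness": "flu"},
    {"postcode": "436*", "age": "40-49", "illness": "asthma"},
    {"postcode": "436*", "age": "40-49", "illness": "asthma"},
]

print(is_alpha_k_anonymous(rows, ["postcode", "age"], "illness", alpha=0.7, k=2))
```

In this toy table both classes satisfy k = 2, but the second class contains a single sensitive value, so the check fails for any α below 1. This illustrates why k-anonymity alone does not protect the link between an individual and sensitive information.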

Footnotes
1
We use a simplified postcode scheme in this paper. There are four single digits, representing states, regions, cities and suburbs. Postcode 4350 indicates state-region-city-suburb.
 
2
After sorting, a set of contiguous tuples forms an equivalence class.
 
Metadata
Title
(α, k)-anonymous data publishing
Written by
Raymond Wong
Jiuyong Li
Ada Fu
Ke Wang
Publication date
01.10.2009
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2009
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-008-0075-2
