ABSTRACT
The previous literature on privacy-preserving data publication has focused on "one-time" releases. Specifically, none of the existing solutions supports re-publication of the microdata after it has been updated with insertions and deletions. This is a serious drawback, because a publisher currently cannot continuously provide researchers with the most recent dataset.
This paper remedies the drawback. First, we reveal the characteristics of the re-publication problem that invalidate the conventional approaches leveraging k-anonymity and l-diversity. Based on rigorous theoretical analysis, we develop a new generalization principle, m-invariance, that effectively limits the risk of privacy disclosure in re-publication. We accompany the principle with an algorithm that computes privacy-guarded relations permitting retrieval of accurate aggregate information about the original microdata. Our theoretical results are confirmed by extensive experiments with real data.
Index Terms
- M-invariance: towards privacy preserving re-publication of dynamic datasets