Article

M-invariance: towards privacy preserving re-publication of dynamic datasets

Authors:
Xiaokui Xiao, Chinese University of Hong Kong, Hong Kong
Yufei Tao, Chinese University of Hong Kong, Hong Kong

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, June 2007, Pages 689–700. https://doi.org/10.1145/1247480.1247556

Published: 11 June 2007

ABSTRACT

The previous literature of privacy preserving data publication has focused on performing "one-time" releases. Specifically, none of the existing solutions supports re-publication of the microdata, after it has been updated with insertions and deletions. This is a serious drawback, because currently a publisher cannot provide researchers with the most recent dataset continuously.

This paper remedies the drawback. First, we reveal the characteristics of the re-publication problem that invalidate the conventional approaches leveraging k-anonymity and l-diversity. Based on rigorous theoretical analysis, we develop a new generalization principle m-invariance that effectively limits the risk of privacy disclosure in re-publication. We accompany the principle with an algorithm, which computes privacy-guarded relations that permit retrieval of accurate aggregate information about the original microdata. Our theoretical results are confirmed by extensive experiments with real data.
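The abstract names the m-invariance principle without defining it. As a rough illustration, the sketch below checks the condition as it is usually stated from the body of the paper: every quasi-identifier (QI) group in every release holds at least m tuples with pairwise distinct sensitive values, and any tuple that appears in several releases is always hosted by groups with the same set of sensitive values (its "signature"). The data layout, the function names `is_m_unique` and `is_m_invariant`, and the toy records are assumptions made for this example; it only verifies a given sequence of releases and does not attempt the authors' generalization algorithm.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed layout: a release maps a group id to the records of that QI group,
# and each record is (tuple_id, sensitive_value).
Record = Tuple[str, str]
Release = Dict[str, List[Record]]


def is_m_unique(release: Release, m: int) -> bool:
    """Every group has at least m records, all with distinct sensitive values."""
    for records in release.values():
        values = [value for _, value in records]
        if len(values) < m or len(set(values)) != len(values):
            return False
    return True


def is_m_invariant(releases: List[Release], m: int) -> bool:
    """Every release is m-unique and each tuple keeps one signature throughout."""
    if not all(is_m_unique(release, m) for release in releases):
        return False
    signature_of: Dict[str, FrozenSet[str]] = {}
    for release in releases:
        for records in release.values():
            signature = frozenset(value for _, value in records)
            for tuple_id, _ in records:
                # Record the first signature seen; any later mismatch fails.
                if signature_of.setdefault(tuple_id, signature) != signature:
                    return False
    return True


# Toy data (hypothetical): alice is deleted and eve inserted between releases.
release_1 = {
    "g1": [("bob", "dyspepsia"), ("alice", "bronchitis")],
    "g2": [("carol", "flu"), ("dave", "gastritis")],
}
release_2_ok = {
    "g1": [("bob", "dyspepsia"), ("eve", "bronchitis")],
    "g2": [("carol", "flu"), ("dave", "gastritis")],
}
release_2_bad = {
    "g1": [("bob", "dyspepsia"), ("carol", "flu")],
    "g2": [("dave", "gastritis"), ("eve", "bronchitis")],
}

print(is_m_invariant([release_1, release_2_ok], m=2))
print(is_m_invariant([release_1, release_2_bad], m=2))
```

In the second pairing each release is 2-unique on its own, yet bob's group changes from {dyspepsia, bronchitis} to {dyspepsia, flu}. Intersecting the two signatures leaves only dyspepsia, which is the kind of cross-release inference the principle is meant to rule out.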

Index Terms

M-invariance: towards privacy preserving re-publication of dynamic datasets
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published in
SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
June 2007
1210 pages
ISBN:9781595936868
DOI:10.1145/1247480
General Chairs: Lizhu Zhou (Tsinghua University, China), Tok Wang Ling (National University of Singapore, Singapore)
Program Chair: Beng Chin Ooi (National University of Singapore, Singapore)
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
m-invariance
generalization
privacy
Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%