DOI: 10.1145/1557019.1557157
research-article

Anonymizing healthcare data: a case study on the blood transfusion service

Published: 28 June 2009

ABSTRACT

Sharing healthcare data has become a vital requirement in healthcare system management; however, inappropriate sharing and use of healthcare data could threaten patients' privacy. In this paper, we study the privacy concerns of the blood transfusion information-sharing system between the Hong Kong Red Cross Blood Transfusion Service (BTS) and public hospitals, and identify the major challenges that make traditional data anonymization methods inapplicable. Furthermore, we propose a new privacy model called LKC-privacy, together with an anonymization algorithm, to meet the privacy and information requirements in this BTS case. Experiments on real-life data demonstrate that our anonymization algorithm can effectively retain the essential information in anonymous data for data analysis and is scalable for anonymizing large datasets.
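
The abstract names LKC-privacy without defining it. As a rough illustration only, and assuming the definition commonly given for the model (every combination of at most L quasi-identifier values that occurs in the table must be shared by at least K records, and must not let an adversary infer any sensitive value with confidence above C), a check could be sketched as below. The function name `satisfies_lkc`, the attribute names, and the toy records are hypothetical; they are not taken from the paper or from the BTS data, and this brute-force check is not the anonymization algorithm the authors propose.

```python
from collections import defaultdict
from itertools import combinations


def satisfies_lkc(records, qid_attrs, sensitive_attr, sensitive_values, L, K, C):
    """Brute-force check of the assumed LKC-privacy condition.

    For every subset of at most L quasi-identifier attributes, group the
    records by their values on that subset. Each group that occurs must
    contain at least K records, and the fraction of records in the group
    carrying any value in `sensitive_values` must not exceed C.
    """
    for size in range(1, min(L, len(qid_attrs)) + 1):
        for attrs in combinations(qid_attrs, size):
            # Map each observed value combination to its sensitive values.
            groups = defaultdict(list)
            for rec in records:
                key = tuple(rec[a] for a in attrs)
                groups[key].append(rec[sensitive_attr])
            for values in groups.values():
                if len(values) < K:
                    return False  # group too small: identity linkage risk
                for s in sensitive_values:
                    if values.count(s) / len(values) > C:
                        return False  # inference confidence too high
    return True


# Hypothetical toy table (illustrative values only).
records = [
    {"job": "Professional", "sex": "M", "surgery": "Plastic"},
    {"job": "Professional", "sex": "M", "surgery": "Vascular"},
    {"job": "Professional", "sex": "F", "surgery": "Vascular"},
    {"job": "Professional", "sex": "F", "surgery": "Urology"},
    {"job": "Manual", "sex": "M", "surgery": "Transgender"},
    {"job": "Manual", "sex": "M", "surgery": "Urology"},
]
qid = ["job", "sex"]
sensitive = {"Transgender"}

print(satisfies_lkc(records, qid, "surgery", sensitive, L=2, K=2, C=0.5))  # True
print(satisfies_lkc(records, qid, "surgery", sensitive, L=2, K=3, C=0.5))  # False
print(satisfies_lkc(records, qid, "surgery", sensitive, L=2, K=2, C=0.4))  # False
```

In this toy table the group with job "Manual" has two records, one of them sensitive, so raising K to 3 or lowering C below 0.5 makes the check fail. Bounding the adversary's knowledge by L keeps the number of attribute subsets to test far smaller than checking every full quasi-identifier combination.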

Published in

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019

Copyright © 2009 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
