ABSTRACT
Extracting association rules helps data owners uncover hidden patterns in their data in order to analyze and predict the behavior of their clients. However, mining association rules in a distributed environment is not a trivial task because of privacy concerns. Data owners are interested in collaborating to mine association rules at a global level, but they are concerned that sensitive information about the individuals in their databases might be compromised during the mining process. In this paper, we formulate and address the problem of answering association rule queries in a distributed environment such that the mining process is confidential and the results are differentially private. We propose a privacy-preserving distributed association rules mining approach, named DARM, in which global strong association rules are determined in a confidential way and the returned results satisfy ε-differential privacy. We conduct experiments on real-life data and show that our approach answers association rule queries efficiently and scales with the number of data records.
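To make the two ingredients named above concrete, the following is a minimal sketch of a differentially private support query over horizontally partitioned data. It is not the DARM protocol: the abstract does not describe the protocol's steps, so every function name here is hypothetical, the Laplace mechanism is assumed as the noise mechanism, and the per-party counts are summed in the clear, whereas a confidential protocol would replace that sum with a secure computation.

```python
import random
from typing import FrozenSet, Optional, Sequence

Transaction = FrozenSet[str]
Partition = Sequence[Transaction]


def local_support(db: Partition, itemset: FrozenSet[str]) -> int:
    """Count the transactions of one data owner that contain the itemset."""
    return sum(1 for t in db if itemset <= t)


def global_support(partitions: Sequence[Partition], itemset: FrozenSet[str]) -> int:
    """Sum local counts across owners (horizontal partitioning).

    Assumption: summed in the clear for illustration only; a confidential
    protocol would use a secure sum instead.
    """
    return sum(local_support(db, itemset) for db in partitions)


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_global_support(partitions: Sequence[Partition], itemset: FrozenSet[str],
                         epsilon: float, rng: random.Random) -> float:
    """Laplace mechanism on a count query.

    Adding or removing one transaction changes the count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return global_support(partitions, itemset) + laplace_noise(1.0 / epsilon, rng)


def noisy_confidence(partitions: Sequence[Partition], lhs: FrozenSet[str],
                     rhs: FrozenSet[str], epsilon: float,
                     rng: random.Random) -> Optional[float]:
    """Estimate the confidence of the rule lhs -> rhs from two noisy counts.

    The budget is split evenly, so the two queries together satisfy
    epsilon-differential privacy by sequential composition.
    """
    half = epsilon / 2.0
    both = noisy_global_support(partitions, lhs | rhs, half, rng)
    base = noisy_global_support(partitions, lhs, half, rng)
    if base <= 0:
        return None  # noise swamped the count; no usable estimate
    return min(1.0, max(0.0, both / base))


if __name__ == "__main__":
    # Two hypothetical data owners holding different rows of the same schema.
    owners = [
        [frozenset({"bread", "milk"}), frozenset({"bread"})],
        [frozenset({"bread", "milk", "eggs"}), frozenset({"eggs"})],
    ]
    rng = random.Random()
    print(noisy_confidence(owners, frozenset({"bread"}), frozenset({"milk"}), 1.0, rng))
```

A smaller ε adds more noise and gives stronger privacy; a rule would be reported as strong only if its noisy support and confidence exceed the chosen thresholds.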
Index Terms
- DARM: a privacy-preserving approach for distributed association rules mining on horizontally-partitioned data