Skip to main content
Top
Published in: Soft Computing 5/2011

01-05-2011 | Focus

A cost-efficient and versatile sanitizing algorithm by using a greedy approach

Authors: Chieh-Ming Wu, Yin-Fu Huang

Published in: Soft Computing | Issue 5/2011

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

In a very large database, there exists sensitive information that must be protected against unauthorized accesses. The confidentiality protection of the information has been a long-term goal pursued by the database security research community and the government statistical agencies. In this paper, we proposed greedy methods for hiding sensitive rules. The experimental results showed the effectiveness of our approaches in terms of undesired side effects avoided in the rule hiding process. The results also revealed that in most cases, all the sensitive rules are hidden without generating spurious rules. First, the good scalability of our approach in terms of database sizes was achieved by using an efficient data structure, FCET, to store only maximal frequent itemsets instead of storing all frequent itemsets. Furthermore, we also proposed a new framework for enforcing the privacy in mining association rules. In the framework, we combined the techniques of efficiently hiding sensitive rules with the transaction retrieval engine based on the FCET index tree. For hiding sensitive rules, the proposed greedy approach includes a greedy approximation algorithm and a greedy exhausted algorithm to sanitize the database. In particular, we presented four strategies in the sanitizing procedure and four strategies in the exposed procedure, respectively, for hiding a group of association rules characterized as sensitive or artificial rules. In addition, the exposed procedure would expose missing rules during the processing so that the number of missing rules could be lowered as much as possible.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
go back to reference Agrawal D, Aggarwal CC (2001) On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, pp 247–255 Agrawal D, Aggarwal CC (2001) On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, pp 247–255
go back to reference Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on very large data bases, pp 487–499 Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on very large data bases, pp 487–499
go back to reference Agrawal R, Srikant R (2000) Privacy-preserving data mining. ACM SIGMOD Rec 29(2):439–450CrossRef Agrawal R, Srikant R (2000) Privacy-preserving data mining. ACM SIGMOD Rec 29(2):439–450CrossRef
go back to reference Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of ACM international conference on management of data, pp 207–216 Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of ACM international conference on management of data, pp 207–216
go back to reference Atallah M, Bertino E, Elmagarmid A (1999) Disclosure limitation of sensitive rules. In: Proceedings of IEEE knowledge and data engineering workshop, pp 45–52 Atallah M, Bertino E, Elmagarmid A (1999) Disclosure limitation of sensitive rules. In: Proceedings of IEEE knowledge and data engineering workshop, pp 45–52
go back to reference Burdick D, Calimlim M, Gehrke J (2001) MAFIA: a maximal frequent itemset algorithm for transactional databases. In: Proceedings of the 17th international conference on data engineering, pp 443–452 Burdick D, Calimlim M, Gehrke J (2001) MAFIA: a maximal frequent itemset algorithm for transactional databases. In: Proceedings of the 17th international conference on data engineering, pp 443–452
go back to reference Fu WC, Wong RCW, Wang K (2005) Privacy-preserving frequent pattern mining across private databases. In: Proceedings of the 5th IEEE international conference on data mining, pp 613–616 Fu WC, Wong RCW, Wang K (2005) Privacy-preserving frequent pattern mining across private databases. In: Proceedings of the 5th IEEE international conference on data mining, pp 613–616
go back to reference Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of ACM international conference on management of data, pp 1–12 Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of ACM international conference on management of data, pp 1–12
go back to reference Huang YF, Wu CM (2002) Mining generalized association rules using pruning techniques. In: Proceedings of IEEE international conference on data mining, pp 227–234 Huang YF, Wu CM (2002) Mining generalized association rules using pruning techniques. In: Proceedings of IEEE international conference on data mining, pp 227–234
go back to reference Lee G, Chang CY, Chen A (2004) Hiding sensitive patterns in association rules mining. In: Proceedings of 28th annual international computer software and applications conference, pp 424–429 Lee G, Chang CY, Chen A (2004) Hiding sensitive patterns in association rules mining. In: Proceedings of 28th annual international computer software and applications conference, pp 424–429
go back to reference Oliveira SRM, Zaiane OR (2002) Privacy preserving frequent itemset mining. In: Proceedings of the IEEE ICDM workshop on privacy, security, and data mining, pp 43–54 Oliveira SRM, Zaiane OR (2002) Privacy preserving frequent itemset mining. In: Proceedings of the IEEE ICDM workshop on privacy, security, and data mining, pp 43–54
go back to reference Oliveira SRM, Zaiane OR (2003) Protecting sensitive knowledge by data sanitization. In: Proceedings of the 3rd IEEE international conference on data mining, pp 613–616 Oliveira SRM, Zaiane OR (2003) Protecting sensitive knowledge by data sanitization. In: Proceedings of the 3rd IEEE international conference on data mining, pp 613–616
go back to reference Oliveira SRM, Zaiane OR (2003) Algorithms for balancing privacy and knowledge discovery in association rule mining. In: Proceedings of the 7th international database engineering and applications symposium, pp 54–63 Oliveira SRM, Zaiane OR (2003) Algorithms for balancing privacy and knowledge discovery in association rule mining. In: Proceedings of the 7th international database engineering and applications symposium, pp 54–63
go back to reference Park JS, Chen MS, Yu PS (1995) An effective hash-based algorithm for mining association rules. In: Proceedings of ACM international conference on management of data, pp 175–186 Park JS, Chen MS, Yu PS (1995) An effective hash-based algorithm for mining association rules. In: Proceedings of ACM international conference on management of data, pp 175–186
go back to reference Rizvi SJ, Haritsa JR (2002) Maintaining data privacy in association rule mining. In: Proceedings of international conference on very large data bases, pp 682–693 Rizvi SJ, Haritsa JR (2002) Maintaining data privacy in association rule mining. In: Proceedings of international conference on very large data bases, pp 682–693
go back to reference Saygin Y, Verykios VS, Clifton C (2001) Using unknowns to prevent discovery of association rules. ACM SIGMOD Rec 30(4):45–54CrossRef Saygin Y, Verykios VS, Clifton C (2001) Using unknowns to prevent discovery of association rules. ACM SIGMOD Rec 30(4):45–54CrossRef
go back to reference Shenoy P, Haritsa J, Sudarshan S, Bhalotia G, Bawa M, Shah D (2000) Turbo-charging vertical mining of large databases. In: Proceedings of ACM international conference on management of data, pp 22–33 Shenoy P, Haritsa J, Sudarshan S, Bhalotia G, Bawa M, Shah D (2000) Turbo-charging vertical mining of large databases. In: Proceedings of ACM international conference on management of data, pp 22–33
go back to reference Verykios VS, Elmagarmid AK, Bertino E, Saygin Y, Dasseni E (2004) Association rule hiding. IEEE Trans Knowl Data Eng 16(4):434–447CrossRef Verykios VS, Elmagarmid AK, Bertino E, Saygin Y, Dasseni E (2004) Association rule hiding. IEEE Trans Knowl Data Eng 16(4):434–447CrossRef
go back to reference Wang SL, Jafari A (2005) Hiding sensitive predictive association rules. In: Proceedings of IEEE international conference on systems, man, and cybernetics, vol 1, pp 164–169 Wang SL, Jafari A (2005) Hiding sensitive predictive association rules. In: Proceedings of IEEE international conference on systems, man, and cybernetics, vol 1, pp 164–169
go back to reference Wu CM, Huang YF (2008) An efficient data structure for mining generalized association rules. In: Proceedings of the 5th international conference on fuzzy systems and knowledge discovery, vol 2, pp 565–571 Wu CM, Huang YF (2008) An efficient data structure for mining generalized association rules. In: Proceedings of the 5th international conference on fuzzy systems and knowledge discovery, vol 2, pp 565–571
go back to reference Wu Y-H, Chiang C-M, Chen ALP (2007) Hiding sensitive association rules with limited side effects. IEEE Trans Knowl Data Eng 19(1):29–42CrossRef Wu Y-H, Chiang C-M, Chen ALP (2007) Hiding sensitive association rules with limited side effects. IEEE Trans Knowl Data Eng 19(1):29–42CrossRef
go back to reference Wu CM, Huang YF, Chen JY (2009) Privacy preserving association rules by using greedy approach. In: Proceedings of the World Congress on computer science and information engineering, vol 4, pp 61–65 Wu CM, Huang YF, Chen JY (2009) Privacy preserving association rules by using greedy approach. In: Proceedings of the World Congress on computer science and information engineering, vol 4, pp 61–65
go back to reference Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data Eng 17(4):462–478CrossRef Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data Eng 17(4):462–478CrossRef
go back to reference Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. In: Proceedings of the 3rd international conference on knowledge discovery in databases and data mining, pp 283–286 Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. In: Proceedings of the 3rd international conference on knowledge discovery in databases and data mining, pp 283–286
Metadata
Title
A cost-efficient and versatile sanitizing algorithm by using a greedy approach
Authors
Chieh-Ming Wu
Yin-Fu Huang
Publication date
01-05-2011
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 5/2011
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-010-0549-3

Other articles of this Issue 5/2011

Soft Computing 5/2011 Go to the issue

Premium Partner