Published in: Artificial Intelligence Review 4/2015

1 December 2015

A survey: deriving private information from perturbed data

Authors: Burcu D. Okkalioglu, Murat Okkalioglu, Mehmet Koc, Huseyin Polat

Abstract

Privacy-preserving data mining has attracted the attention of a large number of researchers. Many data perturbation methods have been proposed to ensure individual privacy, and such methods seem to be successful in providing both privacy and accuracy. On the one hand, different methods are utilized to preserve privacy; on the other hand, various data reconstruction approaches have been proposed to derive private information from perturbed data. Consequently, many researchers have studied data reconstruction methods and the resilience of data perturbation schemes. In this survey, we focus on data reconstruction methods because of their importance in privacy-preserving data mining. We provide a detailed review of the data reconstruction methods and of the data perturbation schemes that they attack. We complement our review with the evaluation metrics and the data sets used in current attack techniques. Finally, we pose some open questions to provide a better understanding of these approaches and to guide future study.

Metadata
Title
A survey: deriving private information from perturbed data
Authors
Burcu D. Okkalioglu
Murat Okkalioglu
Mehmet Koc
Huseyin Polat
Publication date
1 December 2015
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 4/2015
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-015-9439-5
