Skip to main content
Erschienen in: Journal of Combinatorial Optimization 2/2021

02.01.2021

Differentially private approximate aggregation based on feature selection

verfasst von: Zaobo He, Akshita Maradapu Vera Venkata Sai, Yan Huang, Daehee seo, Hanzhou Zhang, Qilong Han

Erschienen in: Journal of Combinatorial Optimization | Ausgabe 2/2021

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Privacy-preserving data aggregation is an important problem that has attracted extensive study. The state-of-the-art techniques for solving this problem is differential privacy, which offers a strong privacy guarantee without making strong assumptions about the attacker. However, existing solutions cannot effectively query data aggregation from high-dimensional datasets under differential privacy guarantee. Particularly, when the input dataset contains large number of dimensions, existing solutions must inject large scale of noise into returned aggregates. To address the above issue, this paper proposes an algorithm for querying differentially private approximate aggregates from high-dimensional datasets. Given a dataset D, our algorithm first develops a \(\varepsilon '\)-differentially private feature selection method that is based on a data sampling process over a kd-tree, which allows us to obtain a differentially private low-dimensional dataset with representative instances. After that, our algorithm samples independent samples from the kd-tree aiming at obtaining \((\alpha ', \delta ')\)-approximate aggregates. Finally, a model is proposed to determine the relevance between privacy and utility budgets such that the final aggregate still satisfies the accuracy requirements specified by data consumers. Intuitively, the proposed algorithm circumvents the dilemma of both dimensionality and the height threshold of kd-tree, as it samples a low-dimensional dataset S and queries aggregates from S, instead of the kd-tree. Satisfying user-specified privacy and utility budgets after multiple-stages approximation is significantly challenging, and we presents a novel model to determine the parameters’ relevance.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Bamunu Mudiyanselage TK, Xiao X, Zhang Y, Pan Y (2020) Deep fuzzy neural networks for biomarker selection for accurate cancer detection. IEEE Trans Fuzzy Syst 28(12):3219–3228 Bamunu Mudiyanselage TK, Xiao X, Zhang Y, Pan Y (2020) Deep fuzzy neural networks for biomarker selection for accurate cancer detection. IEEE Trans Fuzzy Syst 28(12):3219–3228
Zurück zum Zitat Cai Z, Zheng X (2018) A private and efficient mechanism for data uploading in smart cyber-physical systems. IEEE Trans Network Sci Eng 7:766–775MathSciNetCrossRef Cai Z, Zheng X (2018) A private and efficient mechanism for data uploading in smart cyber-physical systems. IEEE Trans Network Sci Eng 7:766–775MathSciNetCrossRef
Zurück zum Zitat Cai Z, He Z, Guan X, Li Y (2016) Collective data-sanitization for preventing sensitive information inference attacks in social networks. IEEE Trans Dependable Secure Comput 15(4):577–590 Cai Z, He Z, Guan X, Li Y (2016) Collective data-sanitization for preventing sensitive information inference attacks in social networks. IEEE Trans Dependable Secure Comput 15(4):577–590
Zurück zum Zitat Cai Z, Zheng X, Yu J (2019) A differential-private framework for urban traffic flows estimation via taxi companies. IEEE Trans Ind Inform 15(12):6492–6499CrossRef Cai Z, Zheng X, Yu J (2019) A differential-private framework for urban traffic flows estimation via taxi companies. IEEE Trans Ind Inform 15(12):6492–6499CrossRef
Zurück zum Zitat Cai Z, He Z (2019) Trading private range counting over big iot data. In: 2019 IEEE 39th international conference on distributed computing systems (ICDCS). IEEE, pp 144–153 Cai Z, He Z (2019) Trading private range counting over big iot data. In: 2019 IEEE 39th international conference on distributed computing systems (ICDCS). IEEE, pp 144–153
Zurück zum Zitat Day W-Y, Li N, Lyu M (2016) Publishing graph degree distribution with node differential privacy. In: Proceedings of the 2016 international conference on management of data. ACM, pp 123–138 Day W-Y, Li N, Lyu M (2016) Publishing graph degree distribution with node differential privacy. In: Proceedings of the 2016 international conference on management of data. ACM, pp 123–138
Zurück zum Zitat Dwork C (2006) Differential privacy. In: Automata. Languages and Programming pp 1–12 Dwork C (2006) Differential privacy. In: Automata. Languages and Programming pp 1–12
Zurück zum Zitat Dwork C, McSherry F, Nissim K, Smith A (2006) Calibrating noise to sensitivity in private data analysis. In: Theory of cryptography conference. Springer, pp 265–284 Dwork C, McSherry F, Nissim K, Smith A (2006) Calibrating noise to sensitivity in private data analysis. In: Theory of cryptography conference. Springer, pp 265–284
Zurück zum Zitat Friedman JH, Bentley JL, Finkel RA (1976) An algorithm for finding best matches in logarithmic time. ACM Trans Math Softw 3:209–226CrossRef Friedman JH, Bentley JL, Finkel RA (1976) An algorithm for finding best matches in logarithmic time. ACM Trans Math Softw 3:209–226CrossRef
Zurück zum Zitat He Z, Cai Z, Cheng S, Wang X (2015) Approximate aggregation for tracking quantiles and range countings in wireless sensor networks. Theor Comput Sci 607:381–390MathSciNetCrossRef He Z, Cai Z, Cheng S, Wang X (2015) Approximate aggregation for tracking quantiles and range countings in wireless sensor networks. Theor Comput Sci 607:381–390MathSciNetCrossRef
Zurück zum Zitat He Z, Cai Z, Yu J, Wang X, Sun Y, Li Y (2016) Cost-efficient strategies for restraining rumor spreading in mobile social networks. IEEE Trans Veh Technol 66(3):2789–2800CrossRef He Z, Cai Z, Yu J, Wang X, Sun Y, Li Y (2016) Cost-efficient strategies for restraining rumor spreading in mobile social networks. IEEE Trans Veh Technol 66(3):2789–2800CrossRef
Zurück zum Zitat He Z, Cai Z, Yu J (2017) Latent-data privacy preserving with customized data utility for social network data. IEEE Trans Veh Technol 67(1):665–673CrossRef He Z, Cai Z, Yu J (2017) Latent-data privacy preserving with customized data utility for social network data. IEEE Trans Veh Technol 67(1):665–673CrossRef
Zurück zum Zitat He Z, Li Y, Li J, Li K, Cai Q, Liang Y (2018) Achieving differential privacy of genomic data releasing via belief propagation. Tsinghua Sci Techno 23(4):389–395CrossRef He Z, Li Y, Li J, Li K, Cai Q, Liang Y (2018) Achieving differential privacy of genomic data releasing via belief propagation. Tsinghua Sci Techno 23(4):389–395CrossRef
Zurück zum Zitat Kasiviswanathan SP, Lee HK, Nissim K, Raskhodnikova S, Smith A (2011) What can we learn privately? SIAM J Comput 40(3):793–826MathSciNetCrossRef Kasiviswanathan SP, Lee HK, Nissim K, Raskhodnikova S, Smith A (2011) What can we learn privately? SIAM J Comput 40(3):793–826MathSciNetCrossRef
Zurück zum Zitat Lei J (2011) Differentially private m-estimators. In: Advances in neural information processing systems, pp 361–369 Lei J (2011) Differentially private m-estimators. In: Advances in neural information processing systems, pp 361–369
Zurück zum Zitat Li G, Peng S, Wang C, Niu J, Yuan Y (2019) An energy-efficient data collection scheme using denoising autoencoder in wireless sensor networks. Tsinghua Sci Technol 24(1):86–96CrossRef Li G, Peng S, Wang C, Niu J, Yuan Y (2019) An energy-efficient data collection scheme using denoising autoencoder in wireless sensor networks. Tsinghua Sci Technol 24(1):86–96CrossRef
Zurück zum Zitat Li M, Wang H, Li J (2020) Mining conditional functional dependency rules on big data. Big Data Mining Anal 3(1):68–84CrossRef Li M, Wang H, Li J (2020) Mining conditional functional dependency rules on big data. Big Data Mining Anal 3(1):68–84CrossRef
Zurück zum Zitat Liu L, Chen X, Lu Z, Wang L, Wen X (2019) Mobile-edge computing framework with data compression for wireless network in energy internet. Tsinghua Sci Technol 24(3):271–280CrossRef Liu L, Chen X, Lu Z, Wang L, Wen X (2019) Mobile-edge computing framework with data compression for wireless network in energy internet. Tsinghua Sci Technol 24(3):271–280CrossRef
Zurück zum Zitat Liu J, Wang L, Liu J (2019) Efficient preference clustering via random fourier features. Big Data Min Anal 2(3):195–204CrossRef Liu J, Wang L, Liu J (2019) Efficient preference clustering via random fourier features. Big Data Min Anal 2(3):195–204CrossRef
Zurück zum Zitat McSherry F, Talwar K (2007) Mechanism design via differential privacy. In: 48th annual IEEE symposium on foundations of computer science (FOCS’07). IEEE, pp 94–103 McSherry F, Talwar K (2007) Mechanism design via differential privacy. In: 48th annual IEEE symposium on foundations of computer science (FOCS’07). IEEE, pp 94–103
Zurück zum Zitat Su D, Cao J, Li N, Bertino E, Jin H (2016) Differentially private k-means clustering. In: Proceedings of the sixth ACM conference on data and application security and privacy. ACM, pp 26–37 Su D, Cao J, Li N, Bertino E, Jin H (2016) Differentially private k-means clustering. In: Proceedings of the sixth ACM conference on data and application security and privacy. ACM, pp 26–37
Zurück zum Zitat Wu W, Yu Z, He J (2019) A semi-supervised deep network embedding approach based on the neighborhood structure. Big Data Min Anal 2(3):205–216CrossRef Wu W, Yu Z, He J (2019) A semi-supervised deep network embedding approach based on the neighborhood structure. Big Data Min Anal 2(3):205–216CrossRef
Zurück zum Zitat Zaki MJ, Pan Y (2002) Introduction: recent developments in parallel and distributed data mining. Distrib Parallel Databases 11(2):123–127 Zaki MJ, Pan Y (2002) Introduction: recent developments in parallel and distributed data mining. Distrib Parallel Databases 11(2):123–127
Zurück zum Zitat Zhang J, Cormode G, Procopiuc CM, Srivastava D, Xiao X (2017) Privbayes: private data release via Bayesian networks. ACM Trans Database Syst 42(4):25MathSciNetCrossRef Zhang J, Cormode G, Procopiuc CM, Srivastava D, Xiao X (2017) Privbayes: private data release via Bayesian networks. ACM Trans Database Syst 42(4):25MathSciNetCrossRef
Zurück zum Zitat Zhang H, Wang H, Li J, Gao H (2018) A generic data analytics system for manufacturing production. Big Data Min Anal 1(2):160–171CrossRef Zhang H, Wang H, Li J, Gao H (2018) A generic data analytics system for manufacturing production. Big Data Min Anal 1(2):160–171CrossRef
Zurück zum Zitat Zhang J, Xiao X, Xie X (2016) Privtree: a differentially private algorithm for hierarchical decompositions. In: Proceedings of the 2016 international conference on management of data, pp 155–170 Zhang J, Xiao X, Xie X (2016) Privtree: a differentially private algorithm for hierarchical decompositions. In: Proceedings of the 2016 international conference on management of data, pp 155–170
Zurück zum Zitat Zheng X, Cai Z (2020) Privacy-preserved data sharing towards multiple parties in industrial Io Ts. IEEE J Sel Areas Commun 38:968–979CrossRef Zheng X, Cai Z (2020) Privacy-preserved data sharing towards multiple parties in industrial Io Ts. IEEE J Sel Areas Commun 38:968–979CrossRef
Zurück zum Zitat Zheng X, Cai Z, Yu J, Wang C, Li Y (2017) Follow but no track: privacy preserved profile publishing in cyber-physical social systems. IEEE Internet Things J 4(6):1868–1878CrossRef Zheng X, Cai Z, Yu J, Wang C, Li Y (2017) Follow but no track: privacy preserved profile publishing in cyber-physical social systems. IEEE Internet Things J 4(6):1868–1878CrossRef
Metadaten
Titel
Differentially private approximate aggregation based on feature selection
verfasst von
Zaobo He
Akshita Maradapu Vera Venkata Sai
Yan Huang
Daehee seo
Hanzhou Zhang
Qilong Han
Publikationsdatum
02.01.2021
Verlag
Springer US
Erschienen in
Journal of Combinatorial Optimization / Ausgabe 2/2021
Print ISSN: 1382-6905
Elektronische ISSN: 1573-2886
DOI
https://doi.org/10.1007/s10878-020-00666-1

Weitere Artikel der Ausgabe 2/2021

Journal of Combinatorial Optimization 2/2021 Zur Ausgabe