13-07-2024

Representing a Model for the Anonymization of Big Data Stream Using In-Memory Processing

Authors: Elham Shamsinejad, Touraj Banirostam, Mir Mohsen Pedram, Amir Masoud Rahmani

Published in: Annals of Data Science

Abstract

In light of the escalating privacy risks in the big data era, this paper introduces an innovative model for the anonymization of big data streams, leveraging in-memory processing within the Spark framework. The approach is founded on the principle of K-anonymity and propels the field forward by critically evaluating various anonymization methods and algorithms, benchmarking their performance with respect to time and space complexities. A distinctive formula for optimized cluster determination in the K-means algorithm is presented, along with a novel tuple expiration time strategy for the efficient purging of clusters. The integration of these components into Spark’s RDD and MLlib modules results in a significant decrease in execution time and data loss rates, even with increasing data volumes. The paper’s notable contributions are its methodological advancements that offer a robust, scalable solution for data anonymization, safeguarding user privacy without sacrificing data utility or processing efficiency.
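The abstract names two mechanisms without giving their details: releasing data only in groups that satisfy K-anonymity, and purging buffered tuples once an expiration time has passed. The following is a minimal sketch of how those two ideas can interact on a stream, not the authors' method. It assumes a fixed generalization rule (10-year age bands, 3-digit ZIP prefix) in place of the paper's K-means clustering, runs in plain Python rather than Spark, and uses hypothetical names throughout (`StreamAnonymizer`, `generalize`, `ttl`).

```python
from collections import defaultdict
from dataclasses import dataclass, field


def generalize(age, zip_code):
    """Map quasi-identifiers to a coarser group key (assumed rule, not from the paper)."""
    low = (age // 10) * 10
    return (f"{low}-{low + 9}", zip_code[:3] + "**")


@dataclass
class StreamAnonymizer:
    k: int        # minimum group size required before release (K-anonymity)
    ttl: float    # hypothetical tuple expiration time, in seconds
    buffers: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, key, record, now):
        """Buffer a record under its group key; release the group once it holds k records."""
        self.buffers[key].append((now, record))
        if len(self.buffers[key]) >= self.k:
            # The group is large enough to publish: emit it and clear the buffer.
            return [rec for _, rec in self.buffers.pop(key)]
        return []

    def purge(self, now):
        """Drop buffered tuples older than ttl; return how many were suppressed."""
        dropped = 0
        for key in list(self.buffers):
            fresh = [(t, rec) for t, rec in self.buffers[key] if now - t <= self.ttl]
            dropped += len(self.buffers[key]) - len(fresh)
            if fresh:
                self.buffers[key] = fresh
            else:
                del self.buffers[key]
        return dropped


anonymizer = StreamAnonymizer(k=2, ttl=5.0)
first = anonymizer.add(generalize(34, "12345"), "flu", now=0.0)
second = anonymizer.add(generalize(38, "12399"), "cold", now=1.0)
anonymizer.add(generalize(61, "99999"), "asthma", now=2.0)
suppressed = anonymizer.purge(now=10.0)
```

In this sketch the first two records share a group key, so the second call releases both together, while the third record never gains a partner and is suppressed by `purge`. Suppressed tuples are one source of the data loss the abstract refers to, which is why the choice of expiration time trades latency against loss.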

Metadata
Title
Representing a Model for the Anonymization of Big Data Stream Using In-Memory Processing
Authors
Elham Shamsinejad
Touraj Banirostam
Mir Mohsen Pedram
Amir Masoud Rahmani
Publication date
13-07-2024
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-024-00556-x
