Cluster Computing

Published in:

21-08-2017

Detection of duplicated data with minimum overhead and secure data transmission for sensor big data

Authors: S. Beulah, F. Ramesh Dhanaseelan

Published in: Cluster Computing | Special Issue 5/2019

Abstract

Big data refers to data sets that are difficult to handle with traditional data processing applications because of their speed, size and variety. Big data is generated by activities, sensing devices, mobile devices, the Internet, RFID readers and similar sources. One of the key sources of big data is sensor data. A significant amount of sensor data is either redundant or nearly identical, which creates the need for de-duplication of sensor data. Sensor data must also be stored for further processing or analysis, which requires end-to-end security. This paper proposes a lightweight method for detecting similar data using pattern analysis and matching. A distributed encoding process is proposed to provide end-to-end security for the generated data with reduced communication overhead. The data received at the processing server are decoded, analyzed and matched against patterns to remove similar and duplicated data. The results show that the proposed system secures data during transmission using lightweight processes. Duplicated and similar data are detected efficiently by an inline process before the data enter storage. Experimental results are given in support of this concept.
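
The idea of detecting duplicated and similar readings inline, before they reach storage, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the pattern analysis or the encoding scheme, so the class name `InlineDeduplicator`, the SHA-256 fingerprint, and the `tolerance` threshold below are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class InlineDeduplicator:
    """Illustrative inline filter for sensor readings (hypothetical design)."""

    tolerance: float = 0.5  # assumed similarity threshold, not from the paper
    seen_hashes: set = field(default_factory=set)
    last_stored: dict = field(default_factory=dict)  # sensor_id -> last stored value
    stored: list = field(default_factory=list)

    @staticmethod
    def _digest(sensor_id: str, timestamp: int, value: float) -> str:
        # Fingerprint of the full reading, used to spot exact duplicates.
        payload = f"{sensor_id}|{timestamp}|{value!r}".encode()
        return hashlib.sha256(payload).hexdigest()

    def offer(self, sensor_id: str, timestamp: int, value: float) -> str:
        """Classify a reading and store it only if it carries new information."""
        digest = self._digest(sensor_id, timestamp, value)
        if digest in self.seen_hashes:
            return "duplicate"  # exact repeat of an earlier reading
        self.seen_hashes.add(digest)

        previous = self.last_stored.get(sensor_id)
        if previous is not None and abs(value - previous) <= self.tolerance:
            return "similar"  # close to the last stored value for this sensor

        self.last_stored[sensor_id] = value
        self.stored.append((sensor_id, timestamp, value))
        return "stored"


if __name__ == "__main__":
    dedup = InlineDeduplicator(tolerance=0.5)
    for reading in [("s1", 1, 20.0), ("s1", 1, 20.0), ("s1", 2, 20.3), ("s1", 3, 21.0)]:
        print(reading, dedup.offer(*reading))
```

In this sketch each reading is checked as it arrives, so only readings that differ from the last stored value by more than the threshold are written to `stored`. A real system along the lines of the abstract would replace the simple threshold with its pattern analysis and matching step.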

Title: Detection of duplicated data with minimum overhead and secure data transmission for sensor big data
Authors: S. Beulah, F. Ramesh Dhanaseelan
Publication date: 21-08-2017
Publisher: Springer US
Published in: Cluster Computing, Special Issue 5/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-017-1079-x
