
Distributed and Parallel Databases

Publication date: 05-01-2021

Prefetched wald adaptive boost classification based Czekanowski similarity MapReduce for user query processing with bigdata

Authors: S. Tamil Selvan, P. Balamurugan, M. Vijayakumar

Published in: Distributed and Parallel Databases | Issue 4/2021


Abstract

The large volumes of data generated in recent years, together with the rise of big data analytics on social media, call for accurate user query processing with minimal time complexity. Several research works have addressed the accuracy and time complexity of query processing. In this work, the Wald Adaptive Prefetched Boosting Classification based Czekanowski Similarity MapReduce (WAPBC–CSMR) technique is introduced. The WAPBC–CSMR technique uses a big dataset to process a large number of user queries. First, Wald Adaptive Prefetched Boosting is employed to classify the big dataset into different classes. To reduce classification time, a classifier called Gaussian distributive Rocchio is used, which achieves significant classification in minimum time. A Likelihood Ratio Test is then applied to the classified results to integrate the weak learner results into strong classification results. The classified and refined data are then stored in the prefetcher cache. When the prefetch manager receives multi-dimensional user queries, the queries are split into multiple keywords and fed into the map phase, where the mapping function uses the Czekanowski Similarity Index to identify repeated jobs with maximum query processing accuracy. The relevant data are then retrieved from the prefetcher cache, and repeated user query tasks are removed in the reduce phase via a statistical function, which keeps processing time to a minimum. WAPBC–CSMR is evaluated on a big dataset using query processing accuracy, error rate, and processing time as metrics, for varying numbers of user queries. The results show that the WAPBC–CSMR technique improves query processing accuracy and reduces both processing time and error rate compared with conventional methods.
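To make the map and reduce steps described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard quantitative form of the Czekanowski similarity, 2 · Σ min(a_i, b_i) / (Σ a_i + Σ b_i), applied to keyword counts. The function names (`czekanowski`, `map_phase`, `reduce_phase`), the whitespace tokenization, and the 0.8 threshold are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from collections import Counter


def czekanowski(a: Counter, b: Counter) -> float:
    """Czekanowski similarity: 2 * sum(min(a_i, b_i)) / (sum(a_i) + sum(b_i))."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        # Assumption: two empty keyword bags are treated as identical.
        return 1.0
    shared = sum(min(count, b[term]) for term, count in a.items())
    return 2.0 * shared / total


def map_phase(queries):
    """Split each (query_id, text) pair into a bag of lowercase keywords."""
    return [(qid, Counter(text.lower().split())) for qid, text in queries]


def reduce_phase(mapped, threshold=0.8):
    """Keep one representative per group of near-duplicate queries.

    Returns the representative query ids and a dict that maps each
    removed (repeated) query id to the representative it matched.
    """
    representatives = []  # list of (query_id, keyword bag)
    duplicates = {}       # repeated query_id -> representative query_id
    for qid, keywords in mapped:
        for rep_id, rep_keywords in representatives:
            if czekanowski(keywords, rep_keywords) >= threshold:
                duplicates[qid] = rep_id
                break
        else:
            # No sufficiently similar representative was found.
            representatives.append((qid, keywords))
    return [qid for qid, _ in representatives], duplicates


queries = [
    (1, "big data query processing"),
    (2, "query processing big data"),
    (3, "weather in Paris"),
]
kept, removed = reduce_phase(map_phase(queries))
```

In this sketch, queries 1 and 2 contain the same keywords in a different order, so query 2 is treated as a repeat of query 1, while query 3 is kept as a separate task. A real MapReduce deployment would distribute the comparison across workers rather than use the single in-memory loop shown here.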



Title: Prefetched wald adaptive boost classification based Czekanowski similarity MapReduce for user query processing with bigdata
Authors: S. Tamil Selvan, P. Balamurugan, M. Vijayakumar
Publication date: 05-01-2021
Publisher: Springer US
Published in: Distributed and Parallel Databases / Issue 4/2021
Print ISSN: 0926-8782
Electronic ISSN: 1573-7578
DOI: https://doi.org/10.1007/s10619-020-07319-6
