Published in: Artificial Intelligence Review 2/2020

1 February 2019

The state of the art and taxonomy of big data analytics: view from new big data framework

Authors: Azlinah Mohamed, Maryam Khanian Najafabadi, Yap Bee Wah, Ezzatul Akmal Kamaru Zaman, Ruhaila Maskat

Abstract

Big data has become a significant research area because of the enormous volume of data generated by sources such as social media, the Internet of Things and multimedia applications. It plays a critical role in decision making and forecasting in domains such as recommendation systems, business analysis, healthcare, web display advertising, clinical practice, transportation, fraud detection and tourism marketing. The rapid development of big data tools such as Hadoop, Storm, Spark, Flink, Kafka and Pig in the research and industrial communities has allowed huge amounts of data to be distributed, communicated and processed. Big data applications use big data analytics techniques to analyze large amounts of data efficiently. However, choosing suitable big data tools, based on batch and stream data processing, and suitable analytics techniques for developing a big data system is difficult because of the challenges in processing and applying big data. Practitioners and researchers who develop big data systems have inadequate information about current technology and about the requirements of the big data platform. Hence, the strengths and weaknesses of big data technologies and effective solutions to big data challenges need to be discussed. This paper therefore presents a review of the literature that analyzes the use of big data tools and big data analytics techniques in areas such as health and medical care, social networking and the internet, government and the public sector, natural resource management, and the economic and business sector. The goals of this paper are to (1) understand the trend of big data-related research and the current frameworks of big data technologies; (2) identify trends in the use of, or research on, big data tools based on batch and stream processing and big data analytics techniques; and (3) help new researchers and practitioners position new research activity in this domain appropriately. The findings of this study provide insights into existing big data platforms and their application domains, the advantages and disadvantages of big data tools, big data analytics techniques and their use, and new research opportunities in the future development of big data systems.

Zurück zum Zitat Wang B, Huang S, Qiu J, Liu Y, Wang G (2015) Parallel online sequential extreme learning machine based on MapReduce. Neurocomputing 149:224–232 Wang B, Huang S, Qiu J, Liu Y, Wang G (2015) Parallel online sequential extreme learning machine based on MapReduce. Neurocomputing 149:224–232
Zurück zum Zitat Wang H, Xu Z, Fujita H, Liu S (2016) Towards felicitous decision making: an overview on challenges and trends of big data. Inf Sci 367:747–765 Wang H, Xu Z, Fujita H, Liu S (2016) Towards felicitous decision making: an overview on challenges and trends of big data. Inf Sci 367:747–765
Zurück zum Zitat Wang J, He C, Liu Y, Tian G, Peng I, Xing J, Wang FL (2017a) Efficient alarm behavior analytics for telecom networks. Inf Sci 402:1–14 Wang J, He C, Liu Y, Tian G, Peng I, Xing J, Wang FL (2017a) Efficient alarm behavior analytics for telecom networks. Inf Sci 402:1–14
Zurück zum Zitat Wang, Y., Geng, S., & Gao, H. (2017). A proactive decision support method based on deep reinforcement learning and state partition. Knowl-Based Syst Wang, Y., Geng, S., & Gao, H. (2017). A proactive decision support method based on deep reinforcement learning and state partition. Knowl-Based Syst
Zurück zum Zitat Xia Y, Chen J, Lu X, Wang C, Xu C (2016) Big traffic data processing framework for intelligent monitoring and recording systems. Neurocomputing 181:139–146 Xia Y, Chen J, Lu X, Wang C, Xu C (2016) Big traffic data processing framework for intelligent monitoring and recording systems. Neurocomputing 181:139–146
Zurück zum Zitat Yuan J, Chen M, Jiang T, Li T (2017) Complete tolerance relation based parallel filling for incomplete energy big data. Knowl-Based Syst 132:215–225 Yuan J, Chen M, Jiang T, Li T (2017) Complete tolerance relation based parallel filling for incomplete energy big data. Knowl-Based Syst 132:215–225
Zurück zum Zitat Zhang F, Cao J, Khan SU, Li K, Hwang K (2015) A task-level adaptive MapReduce framework for real-time streaming data in healthcare applications. Future Gener Comput Syst 43:149–160 Zhang F, Cao J, Khan SU, Li K, Hwang K (2015) A task-level adaptive MapReduce framework for real-time streaming data in healthcare applications. Future Gener Comput Syst 43:149–160
Zurück zum Zitat Zhang CY, Chen CP, Chen D, Ng KT (2016) MapReduce based distributed learning algorithm for restricted Boltzmann machine. Neurocomputing 198:4–11 Zhang CY, Chen CP, Chen D, Ng KT (2016) MapReduce based distributed learning algorithm for restricted Boltzmann machine. Neurocomputing 198:4–11
Metadata
Title
The state of the art and taxonomy of big data analytics: view from new big data framework
Authors
Azlinah Mohamed
Maryam Khanian Najafabadi
Yap Bee Wah
Ezzatul Akmal Kamaru Zaman
Ruhaila Maskat
Publication date
1 February 2019
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 2/2020
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-019-09685-9