Skip to main content
Top

2018 | OriginalPaper | Chapter

A Comprehensive Survey and Open Challenges of Mining Bigdata

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Bigdata comes into big picture in early 2000, since it becomes focus of researchers and data scientist. Main purpose of research and development in the field of Bigdata is to extract and predicts meaningful information from large amount of structured as well as unstructured real world data. In this paper, systematic review of background, existing related technologies used by various big enterprises, data researchers, government officials has been discussed. In addition, presented standardized complex processes to extract useful information such as data generation, storage, modeling/analysis, visualization and interpretation. Finally discusses open issues, challenges and point out the emerging directions in which researchers can work in the age of Bigdata

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Baldonado, M., Chang, C.-C.K., Gravano, L., Paepcke, A.: The Stanford digital library metadata architecture. Int. J. Digit. Libr. 1, 108–121 (1997)CrossRef Baldonado, M., Chang, C.-C.K., Gravano, L., Paepcke, A.: The Stanford digital library metadata architecture. Int. J. Digit. Libr. 1, 108–121 (1997)CrossRef
2.
go back to reference Lohr, S.: The age of big data. New York Times 11 (2012) Lohr, S.: The age of big data. New York Times 11 (2012)
3.
go back to reference Fan, W., Bifet, A.: Mining big data: current status, and forecast to the future. ACM SIGKDD Explor. Newsl. 14(2), 1–5 (2013)CrossRef Fan, W., Bifet, A.: Mining big data: current status, and forecast to the future. ACM SIGKDD Explor. Newsl. 14(2), 1–5 (2013)CrossRef
4.
go back to reference Alexandros, L., Jagadish, H.V.: Challenges and opportunities with big data. Proc. VLDB Endow. 5(12), 2032–2033 (2012)CrossRef Alexandros, L., Jagadish, H.V.: Challenges and opportunities with big data. Proc. VLDB Endow. 5(12), 2032–2033 (2012)CrossRef
5.
go back to reference Gantz, J., Reinsel, D.: Extracting value from chaos. IDC iView, pp. 1–12 (2011) Gantz, J., Reinsel, D.: Extracting value from chaos. IDC iView, pp. 1–12 (2011)
6.
go back to reference Turner, V., Reinsel, D., Gantz, J.F., Minton, S.: The digital universe of opportunities: rich data and the increasing value of the internet of things. IDC Anal. Future (2014) Turner, V., Reinsel, D., Gantz, J.F., Minton, S.: The digital universe of opportunities: rich data and the increasing value of the internet of things. IDC Anal. Future (2014)
7.
go back to reference Ghemawat, S., Gobioff, H., Leung, S.-T.: The Google file system. ACM SIGOPS Oper. Syst. Rev. 37(5) (2003). ACM Ghemawat, S., Gobioff, H., Leung, S.-T.: The Google file system. ACM SIGOPS Oper. Syst. Rev. 37(5) (2003). ACM
8.
go back to reference Jeffrey, D., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)CrossRef Jeffrey, D., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)CrossRef
9.
go back to reference Chang, F.: Bigtable: a distributed storage system for structured data. ACM Trans. Comput. Syst. 26(2), 4 (2008)CrossRef Chang, F.: Bigtable: a distributed storage system for structured data. ACM Trans. Comput. Syst. 26(2), 4 (2008)CrossRef
10.
go back to reference Győrödi, C., Győrödi, R., Pecherle, G., Olah, A.: A comparative study: MongoDB vs. MySQL. In: 2015 13th International Conference on Engineering of Modern Electric Systems (EMES), Oradea (2015) Győrödi, C., Győrödi, R., Pecherle, G., Olah, A.: A comparative study: MongoDB vs. MySQL. In: 2015 13th International Conference on Engineering of Modern Electric Systems (EMES), Oradea (2015)
11.
go back to reference DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W: Dynamo: amazon’s highly available key-value store. ACM SIGOPS Oper. Syst. Rev. 41(6), 205–220 (2007). ACM DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W: Dynamo: amazon’s highly available key-value store. ACM SIGOPS Oper. Syst. Rev. 41(6), 205–220 (2007). ACM
12.
go back to reference Chen, M., Mao, S., Liu, Y.: Big data: a survey. Mob. Netw. Appl. 19(2), 171–209 (2014)CrossRef Chen, M., Mao, S., Liu, Y.: Big data: a survey. Mob. Netw. Appl. 19(2), 171–209 (2014)CrossRef
13.
go back to reference Herodotou, H., Lim, H., Luo, G., Borisov, N., Dong, L., Cetin, F.B., Babu, S.: Starfish: a self-tuning system for big data analytic. CIDR 11, 261–272 (2011) Herodotou, H., Lim, H., Luo, G., Borisov, N., Dong, L., Cetin, F.B., Babu, S.: Starfish: a self-tuning system for big data analytic. CIDR 11, 261–272 (2011)
14.
go back to reference Nagwani, N.K.: Summarizing large text collection using topic modeling and clustering based on MapReduce framework. J. Big Data 2(1), 1–18 (2015)CrossRef Nagwani, N.K.: Summarizing large text collection using topic modeling and clustering based on MapReduce framework. J. Big Data 2(1), 1–18 (2015)CrossRef
15.
go back to reference Palit, I., Reddy, C.K.: Scalable and parallel boosting with mapreduce. IEEE Trans. Knowl. Data Eng. 24(10), 1904–1916 (2012)CrossRef Palit, I., Reddy, C.K.: Scalable and parallel boosting with mapreduce. IEEE Trans. Knowl. Data Eng. 24(10), 1904–1916 (2012)CrossRef
16.
go back to reference Wu, C.-J., Ku, C.-F., Ho, J.-M., Chen, M.-S.: A novel pipeline approach for efficient big data broadcasting. IEEE Trans. Knowl. Data Eng. 28(1), 17–28 (2016) Wu, C.-J., Ku, C.-F., Ho, J.-M., Chen, M.-S.: A novel pipeline approach for efficient big data broadcasting. IEEE Trans. Knowl. Data Eng. 28(1), 17–28 (2016)
17.
go back to reference Rathore, M.M., Paul, A., Ahmad, A., Rho, S.: Urban planning and building smart cities based on the internet of things using big data analytics. Comput. Netw. (2016) Rathore, M.M., Paul, A., Ahmad, A., Rho, S.: Urban planning and building smart cities based on the internet of things using big data analytics. Comput. Netw. (2016)
18.
go back to reference SAS Institute Inc.: Five big data challenges and how to overcome them with visual analytics. Report, pp. 1–2 (2013) SAS Institute Inc.: Five big data challenges and how to overcome them with visual analytics. Report, pp. 1–2 (2013)
19.
go back to reference Lü, H., Fogarty, J.: Cascaded treemaps: examining the visibility and stability of structure in treemaps. In: Proceedings of Graphics Interface, Toronto, ON, Canada, pp. 259–266 (2014) Lü, H., Fogarty, J.: Cascaded treemaps: examining the visibility and stability of structure in treemaps. In: Proceedings of Graphics Interface, Toronto, ON, Canada, pp. 259–266 (2014)
20.
go back to reference Moens, S., Aksehirli, E., Goethals, B.: Frequent itemset mining for big data. In: IEEE 30th International Conference on Data Engineering, IL, Chicago, pp. 6–9 (2013) Moens, S., Aksehirli, E., Goethals, B.: Frequent itemset mining for big data. In: IEEE 30th International Conference on Data Engineering, IL, Chicago, pp. 6–9 (2013)
21.
go back to reference Riondato, M., DeBrabant, J.A., Fonseca, R., Upfal, E.: PARMA: a parallel randomized algorithm for approximate association rules mining in MapReduce. In: Proceedings of the CIKM, pp. 85–94. ACM (2012) Riondato, M., DeBrabant, J.A., Fonseca, R., Upfal, E.: PARMA: a parallel randomized algorithm for approximate association rules mining in MapReduce. In: Proceedings of the CIKM, pp. 85–94. ACM (2012)
22.
go back to reference Malek, M., Kadima, H.: Searching frequent itemsets by clustering data: towards a parallel approach using mapreduce. In: Proceedings of the WISE 2011 and 2012 Workshops, pp. 251–258. Springer, Heidelberg (2013) Malek, M., Kadima, H.: Searching frequent itemsets by clustering data: towards a parallel approach using mapreduce. In: Proceedings of the WISE 2011 and 2012 Workshops, pp. 251–258. Springer, Heidelberg (2013)
23.
go back to reference Zhang, F., et al.: A distributed frequent itemset mining algorithm using spark for big data analytics. Clust. Comput. 18(4), 1493–1501 (2015)CrossRef Zhang, F., et al.: A distributed frequent itemset mining algorithm using spark for big data analytics. Clust. Comput. 18(4), 1493–1501 (2015)CrossRef
24.
go back to reference Joao, G.: A survey on learning from data streams: current and future trends. Prog. Artif. Intell. 1(1), 45–55 (2012)CrossRef Joao, G.: A survey on learning from data streams: current and future trends. Prog. Artif. Intell. 1(1), 45–55 (2012)CrossRef
25.
go back to reference Vu, A.T., De Francisci Morales, G., Gama, J., Bifet, A.: Distributed adaptive model rules for mining big data streams. In: IEEE International Conference on Big Data (Big Data), Washington, DC, pp. 345–353 (2014) Vu, A.T., De Francisci Morales, G., Gama, J., Bifet, A.: Distributed adaptive model rules for mining big data streams. In: IEEE International Conference on Big Data (Big Data), Washington, DC, pp. 345–353 (2014)
26.
go back to reference Agerri, R., Artola, X., Beloki, Z., Rigau, G., Soroa, A.: Big data for natural language processing: a streaming approach. Knowl.-Based Syst. 79, 36–42 (2015)CrossRef Agerri, R., Artola, X., Beloki, Z., Rigau, G., Soroa, A.: Big data for natural language processing: a streaming approach. Knowl.-Based Syst. 79, 36–42 (2015)CrossRef
27.
28.
go back to reference Shekhar, S.: Spatial big data challenges. In: Keynote at ARO/NSF Workshop on Big Data at Large: Applications and Algorithms, Durham, NC (2012) Shekhar, S.: Spatial big data challenges. In: Keynote at ARO/NSF Workshop on Big Data at Large: Applications and Algorithms, Durham, NC (2012)
Metadata
Title
A Comprehensive Survey and Open Challenges of Mining Bigdata
Authors
Bharat Tidke
Rupa Mehta
Jenish Dhanani
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-63673-3_53

Premium Partner