
2018 | Original Paper | Book Chapter

A Comprehensive Survey and Open Challenges of Mining Bigdata


Abstract

Bigdata came to prominence in the early 2000s and has since become a focus of researchers and data scientists. The main purpose of research and development in the field of Bigdata is to extract and predict meaningful information from large amounts of structured as well as unstructured real-world data. This paper presents a systematic review of the background and of the existing related technologies used by various large enterprises, data researchers, and government officials. In addition, it presents the standardized, complex processes used to extract useful information, such as data generation, storage, modeling/analysis, visualization, and interpretation. Finally, it discusses open issues and challenges and points out emerging directions in which researchers can work in the age of Bigdata.


Metadata
Title
A Comprehensive Survey and Open Challenges of Mining Bigdata
Authors
Bharat Tidke
Rupa Mehta
Jenish Dhanani
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-63673-3_53