nach oben

Erschienen in:

2011 | OriginalPaper | Buchkapitel

3. ECL/HPCC: A Unified Approach to Big Data

verfasst von : Anthony M. Middleton, David Alan Bayliss, Gavin Halliday

Erschienen in: Handbook of Data Intensive Computing

Verlag: Springer New York

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

As a result of the continuing information explosion, many organizations are experiencing what is now called the “Big Data” problem. This results in the inability of organizations to effectively use massive amounts of their data in datasets which have grown too big to process in a timely manner. Data-intensive computing represents a new computing paradigm [26] which can address the big data problem using high-performance architectures supporting scalable parallel processing to allow government, commercial organizations, and research environments to process massive amounts of data and implement new applications previously thought to be impractical or infeasible.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Architecting Data-Intensive Software Systems

Nächstes Kapitel Scalable Storage for Data-Intensive Computing

Abbas, A. (2004). Grid computing: A practical guide to technology and applications. Hingham, MA: Charles River Media, Inc.

Agichtein, E. (2004). Scaling information extraction to large document collections: Microsoft Research.

Agichtein, E., & Ganti, V. (2004). Mining reference tables for automatic text segmentation. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, USA, 20–29.

Bayliss, D. A. (2010a). Aggregated data analysis: The paradigm shift (Whitepaper): LexisNexis.

Bayliss, D. A. (2010b). Enterrprise control language overview (Whitepaper): LexisNexis.

Bayliss, D. A. (2010c). Thinking declaratively (Whitepaper).

Berman, F. (2008). Got data? A guide to data preservation in the information age. Communications of the ACM, 51(12), 50–56.CrossRef

Bryant, R. E. (2008). Data intensive scalable computing. Carnegie Mellon University. Retrieved August 10, 2009, from http://www.cs.cmu.edu/$nsim$bryant/presentations/DISCconcept.ppt

Buyya, R. (1999). High performance cluster computing. Upper Saddle River, NJ: Prentice Hall.

10.

Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., & Brandic, I. (2009). Cloud computing and emerging it platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems, 25(6), 599–616.CrossRef

11.

Cerf, V. G. (2007). An information avalanche. IEEE Computer, 40(1), 104–105.CrossRef

12.

Chaiken, R., Jenkins, B., Larson, P.-A., Ramsey, B., Shakib, D., Weaver, S., et al. (2008). Scope: Easy and efficient parallel processing of massive data sets. Proceedings of the VLDB Endowment, 1, 1265–1276.

13.

Dean, J., & Ghemawat, S. (2004). Mapreduce: Simplified data processing on large clusters. Proceedings of the Sixth Symposium on Operating System Design and Implementation (OSDI).

14.

Dean, J., & Ghemawat, S. (2010). Mapreduce: A flexible data processing tool. Communications of the ACM, 53(1), 72–77.CrossRef

15.

Dowd, K., & Severance, C. (1998). High performance computing. Sebastopol, CA: O’Reilly and Associates, Inc.

16.

Gantz, J. F., Reinsel, D., Chute, C., Schlichting, W., McArthur, J., Minton, S., et al. (2007). The expanding digital universe (White Paper): IDC.

17.

Gates, A. F., Natkovich, O., Chopra, S., Kamath, P., Narayanamurthy, S. M., Olston, C., et al. (2009, Aug 24–28). Building a high-level dataflow system on top of map-reduce: The pig experience. Proceedings of the 35th International Conference on Very Large Databases (VLDB 2009), Lyon, France.

18.

Gokhale, M., Cohen, J., Yoo, A., & Miller, W. M. (2008). Hardware technologies for high-performance data-intensive computing. IEEE Computer, 41(4), 60–68.CrossRef

19.

Gorton, I., Greenfield, P., Szalay, A., & Williams, R. (2008). Data-intensive computing in the 21st century. IEEE Computer, 41(4), 30–32.CrossRef

20.

Gray, J. (2008). Distributed computing economics. ACM Queue, 6(3), 63–68.CrossRef

21.

Grossman, R., & Gu, Y. (2008). Data mining using high performance data clouds: Experimental studies using sector and sphere. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA.

22.

Grossman, R. L., Gu, Y., Sabala, M., & Zhang, W. (2009). Compute and storage clouds using wide area high performance networks. Future Generation Computer Systems, 25(2), 179–183.CrossRef

23.

Gu, Y., & Grossman, R. L. (2009). Lessons learned from a year’s worth of benchmarks of large data clouds. Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers, Portland, Oregon.

24.

Hellerstein, J. M. (2010). The declarative imperative. SIGMOD Record, 39(1), 5–19.CrossRef

25.

Johnston, W. E. (1998). High-speed, wide area, data intensive computing: A ten year retrospective, Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing: IEEE Computer Society.

26.

Kouzes, R. T., Anderson, G. A., Elbert, S. T., Gorton, I., & Gracio, D. K. (2009). The changing paradigm of data-intensive computing. Computer, 42(1), 26–34.CrossRef

27.

Liu, H., & Orban, D. (2008). Gridbatch: Cloud computing for large-scale data-intensive batch applications. Proceedings of the Eighth IEEE International Symposium on Cluster Computing and the Grid, 295–305.

28.

Llor, X., Acs, B., Auvil, L. S., Capitanu, B., Welge, M. E., & Goldberg, D. E. (2008). Meandre: Semantic-driven data-intensive flows in the clouds. Proceedings of the Fourth IEEE International Conference on eScience, 238–245.

29.

Lyman, P., & Varian, H. R. (2003). How much information? 2003 (Research Report): School of Information Management and Systems, University of California at Berkeley.

30.

Middleton, A. M. (2009). Data-intensive computing solutions (Whitepaper): LexisNexis.

31.

NSF. (2009). Data-intensive computing. National Science Foundation. Retrieved August 10, 2009, from http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503324&org=IIS

32.

Nyland, L. S., Prins, J. F., Goldberg, A., & Mills, P. H. (2000). A design methodology for data-parallel applications. IEEE Transactions on Software Engineering, 26(4), 293–314.CrossRef

33.

O’Malley, O. (2008). Introduction to hadoop. Retrieved August 10, 2009, from http://wiki.apache.org/hadoop-data/attachments/HadoopPresentations/attachments/YahooHadoopIntro-apachecon-us-2008.pdf

34.

Olston, C., Reed, B., Srivastava, U., Kumar, R., & Tomkins, A. (2008, June 9–12). Pig latin: A not-so_foreign language for data processing. Proceedings of the 28th ACM SIGMOD/PODS International Conference on Management of Data/Principles of Database Systems, Vancouver, BC, Canada, 1099–1110.

35.

Pavlo, A., Paulson, E., Rasin, A., Abadi, D. J., Dewitt, D. J., Madden, S., et al. (2009, June 29–July 2). A comparison of approaches to large-scale data analysis. Proceedings of the 35th SIGMOD international conference on Management of data, Providence, RI, 165–168.

36.

Pike, R., Dorward, S., Griesemer, R., & Quinlan, S. (2004). Interpreting the data: Parallel analysis with sawzall. Scientific Programming Journal, 13(4), 227–298.

37.

PNNL. (2008). Data intensive computing. Pacific Northwest National Laboratory. Retrieved August 10, 2009, from http://www.cs.cmu.edu/$nsim$bryant/presentations/DISC-concept.ppt

38.

Ravichandran, D., Pantel, P., & Hovy, E. (2004). The terascale challenge. Proceedings of the KDD Workshop on Mining for and from the Semantic Web.

39.

Rencuzogullari, U., & Dwarkadas, S. (2001). Dynamic adaptation to available resources for parallel computing in an autonomous network of workstations. Proceedings of the Eighth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, Snowbird, UT, 72–81.

40.

Skillicorn, D. B., & Talia, D. (1998). Models and languages for parallel computation. ACM Computing Surveys, 30(2), 123–169.CrossRef

41.

White, T. (2009). Hadoop: The definitive guide (First ed.). Sebastopol, CA: O’Reilly Media Inc.

42.

Yu, Y., Gunda, P. K., & Isard, M. (2009). Distributed aggregation for data-parallel computing: Interfaces and implementations. Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, Big Sky, Montana, USA, 247–260.

Titel: ECL/HPCC: A Unified Approach to Big Data
verfasst von: Anthony M. Middleton
David Alan Bayliss
Gavin Halliday
Verlag: Springer New York
Buch: Handbook of Data Intensive Computing
Print ISBN: 978-1-4614-1414-8

Electronic ISBN: 978-1-4614-1415-5

Copyright-Jahr: 2011
DOI: https://doi.org/10.1007/978-1-4614-1415-5_3

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"