Abstract
Data management research, systems, and technologies have drastically improved the availability of data analysis capabilities, particularly for non-experts, due in part to low entry barriers and reduced ownership costs (e.g., for data management infrastructures and applications). Major reasons for the widespread success of database systems and today's multi-billion dollar data management market include data independence, which separates physical representation and storage from the actual information, and declarative languages, which separate the program specification from its intended execution environment. In contrast, today's big data solutions offer neither data independence nor declarative specification. As a result, big data technologies are mostly employed in newly established companies with IT-savvy employees or in large, well-established companies with big IT departments. We argue that current big data solutions will continue to fall short of widespread adoption because of usability problems, even though in-situ data analytics technologies achieve a good degree of schema independence. In particular, we consider the lack of a declarative specification to be a major roadblock: it contributes to the scarcity of available data scientists and limits the application of big data to IT-savvy industries. Data scientists currently have to spend a lot of time tuning their data analysis programs for specific data characteristics and a specific execution environment. We believe that the research community needs to bring the powerful concepts of declarative specification to current data analysis systems in order to achieve broad adoption of big data technology and to deliver effectively on the promise that novel big data technologies offer.