2019 | OriginalPaper | Chapter
On Developing Data Science
Author: Michael L. Brodie
Published in: Applied Data Science
Publisher: Springer International Publishing
Understanding phenomena based on the facts, on the data, is a touchstone of data science. The power of evidence-based, inductive reasoning distinguishes data science from science. Hence, this chapter argues that, in its initial stages, data science applications and the data science discipline itself be developed inductively and deductively in a virtuous cycle.
The virtues of the twentieth-century Virtuous Cycle (aka virtuous hardware-software cycle, Intel-Microsoft virtuous cycle) that built the personal computer industry (National Research Council, The new global ecosystem in advanced computing: Implications for U.S. competitiveness and national security. The National Academies Press, Washington, DC, 2012) were being grounded in reality and being self-perpetuating: more powerful hardware enabled more powerful software that required more powerful hardware, enabling yet more powerful software, and so forth. Being grounded in reality, that is, solving genuine problems at scale, was critical to its success, as it will be for data science. While it lasted, it was self-perpetuating, due to a constant flow of innovation and to benefitting all participants: producers, consumers, the industry, the economy, and society. It is a wonderful success story for twentieth-century applied science. Given the success of virtuous cycles in developing modern technology, virtuous cycles grounded in reality should be used to develop data science, driven by the wisdom of the sixteenth-century proverb "Necessity is the mother of invention."
This chapter explores this hypothesis using the example of the evolution of database management systems over the last 40 years. For the application of data science to be successful and virtuous, it should be grounded in a cycle that encompasses industry (i.e., real problems), research, development, and delivery.
This chapter proposes applying the principles and lessons of the virtuous cycle to the development of data science applications; to the development of the data science discipline itself, for example, a data science method; and to the development of data science education; all focusing on the critical role of collaboration in data science research and management, thereby addressing the development challenges faced by the more than 150 Data Science Research Institutes (DSRIs) worldwide. A companion chapter (Brodie, What is Data Science, in Braschler et al. (Eds.), Applied data science – Lessons learned for the data-driven business, Springer 2019) addresses essential questions that DSRIs should answer in preparation for the developments proposed here: What is data science? What is world-class data science research?