2023 | Book

Data Science for Entrepreneurship

Principles and Methods for Data Engineering, Analytics, Entrepreneurship, and the Society

About this book

The fast-paced technological development and the plethora of data create numerous opportunities waiting to be exploited by entrepreneurs. This book provides a detailed, yet practical, introduction to the fundamental principles of data science and how entrepreneurs and would-be entrepreneurs can take advantage of it. It walks the reader through sections on data engineering and data analytics, as well as sections on data entrepreneurship and data use in relation to society. The book also offers ways to close the research and practice gaps between data science and entrepreneurship. Having read this book, students of entrepreneurship courses will be better able to commercialize data-driven ideas that may be solutions to real-life problems. Chapters contain detailed examples and cases for a better understanding. Discussion points or questions at the end of each chapter help readers reflect deeply on the learning material.

Table of Contents

Frontmatter
1. The Unlikely Wedlock Between Data Science and Entrepreneurship
Abstract
At first glance, data science and entrepreneurship are two seemingly unrelated fields of scientific research. In this introductory chapter of the book, we nonetheless try to bring together both disciplines, for which we introduce the concept of data entrepreneurship. We first discuss a number of well-known definitions of both data science and entrepreneurship, we disentangle them, and then we delineate striking differences and important similarities. We move on by discussing a number of prominent process models of both data science and entrepreneurship and again point at key differences and similarities in these typical processes. Both endeavors ultimately result in a conceptual framework that also forms the basis for the remainder of this book, which consists of sections on data engineering, data analytics, data entrepreneurship, and data and society, respectively.
Arjan van den Born, Werner Liebregts, Willem-Jan van den Heuvel

Data Engineering

Frontmatter
2. Big Data Engineering
Abstract
Going data intensive requires much effort not only in the design, but also in system/infrastructure configuration and deployment; most of these activities still happen via heavy manual fine-tuning and often costly trial-and-error experimentation.
This book chapter introduces the field of data engineering; sets out to list the key desiderata of modern-day, data-intensive applications and AI/ML analytics software; and argues the necessity of novel methods and techniques, including MLOps. All topics will be further elaborated in the remaining chapters of this first module on data engineering.
Damian Tamburri, Willem-Jan van den Heuvel
3. Data Governance
Abstract
Data-intensive products and services aim to turn big data into a valuable or strategic asset for organizations. However, the inherent risk and cost of storing and managing a massive amount of data undermine the value creation from such products and services. Consequently, organizations need to adopt an appropriate data governance program to establish the necessary policies and structures in order to strike a balance between value creation and risk and cost. This chapter explores data governance in detail, focusing on data governance principles, decision domains, and organizational structures. We discuss the data governance challenges, opportunities, and practices for the big data and Internet of Things (IoT) domains. We also present two industrial big data applications/products whose data needs to be governed.
Indika Kumara, A. S. M. Kayes, Paul Mundt, Ralf Schneider
4. Big Data Architectures
Abstract
Big data has drawn huge attention from researchers and policy and decision makers in governments and enterprises. As the speed of information growth exceeded Moore’s law at the beginning of this new century, excessive data is causing great trouble for businesses and organizations. Nevertheless, great potential and highly useful value are hidden in the huge volume of data. Throughout this chapter, we will discuss the main big data architectures that help cope with the above challenges. These architectures are technology-independent reference architectures that generalize published implementation architectures of big data use cases. This chapter is of value for academics, practitioners, and entrepreneurs alike. The analysis of existing reference architectures and success cases will facilitate architecture design, and the selection of the most suitable technologies or commercial solutions, when constructing big data systems.
Martin Garriga, Geert Monsieur, Damian Tamburri
5. Data Engineering in Action
Abstract
For data engineers, the main task is to collect, organize, and process meaningful information that can be extracted from the plethora of data available from various sources to solve a specific business problem and generate value from it. However, in real-world problems, the time spent preparing the data infrastructure and selecting the right technologies to manage this data tends to be much longer than the time spent on implementing algorithms that analyze them.
In this chapter, we leverage the insights from previous chapters to delineate the process of gathering, managing, and analyzing data, by exemplifying three real-case scenarios from the domains of cybercrime fighting, protection of urban spaces, and biodiversity and sustainability. We carefully consider the main aspects involving decision-making in data engineering in practice. In addition, we analyze some of the main pros and cons of those decisions to guide the reader during the choice and evaluation of the desired technologies for the problem at hand.
Giuseppe Cascavilla, Stefano Dalla Palma, Stefan Driessen, Willem-Jan van den Heuvel, Daniel De Pascale, Mirella Sangiovanni, Gerard Schouten

Data Analytics

Frontmatter
6. Supervised Machine Learning in a Nutshell
Abstract
This chapter introduces the fundamentals of supervised machine learning algorithms, namely the classification and regression problems. We explain each technique using an inspiring example and discuss how the corresponding algorithms work together with the data engineering pipelines. We provide some guidelines for implementing a classification or regression task for other problems, as well as the materials required to evaluate the supervised learning method being used.
Majid Mohammadi, Dario Di Nucci
7. An Intuitive Introduction to Deep Learning
Abstract
This chapter presents an intuitive explanation of deep learning, focusing on convolutional neural networks. Starting with a perceptron and the concept of a linear decision boundary, the chapter guides the reader through networks of increasing depths. The concept of convolution is explained in an understandable manner. The integration of convolution with deep neural networks is shown to give rise to convolutional neural networks (CNNs). The final part of the chapter presents an example of the application of CNNs to skin cancer detection.
Eric Postma, Gerard Schouten
8. Sequential Experimentation and Learning
Abstract
While most chapters in this book cover situations in which data is available to draw inferences from, this chapter focuses on situations in which the learner selects actions that will generate new data which allow for future learning. Critically, it is assumed that only the data associated with the action taken by the learner is revealed: the potential outcomes of alternative actions are not disclosed. Generally, such sequential learning approaches are studied under the heading of reinforcement learning, a framework which we briefly introduce. Next, we discuss an extremely common, and useful, special case of reinforcement learning called the contextual multi-armed bandit (cMAB) problem. The cMAB problem formalizes the type of sequential experimentation encountered when, e.g., trying to choose between competing medical treatments or trying to select web content. We will describe several real-world problems that can be studied as a cMAB problem, and we will review effective solutions to this problem such as the UCB algorithm, Thompson sampling, and bootstrapped Thompson sampling. We focus on application but provide references for readers interested in the underlying theoretical results. Our focus on application is further amplified by our discussion of two recently developed software packages designed to experiment with (or solve) bandit problems. First, we detail contextual, an elaborate [R] package that allows users to quickly run simulations of various solutions to bandit problems and conduct offline policy evaluation. Second, we discuss streaming bandit, a popular Python package to execute bandit policies in the field. This chapter should allow readers to get started with sequential experimentation methods in their own field or application area of interest.
Jules Kruijswijk, Robin van Emden, Maurits Kaptein
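As a rough illustration of the sequential setting described in the Chapter 8 abstract, the following minimal Python sketch runs Thompson sampling on a simplified, non-contextual bandit with binary rewards. It is not taken from the book and does not use the packages the abstract mentions; the function name `thompson_sampling`, the success rates, and the uniform Beta(1, 1) priors are assumptions chosen for illustration only. Note that in each round only the reward of the chosen arm is observed.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative sketch).

    true_rates: hypothetical success probability of each arm, unknown to the learner.
    Returns the number of pulls and the number of observed successes per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (assumed uniform Beta(1, 1) prior on every arm).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Only the outcome of the chosen arm is revealed.
        reward = 1 if rng.random() < true_rates[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes


pulls, successes = thompson_sampling([0.2, 0.8], rounds=2000)
print("pulls per arm:", pulls)
```

Because arms that look promising are sampled more often, most pulls should concentrate on the better arm as evidence accumulates; a contextual variant would additionally condition each posterior on observed features.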
9. Advanced Analytics on Complex Industrial Data
Abstract
Complex data requires advanced analytics methods. In both science and industry, with the ever-increasing amounts of rich and large datasets available, advanced data analytics capabilities are required. Depending on the task at hand, several different techniques and methods can be used to analyze complex data (e.g., multivariate time series, log data, multimodal sensor data). In this chapter, we introduce approaches and methods for advanced analytics on complex industrial data. In particular, we focus on three exemplary methods for modeling and analyzing complex data, i.e., analytics for fault diagnosis, graph signal processing, and pattern mining on networks and graphs. We introduce the general approaches and methods and discuss their implementation in detail for an extended context. From a data perspective, we cover both sequential, i.e., time series, and relational, i.e., graph and network data, also bridging between both by analyzing signals on graphs. Besides setting the stage for the important theoretical background and concepts, we outline, in particular, the perspective on industrial applications and provide specific examples of the application of the presented methods in real-world industrial contexts, i.e., using complex industrial data.
Jurgen van den Hoogen, Stefan Bloemheuvel, Martin Atzmueller
10. Data Analytics in Action
Abstract
The previous chapters provided gentle introductions to various important topics in the area of data analytics. In this chapter, we present three real-life case studies that illustrate how the methods and approaches outlined in the previous chapters can be put into practice. The first case study shows how the Dutch company BagsID uses data analytics to improve the efficiency of luggage handling at airports. The second case study analyzes email communication between employees of a multinational service company to assess the efficacy of interventions aimed at stimulating the employees’ openness to innovation. The third case study considers how vehicle sensor data can be leveraged for Pay-How-You-Drive insurance policies. Together, the three case studies give a glimpse into the vast world of applied data analytics.
Gerard Schouten, Giuseppe Arena, Frederique van Leeuwen, Petra Heck, Joris Mulder, Rick Aalbers, Roger Leenders, Florian Böing-Messing

Data Entrepreneurship

Frontmatter
11. Data-Driven Decision-Making
Abstract
One of the most important prerequisites for creating impact with data science is the embedding of data science results in decision-making. One could say that for securing data science impact, data science should start and end with an extensive analysis of the related decision-making. The full embedding of data science in decision-making is often labeled data-driven decision-making (DDDM). This includes the use of data and data science concepts in preparing, processing, executing, and evaluating decisions. In this chapter, we describe the most relevant characteristics of decision-making, which are related to the need for, the form of, and the use of DDDM. Furthermore, we define DDDM, we discuss the most important reasons for applying DDDM, and we introduce the available concepts for the use of DDDM in programmed and nonprogrammed decision-making. We also include a brief description of the link between DDDM and successful data entrepreneurship. We conclude by listing some topics for discussion and further research.
Ronald Buijsse, Martijn Willemsen, Chris Snijders
12. Digital Entrepreneurship
Abstract
In the past few decades, technological development has led to the digitization and digitalization of (mostly developed) economies into what one could now call digital economies. In a digital economy, digital entrepreneurs pursue opportunities to produce and trade in digital artifacts on so-called digital artifact stores or platforms and/or to create these digital artifact “stores” or platforms themselves. In this chapter, we extensively discuss the effects of a number of typical features of digital economies, such as the presence of (indirect) network effects and digital technologies reducing a number of important economic costs, on the extent and nature of entrepreneurial activity in such economies. Digital platforms have become one of the most discussed forms of digital entrepreneurship. We elaborate on how to create and grow a successful digital platform firm, but also how to successfully compete on such digital platforms. The latter is not so easy, given a number of challenges that digital entrepreneurs typically face when being active on such platforms. Finally, we describe the main features of a digital entrepreneurial ecosystem, in which digital entrepreneurs typically operate, and explain how they can be supported and regulated by policymakers, if necessary at all.
Wim Naudé, Werner Liebregts
13. Strategy in the Era of Digital Disruption
Abstract
Although digital disruption has become a buzzword, we still lack a comprehensive understanding of what digital disruption actually is and what strategies can be employed to make it happen. Therefore, this chapter starts off by explaining the notion of digital disruption and subsequently illustrating its pervasiveness using a number of detailed examples. It then offers a sneak peek of the processes happening behind the scenes of digital disruption. Specifically, it explains business model innovations, innovation ecosystems, and platforms and network effects as the core strategic concepts that are of paramount importance for understanding the digitalization dynamics. The chapter ends with state-of-the-art insights into future challenges and avenues for further research.
Ksenia Podoynitsyna, Eglė Vaznytė-Hünermund
14. Digital Servitization in Agriculture
Abstract
Servitization describes the process of adding services to complement companies’ core products in order to create additional value for customers. Well-known examples of servitization are Atlas Copco’s compressed-air-as-a-service and Rolls-Royce’s power-by-the-hour service. For a long time, the manufacturing sector has been the subject of much debate about how digital technologies can enable companies to provide advanced services to customers; this transition is known as “digital servitization.” In this chapter, we explore how digital servitization benefits not only manufacturers but also farmers and their suppliers. We describe three cases in detail (a chicken poultry farm and two equipment suppliers, one for the horticultural sector and another for the dairy farm sector) and discuss their different transitions through a digital servitization theory lens. Specifically, we describe how the collection and analysis of data provide agricultural companies with several pathways for servitization.
Wim Coreynen, Sicco Pier van Gosliga
15. Entrepreneurial Finance
Abstract
Restricted access to finance is often cited as one of the most prominent problems of innovative startups throughout their life cycle. Many entrepreneurial ventures, including big data startups, require external capital to realize their exponential growth and eventually achieve a successful exit in the form of an initial public offering (IPO) or an acquisition. Therefore, startup founders have to be fully aware of their funding options and the potential added value that different types of investors may bring to the table. Funding decisions always include a number of strategic considerations, as investors do not only provide monetary funding. Quite to the contrary, startup investors fulfill various additional roles, including strategic advisors, network connectors, facilitators of human capital, and internal conflict resolvers. Nonetheless, although startup financing is definitely not a zero-sum game, agency problems and diverging incentives cloud the relationship between startup investors and entrepreneurs (Fried and Ganor, New York University Law Review 81:967–1025; 2006). This chapter provides an overview of different types of investors that provide financing for innovative (tech) startups, including their particular incentives in the startup investment process and the financial considerations. It follows the startup life cycle from its seed phase until exit.
Anne Lafarre, Ivona Schoonbrood
16. Entrepreneurial Marketing
Abstract
Firms often create products (and services) based on radically new technology. These new products have the power to change the marketplace, but still fail relatively often. While entrepreneurs typically focus on the unique features of a new product, end users are generally more interested in the solution it offers and/or whether it is easy to integrate in their business processes. Furthermore, if the product is radically new, potential customers will probably struggle to understand the new product, and hence, they need to be educated before they consider adopting the new product. Consequently, the marketing and sales for radically new products are complicated and differ from traditional marketing and selling that are more geared towards incremental innovation. In this chapter, we define the concepts of entrepreneurial marketing and sales, explain their differences, and discuss why the new product development process of young firms should be complemented with a customer development process. Before a young firm can grow, customers need to be discovered and built. Later, marketing and sales efforts can be optimized using data collected from the firm’s initial customer base. Modifications to traditional marketing concepts, such as segmentation and positioning, and to common marketing instruments are discussed.
Ed Nijssen, Shantanu Mullick

Data and Society

Frontmatter
17. Data Protection Law and Responsible Data Science
Abstract
This chapter provides data scientists with an introduction to data protection law. The aim of this chapter is to provide some basic knowledge and understanding concerning some of the most important principles of data protection law through a general explanation of these key concepts. It will show for instance that the notion of personal data is a very broad one, which encompasses the vast majority of the data processed in contemporary data processing technologies, or that the distinction between a data controller and a data processor can be tricky. It will also show that in order to start processing data, one can choose among six different grounds; however, one ground must be chosen, and when processing data, all the provisions of Art. 5 of the GDPR must be respected. On the one hand, the explanations are general and do not go into too much detail so that they are easily understandable by the reader. On the other hand, they provide “actionable knowledge”. That is, they will allow the reader to play with and apply the data protection principles herein discussed to their data science applications, so that they can be performed in a socially responsible way.
Five key points are discussed:
  1. What exactly is meant by data protection?
  2. What is personal data?
  3. Who are the actors of data protection law?
  4. Under what conditions is it possible to start processing data?
  5. What principles should be respected when processing data?
Raphaël Gellert
18. Perspectives from Intellectual Property Law
Abstract
This chapter focuses on intellectual property rights in data from an EU perspective. Datasets can have great potential value for those who have access to these datasets. However, access is often restricted by those who have effective control over such datasets. Intellectual property law provides them with potential tools to restrict the access and use by third parties of these datasets. When, how, and by whom this can be restricted depends on the specific legal regime. In this chapter, these issues will be addressed by focusing on copyright, the sui generis database right, and trade secret rights. The legal definitions of these rights will be explored, as well as their limitations and exceptions relevant in the context of third-party use of the right holder’s dataset and software. Finally, the chapter looks into examples of alternative sources, such as data portability and public sector information, and their promise and limitations as a complementary source or substitute.
Lisa van Dongen
19. Liability and Contract Issues Regarding Data
Abstract
This chapter provides an introduction to contract law and liability, insofar as relevant for data scientists. The aim of this chapter is to provide basic knowledge to understand the main principles and elements of contract law and legal liability, including some pointers to avoid potential pitfalls. This is done through a general overview of legal issues, without going into detail about specific legal systems, except for a few remarks about specifics of common law. Four key points are discussed:
  • The different legal meanings and qualification of “data”
  • The importance of appropriate contractual clauses to deal with data
  • The restrictions on compensation for pure economic loss
  • The applicable torts when dealing with data
Eric Tjong Tjin Tai
20. Data Ethics and Data Science: An Uneasy Marriage?
Abstract
Civil society around the world has called for data-driven companies to take their responsibility seriously and to work on becoming more fair, transparent, accountable, and trustworthy, to name just a few of the goals that have been set. Data ethics has been put forward as a promising strategy to make this happen. However, data ethics is a fuzzy concept that can mean different things to different people. This chapter is therefore dedicated to explaining data ethics from different angles. It will first look into data ethics as an academic discipline and illustrate how some of these academic viewpoints trickle down in the debate on data science and AI. Next, it will focus on how data ethics has been put forward as a regulatory strategy by data-driven companies. It will look into the relation between law and ethics, because if in this entrepreneurial context data ethics is not properly embedded, it can be used as an escape from legal regulation. This chapter ends with a reflection on the future relation of data ethics and data science and provides some discussion questions to instigate further debate.
Esther Keymolen, Linnet Taylor
21. Value-Sensitive Software Design
Abstract
Software is at the heart of many processes in contemporary life and innovation: software helps us to order vast amounts of data, retrieve information, make decisions, plan routes, etc. However, as will be explained in this chapter, like all technologies, software is inherently not neutral: design choices and characteristics of the technology itself affect our practices, choices, and even how we interpret the world around us. Software can harbor certain biases, can contain errors, or can have unintended side effects. Meanwhile, software can have a high impact on the lives of individuals. It is therefore important to be aware of this potential impact and figure out how to reap the fruits of the technology while reducing its potential harms. This chapter argues that we should take on this task already from the very beginning of the design of the technology. In order to help with the first few steps, the chapter introduces the main ideas underlying value-sensitive design.
Paulan Korenhof
22. Data Science for Entrepreneurship: The Road Ahead
Abstract
Recent advancements and trends in data engineering, data analytics, entrepreneurship, and the business and societal context in which all this happens will usher in a new wave in data entrepreneurship. This calls for new theories, approaches, methods, and techniques and opens up new possibilities for companies to find a competitive edge and, hopefully, to reap the associated benefits. This chapter concludes the book with a kaleidoscopic overview of several important developments, exploring their implications for areas where data science and entrepreneurship meet. Next to a number of implications for practice, this chapter ends with a brief discussion of interesting avenues for future research.
Willem-Jan van den Heuvel, Werner Liebregts, Arjan van den Born
Metadata
Title: Data Science for Entrepreneurship
Editors: Werner Liebregts, Willem-Jan van den Heuvel, Arjan van den Born
Copyright Year: 2023
Electronic ISBN: 978-3-031-19554-9
Print ISBN: 978-3-031-19553-2
DOI: https://doi.org/10.1007/978-3-031-19554-9